Re: directory usage question

2014-09-07 Thread Brian Jeltema

On Sep 6, 2014, at 9:32 AM, Ted Yu yuzhih...@gmail.com wrote:

 Can you post your hbase-site.xml ?
 
 /apps/hbase/data/archive/data/default is where HFiles are archived (e.g.
 when a column family is deleted, HFiles for this column family are stored
 here).
 /apps/hbase/data/data/default seems to be your hbase.rootdir
 
 

hbase.rootdir is defined to be hdfs://foo:8020/apps/hbase/data. I think that's 
the default that Ambari creates.
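For my own notes, this is how I understand the layout relative to hbase.rootdir (just a sketch of the path arithmetic based on the directories above; the class name is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RootDirLayout {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    Path rootDir = new Path(conf.get("hbase.rootdir"));       // hdfs://foo:8020/apps/hbase/data
    Path live = new Path(rootDir, "data/default");            // live store files (default namespace)
    Path archive = new Path(rootDir, "archive/data/default"); // archived HFiles (snapshots, dropped CFs)
    System.out.println(live + "\n" + archive);
  }
}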

So the HFiles in the archive subdirectory have been discarded and can be 
deleted safely? 

 bq. a problem I'm having running map/reduce jobs against snapshots
 
 Can you describe the problem in a bit more detail ?
 
 

I don't understand what I'm seeing well enough to ask an intelligent question
yet. I appear to be scanning duplicate rows when using initTableSnapshotMapperJob,
but I'm trying to get a better understanding of how this works, since it's
probably just something I'm doing wrong.
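
For context, here's a stripped-down sketch of how I'm wiring the job up (the snapshot name, mapper, and restore path below are placeholders, not my actual code):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SnapshotScanJob {

  // Placeholder mapper: just emits one count per row it sees.
  static class RowCountMapper extends TableMapper<Text, LongWritable> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context)
        throws IOException, InterruptedException {
      context.write(new Text(Bytes.toString(key.get())), new LongWritable(1L));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "scan-snapshot");
    job.setJarByClass(SnapshotScanJob.class);

    Scan scan = new Scan();
    scan.setCaching(500);
    scan.setCacheBlocks(false); // block caching off for MR scans

    TableMapReduceUtil.initTableSnapshotMapperJob(
        "host_snapshot",                     // placeholder snapshot name
        scan,
        RowCountMapper.class,
        Text.class,                          // mapper output key class
        LongWritable.class,                  // mapper output value class
        job,
        true,                                // ship HBase jars with the job
        new Path("/tmp/snapshot-restore"));  // temp dir the snapshot is restored into

    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The last argument is the temp directory I mentioned; my understanding is that the snapshot regions get restored there for the scan.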

Brian

 Cheers
 
 
 On Sat, Sep 6, 2014 at 6:09 AM, Brian Jeltema 
 brian.jelt...@digitalenvoy.net wrote:
 
 I'm trying to track down a problem I'm having running map/reduce jobs
 against snapshots.
 Can someone explain the difference between files stored in:
 
/apps/hbase/data/archive/data/default
 
 and files stored in
 
/apps/hbase/data/data/default
 
 (Hadoop 2.4, HBase 0.98)
 
 Thanks



Re: directory usage question

2014-09-07 Thread Ted Yu
The files under the archive directory are referenced by snapshots. 
Please don't delete them manually. 

You can delete unused snapshots. 
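
If you want to see which snapshots are still around, something like the following should work (a rough sketch against the 0.98-era client API; 'old_snapshot' is just a placeholder):

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.protobuf.generated.HBaseProtos.SnapshotDescription;

public class SnapshotCleanup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      // See which snapshots exist and which tables they cover.
      List<SnapshotDescription> snapshots = admin.listSnapshots();
      for (SnapshotDescription sd : snapshots) {
        System.out.println(sd.getName() + " (table " + sd.getTable() + ")");
      }
      // Dropping an unused snapshot lets the cleaner reclaim the archived HFiles it referenced.
      admin.deleteSnapshot("old_snapshot"); // placeholder name
    } finally {
      admin.close();
    }
  }
}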

Cheers

On Sep 7, 2014, at 4:08 AM, Brian Jeltema brian.jelt...@digitalenvoy.net 
wrote:

 
 On Sep 6, 2014, at 9:32 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 Can you post your hbase-site.xml ?
 
 /apps/hbase/data/archive/data/default is where HFiles are archived (e.g.
 when a column family is deleted, HFiles for this column family are stored
 here).
 /apps/hbase/data/data/default seems to be your hbase.rootdir
 
 hbase.rootdir is defined to be hdfs://foo:8020/apps/hbase/data. I think 
 that's the default that Ambari creates.
 
 So the HFiles in the archive subdirectory have been discarded and can be 
 deleted safely? 
 
 bq. a problem I'm having running map/reduce jobs against snapshots
 
 Can you describe the problem in a bit more detail ?
 
 I don't understand what I'm seeing well enough to ask an intelligent question 
 yet.
 I appear to be scanning duplicate rows when using initTableSnapshotMapperJob,
 but I'm trying to get a better understanding of how this works, since It's 
 probably just
 something I'm doing wrong.
 
 Brian
 
 Cheers
 
 
 On Sat, Sep 6, 2014 at 6:09 AM, Brian Jeltema 
 brian.jelt...@digitalenvoy.net wrote:
 
 I'm trying to track down a problem I'm having running map/reduce jobs
 against snapshots.
 Can someone explain the difference between files stored in:
 
   /apps/hbase/data/archive/data/default
 
 and files stored in
 
   /apps/hbase/data/data/default
 
 (Hadoop 2.4, HBase 0.98)
 
 Thanks
 


Re: directory usage question

2014-09-07 Thread Brian Jeltema
initTableSnapshotMapperJob writes into this directory (indirectly) via
RestoreSnapshotHelper.restoreHdfsRegions.

Is this expected? I would have expected writes to be limited to the temp
directory passed in the init call.

Brian

On Sep 7, 2014, at 8:17 AM, Ted Yu yuzhih...@gmail.com wrote:

 The files under archive directory are referenced by snapshots. 
 Please don't delete them manually. 
 
 You can delete unused snapshots. 
 
 Cheers
 
 On Sep 7, 2014, at 4:08 AM, Brian Jeltema brian.jelt...@digitalenvoy.net 
 wrote:
 
 
 On Sep 6, 2014, at 9:32 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 Can you post your hbase-site.xml ?
 
 /apps/hbase/data/archive/data/default is where HFiles are archived (e.g.
 when a column family is deleted, HFiles for this column family are stored
 here).
 /apps/hbase/data/data/default seems to be your hbase.rootdir
 
 hbase.rootdir is defined to be hdfs://foo:8020/apps/hbase/data. I think 
 that's the default that Ambari creates.
 
 So the HFiles in the archive subdirectory have been discarded and can be 
 deleted safely? 
 
 bq. a problem I'm having running map/reduce jobs against snapshots
 
 Can you describe the problem in a bit more detail ?
 
 I don't understand what I'm seeing well enough to ask an intelligent 
 question yet.
 I appear to be scanning duplicate rows when using initTableSnapshotMapperJob,
 but I'm trying to get a better understanding of how this works, since It's 
 probably just
 something I'm doing wrong.
 
 Brian
 
 Cheers
 
 
 On Sat, Sep 6, 2014 at 6:09 AM, Brian Jeltema 
 brian.jelt...@digitalenvoy.net wrote:
 
 I'm trying to track down a problem I'm having running map/reduce jobs
 against snapshots.
 Can someone explain the difference between files stored in:
 
  /apps/hbase/data/archive/data/default
 
 and files stored in
 
  /apps/hbase/data/data/default
 
 (Hadoop 2.4, HBase 0.98)
 
 Thanks
 
 



Re: directory usage question

2014-09-07 Thread Ted Yu
Eclipse doesn't show that RestoreSnapshotHelper.restoreHdfsRegions() is
called by initTableSnapshotMapperJob (in the master branch).

Looking at TableMapReduceUtil.java in 0.98, I don't see a direct relation
between the two.

Do you have a stack trace or something else showing the relationship ?

Cheers


On Sun, Sep 7, 2014 at 5:48 AM, Brian Jeltema 
brian.jelt...@digitalenvoy.net wrote:

 initTableSnapshotMapperJob writes into this directory (indirectly) via
 RestoreSnapshotHelper.restoreHdfsRegions

 Is this expected? I would have expected writes to be limited to the temp
 directory passed in the init call

 Brian

 On Sep 7, 2014, at 8:17 AM, Ted Yu yuzhih...@gmail.com wrote:

  The files under archive directory are referenced by snapshots.
  Please don't delete them manually.
 
  You can delete unused snapshots.
 
  Cheers
 
  On Sep 7, 2014, at 4:08 AM, Brian Jeltema 
 brian.jelt...@digitalenvoy.net wrote:
 
 
  On Sep 6, 2014, at 9:32 AM, Ted Yu yuzhih...@gmail.com wrote:
 
  Can you post your hbase-site.xml ?
 
  /apps/hbase/data/archive/data/default is where HFiles are archived
 (e.g.
  when a column family is deleted, HFiles for this column family are
 stored
  here).
  /apps/hbase/data/data/default seems to be your hbase.rootdir
 
  hbase.rootdir is defined to be hdfs://foo:8020/apps/hbase/data. I think
 that's the default that Ambari creates.
 
  So the HFiles in the archive subdirectory have been discarded and can
 be deleted safely?
 
  bq. a problem I'm having running map/reduce jobs against snapshots
 
  Can you describe the problem in a bit more detail ?
 
  I don't understand what I'm seeing well enough to ask an intelligent
 question yet.
  I appear to be scanning duplicate rows when using
 initTableSnapshotMapperJob,
  but I'm trying to get a better understanding of how this works, since
 It's probably just
  something I'm doing wrong.
 
  Brian
 
  Cheers
 
 
  On Sat, Sep 6, 2014 at 6:09 AM, Brian Jeltema 
  brian.jelt...@digitalenvoy.net wrote:
 
  I'm trying to track down a problem I'm having running map/reduce jobs
  against snapshots.
  Can someone explain the difference between files stored in:
 
   /apps/hbase/data/archive/data/default
 
  and files stored in
 
   /apps/hbase/data/data/default
 
  (Hadoop 2.4, HBase 0.98)
 
  Thanks
 
 




Re: directory usage question

2014-09-07 Thread Brian Jeltema

 Eclipse doesn't show that RestoreSnapshotHelper.restoreHdfsRegions() is
 called by initTableSnapshotMapperJob (in master branch)
 
 Looking at TableMapReduceUtil.java in 0.98, I don't see direct relation
 between the two.
 
 Do you have stack trace or something else showing the relationship ?

Right. That’s what I meant by ‘indirectly’. This is a stack trace that was 
caused by an ownership conflict:

java.io.IOException: java.util.concurrent.ExecutionException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode=/apps/hbase/data/archive/data/default/Host/c41d632d5eee02e1883215460e5c261d/p:hdfs:hdfs:drwxr-xr-x
 at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
 at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
 at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
 at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
 at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
 at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:131)
 at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.cloneHdfsRegions(RestoreSnapshotHelper.java:475)
 at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:208)
 at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.copySnapshotForScanner(RestoreSnapshotHelper.java:733)
 at org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat.setInput(TableSnapshotInputFormat.java:397)
 at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableSnapshotMapperJob(TableMapReduceUtil.java:301)
 at net.digitalenvoy.hp.job.ParseHostnamesJob.run(ParseHostnamesJob.java:77)
 at net.digitalenvoy.hp.HostProcessor.run(HostProcessor.java:165)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at net.digitalenvoy.hp.HostProcessor.main(HostProcessor.java:47)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 
 Cheers
 
 
 On Sun, Sep 7, 2014 at 5:48 AM, Brian Jeltema 
 brian.jelt...@digitalenvoy.net wrote:
 
 initTableSnapshotMapperJob writes into this directory (indirectly) via
 RestoreSnapshotHelper.restoreHdfsRegions
 
 Is this expected? I would have expected writes to be limited to the temp
 directory passed in the init call
 
 Brian
 
 On Sep 7, 2014, at 8:17 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 The files under archive directory are referenced by snapshots.
 Please don't delete them manually.
 
 You can delete unused snapshots.
 
 Cheers
 
 On Sep 7, 2014, at 4:08 AM, Brian Jeltema 
 brian.jelt...@digitalenvoy.net wrote:
 
 
 On Sep 6, 2014, at 9:32 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 Can you post your hbase-site.xml ?
 
 /apps/hbase/data/archive/data/default is where HFiles are archived
 (e.g.
 when a column family is deleted, HFiles for this column family are
 stored
 here).
 /apps/hbase/data/data/default seems to be your hbase.rootdir
 
 hbase.rootdir is defined to be hdfs://foo:8020/apps/hbase/data. I think
 that's the default that Ambari creates.
 
 So the HFiles in the archive subdirectory have been discarded and can
 be 

need help understanding log output

2014-09-07 Thread Brian Jeltema
I have a map/reduce job that is consistently failing with timeouts. The failing
mapper log files contain a series of records similar to those below. When I look
at the HBase and HDFS logs (on foo.net in this case) I don't see anything obvious
at these timestamps. The mapper task times out at/near attempt=25/35. Can anyone
shed light on what these log entries mean?

Thanks - Brian


2014-09-07 09:36:51,421 INFO [htable-pool1-t1] org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary, attempt=10/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187, tracking started null, retrying after 10029 ms, replay 1062 ops
2014-09-07 09:37:01,642 INFO [htable-pool1-t1] org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary, attempt=11/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187, tracking started null, retrying after 10023 ms, replay 1062 ops
2014-09-07 09:37:12,064 INFO [htable-pool1-t1] org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary, attempt=12/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187, tracking started null, retrying after 20182 ms, replay 1062 ops
2014-09-07 09:37:32,708 INFO [htable-pool1-t1] org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary, attempt=13/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187, tracking started null, retrying after 20140 ms, replay 1062 ops
2014-09-07 09:37:52,940 INFO [htable-pool1-t1] org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary, attempt=14/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187, tracking started null, retrying after 20041 ms, replay 1062 ops
2014-09-07 09:38:13,324 INFO [htable-pool1-t1] org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary, attempt=15/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187, tracking started null, retrying after 20041 ms, replay 1062 ops



Re: directory usage question

2014-09-07 Thread Ted Yu
Your cluster is an insecure HBase deployment, right ?

Are all files under /apps/hbase/data/archive/data/default owned by user
'hdfs' ?
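
A quick way to check is to list the owners directly (a rough sketch using the Hadoop FileSystem API; the path is the one from your stack trace and the class name is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ArchiveOwnerCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Print owner, group and permissions for each table dir under the archive.
    for (FileStatus st : fs.listStatus(new Path("/apps/hbase/data/archive/data/default"))) {
      System.out.println(st.getPath() + " " + st.getOwner() + ":" + st.getGroup()
          + " " + st.getPermission());
    }
  }
}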

BTW, at the tip of the 0.98 branch (with HBASE-11742), the related code looks a bit different.

Cheers


On Sun, Sep 7, 2014 at 8:27 AM, Brian Jeltema 
brian.jelt...@digitalenvoy.net wrote:


  Eclipse doesn't show that RestoreSnapshotHelper.restoreHdfsRegions() is
  called by initTableSnapshotMapperJob (in master branch)
 
  Looking at TableMapReduceUtil.java in 0.98, I don't see direct relation
  between the two.
 
  Do you have stack trace or something else showing the relationship ?

 Right. That’s what I meant by ‘indirectly’. This is a stack trace that was
 caused by an ownership conflict:

 java.io.IOException: java.util.concurrent.ExecutionException:
 org.apache.hadoop.security.AccessControlException: Permission denied:
 user=hbase, access=WRITE,
 inode=/apps/hbase/data/archive/data/default/Host/c41d632d5eee02e1883215460e5c261d/p:hdfs:hdfs:drwxr-xr-x
  at
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
  at
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
  at
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
  at
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
  at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
  at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
  at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
  at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
  at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
  at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
  at
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
  at
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
  at
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
  at
 org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:131)
  at
 org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.cloneHdfsRegions(RestoreSnapshotHelper.java:475)
  at
 org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:208)
  at
 org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.copySnapshotForScanner(RestoreSnapshotHelper.java:733)
  at
 org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat.setInput(TableSnapshotInputFormat.java:397)
  at
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableSnapshotMapperJob(TableMapReduceUtil.java:301)
  at
 net.digitalenvoy.hp.job.ParseHostnamesJob.run(ParseHostnamesJob.java:77)
  at net.digitalenvoy.hp.HostProcessor.run(HostProcessor.java:165)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at net.digitalenvoy.hp.HostProcessor.main(HostProcessor.java:47)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 
  Cheers
 
 
  On Sun, Sep 7, 2014 at 5:48 AM, Brian Jeltema 
  brian.jelt...@digitalenvoy.net wrote:
 
  initTableSnapshotMapperJob writes into this directory (indirectly) via
  RestoreSnapshotHelper.restoreHdfsRegions
 
  Is this expected? I would have expected writes to be limited to the temp
  directory passed in the init call
 
  Brian
 
  On Sep 7, 2014, at 8:17 AM, Ted Yu yuzhih...@gmail.com wrote:
 
  The files under archive directory are referenced by snapshots.
  Please don't delete them manually.
 
  You can delete unused snapshots.
 
  Cheers
 
  On Sep 7, 2014, at 4:08 AM, Brian Jeltema 
  brian.jelt...@digitalenvoy.net wrote:
 
 
  On Sep 6, 2014, at 9:32 AM, Ted Yu yuzhih...@gmail.com wrote:
 
  Can you post your hbase-site.xml ?
 
  /apps/hbase/data/archive/data/default is where 

Re: need help understanding log output

2014-09-07 Thread Ted Yu
When the number of attempts is greater than the value of
hbase.client.start.log.errors.counter (default 9), AsyncProcess produces log
entries like the ones cited below.
The interval following 'retrying after' is the backoff time before the next retry.
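
For reference, these are the client settings involved as I understand them (a sketch; please double check the key names and defaults against your release):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ClientRetrySettings {
  public static Configuration tuned() {
    Configuration conf = HBaseConfiguration.create();
    // Attempts below this count aren't logged, which is why logging starts at attempt=10 (default 9).
    conf.setInt("hbase.client.start.log.errors.counter", 0);
    // The '/35' in 'attempt=N/35' comes from the retry cap (default 35).
    conf.setInt("hbase.client.retries.number", 35);
    // Base pause used to compute the escalating 'retrying after ... ms' backoff (default 100 ms).
    conf.setLong("hbase.client.pause", 100);
    return conf;
  }
}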

Which release of HBase are you using ?

Cheers


On Sun, Sep 7, 2014 at 8:50 AM, Brian Jeltema 
brian.jelt...@digitalenvoy.net wrote:

 I have a map/reduce job that is consistently failing with timeouts. The
 failing mapper log files contain a series
 of records similar to those below. When I look at the hbase and hdfs logs
 (on foo.net in this case) I don’t see
 anything obvious at these timestamps. The mapper task times out at/near
 attempt=25/35. Can anyone shed light
 on what these log entries mean?

 Thanks - Brian


 2014-09-07 09:36:51,421 INFO [htable-pool1-t1]
 org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
 attempt=10/35 failed 1062 ops, last exception: null on 
 foo.net,60020,1406043467187,
 tracking started null, retrying after 10029 ms, replay 1062 ops
 2014-09-07 09:37:01,642 INFO [htable-pool1-t1]
 org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
 attempt=11/35 failed 1062 ops, last exception: null on 
 foo.net,60020,1406043467187,
 tracking started null, retrying after 10023 ms, replay 1062 ops
 2014-09-07 09:37:12,064 INFO [htable-pool1-t1]
 org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
 attempt=12/35 failed 1062 ops, last exception: null on 
 foo.net,60020,1406043467187,
 tracking started null, retrying after 20182 ms, replay 1062 ops
 2014-09-07 09:37:32,708 INFO [htable-pool1-t1]
 org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
 attempt=13/35 failed 1062 ops, last exception: null on 
 foo.net,60020,1406043467187,
 tracking started null, retrying after 20140 ms, replay 1062 ops
 2014-09-07 09:37:52,940 INFO [htable-pool1-t1]
 org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
 attempt=14/35 failed 1062 ops, last exception: null on 
 foo.net,60020,1406043467187,
 tracking started null, retrying after 20041 ms, replay 1062 ops
 2014-09-07 09:38:13,324 INFO [htable-pool1-t1]
 org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
 attempt=15/35 failed 1062 ops, last exception: null on 
 foo.net,60020,1406043467187,
 tracking started null, retrying after 20041 ms, replay 1062 ops




Re: One-table w/ multi-CF or multi-table w/ one-CF?

2014-09-07 Thread Michael Segel
I would suggest rethinking column families and looking at your potential for a
slightly different row key.

Going with column families doesn't really make sense.

Also, how wide are the rows (worst case)?

One idea is to make the type part of the RK…

HTH

-Mike

On Sep 7, 2014, at 2:40 AM, Jianshi Huang jianshi.hu...@gmail.com wrote:

 Hi Michael,
 
 Thanks for the questions.
 
 I'm modeling dynamic graphs in HBase; all elements (vertices, edges) have a
 timestamp, and I can query things like events between A and B for the last 7
 days.
 
 CFs are used for grouping different types of data for the same account.
 However, I have a lot of skew in the data, so to avoid having too much in the
 same row I had to move what was in CQs into the RKs. So a CF now acts more like
 a table.
 
 There's one CF containing a sequence of events ordered by timestamp, and this
 CF is quite different, as its use case is mostly in mapreduce jobs.
 
 Jianshi
 
 
 
 
 On Sun, Sep 7, 2014 at 4:52 AM, Michael Segel michael_se...@hotmail.com
 wrote:
 
 Again, a silly question.
 
 Why are you using column families?
 
 Just to play devil’s advocate in terms of design, why are you not treating
 your row as a record? Think hierarchical, not relational.
 
 This really gets into some design theory.
 
 Think of a Column Family as a way to group data that has the same row key and
 references the same thing, yet where the data in each column family is used
 separately.
 The example I always turn to when teaching is to think of an order entry
 system at a retailer.
 
 You generate data which is segmented by business process (order entry, pick
 slips, shipping, invoicing). All reflect a single order, yet the data in each
 process tends to be accessed separately.
 (You don't need the order entry when using the pick slip to pull orders from
 the warehouse.) So here, the data access pattern is that each column family is
 used separately, except in generating the data (the order entry is used to
 generate the pick slip(s) and set up things like backorders, and then the pick
 process generates the shipping slip(s), etc.). And since they are all focused
 on the same order, they have the same row key.
 
 So it's reasonable to ask how you are accessing the data and how you are
 designing your HBase model.
 
 Many times, developers create a model using column families because the
 developer is thinking in terms of relationships, not access patterns on the
 data.
 
 Does this make sense?
 
 
 On Sep 6, 2014, at 7:46 PM, Jianshi Huang jianshi.hu...@gmail.com wrote:
 
 BTW, a little explanation about the binning I mentioned.
 
 Currently the rowkey looks like type_of_events#rev_timestamp#id.
 
 And with binning, it looks like bin_number#type_of_events#rev_timestamp#id.
 The bin_number could be id % 256 or timestamp % 256. And the table could be
 pre-split. So future ingestions could do parallel insertion to the #bin
 regions, even without pre-splitting.
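 
 Roughly, in code, a binned key would be built like this (a sketch: the '#'
 delimiter, zero padding, and hash choice are illustrative, not my actual
 implementation):

import org.apache.hadoop.hbase.util.Bytes;

public class BinnedRowKey {
  static final int NUM_BINS = 256;

  static byte[] rowKey(String eventType, long timestampMillis, String id) {
    int bin = (id.hashCode() & Integer.MAX_VALUE) % NUM_BINS; // or (int) (timestampMillis % NUM_BINS)
    long revTs = Long.MAX_VALUE - timestampMillis;            // reverse timestamp: newest first
    return Bytes.toBytes(String.format("%03d#%s#%019d#%s", bin, eventType, revTs, id));
  }
}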
 
 
 Jianshi
 
 
 On Sun, Sep 7, 2014 at 2:34 AM, Jianshi Huang jianshi.hu...@gmail.com
 wrote:
 
 Each range might span multiple regions, depending on the data size I want to
 scan for MR jobs.
 
 The ranges are dynamic, specified by the user, but the number of bins can be
 static (when the table/schema is created).
 
 Jianshi
 
 
 On Sun, Sep 7, 2014 at 2:23 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 bq. 16 to 256 ranges
 
 Would each range be within a single region, or might a range span regions?
 Are the ranges dynamic?
 
 Using the command line for multiple ranges would be out of the question. A
 file with the ranges is needed.
 
 Cheers
 
 
 On Sat, Sep 6, 2014 at 11:18 AM, Jianshi Huang 
 jianshi.hu...@gmail.com
 wrote:
 
 Thanks Ted for the reference.
 
 That's right: extend the row.start and row.end to specify multiple ranges,
 and also getSplits.
 
 I would probably bin the event sequence CF into 16 to 256 bins, so 16 to 256
 ranges.
 
 Jianshi
 
 
 
 On Sun, Sep 7, 2014 at 2:09 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 Please refer to HBASE-5416 (Filter on one CF and if a match, then load and
 return full row).
 
 bq. to extend TableInputFormat to accept multiple row ranges
 
 You mean extending hbase.mapreduce.scan.row.start and
 hbase.mapreduce.scan.row.stop so that multiple ranges can be
 specified ?
 How many such ranges do you normally need ?
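 
 For reference, today those properties configure a single range, e.g. (a
 sketch; the table name and key values are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;

public class SingleRangeScanConfig {
  public static Configuration configure() {
    Configuration conf = HBaseConfiguration.create();
    conf.set(TableInputFormat.INPUT_TABLE, "events");         // placeholder table
    conf.set(TableInputFormat.SCAN_ROW_START, "000#login#");  // hbase.mapreduce.scan.row.start
    conf.set(TableInputFormat.SCAN_ROW_STOP, "000#login$");   // hbase.mapreduce.scan.row.stop
    return conf;
  }
}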
 
 Cheers
 
 
 On Sat, Sep 6, 2014 at 11:01 AM, Jianshi Huang 
 jianshi.hu...@gmail.com
 wrote:
 
 Thanks Ted,
 
 I'll pre-split the table during ingestion. The reason to keep the rowkey
 monotonic is to make it easier to work with TableInputFormat; otherwise I
 would've binned it into 256 splits. (Well, I think a good way is to extend
 TableInputFormat to accept multiple row ranges; if there's an existing
 efficient implementation, please let me know. :)
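 
 Concretely, the pre-split I have in mind is something like this (a sketch;
 the table and column family names are placeholders):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitTable {
  public static void main(String[] args) throws Exception {
    int numBins = 256;
    byte[][] splits = new byte[numBins - 1][];
    for (int i = 1; i < numBins; i++) {
      splits[i - 1] = Bytes.toBytes(String.format("%03d", i)); // one split per bin prefix
    }
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    try {
      HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("events")); // placeholder table
      desc.addFamily(new HColumnDescriptor("e"));                                // placeholder CF
      admin.createTable(desc, splits);
    } finally {
      admin.close();
    }
  }
}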
 
 Would you elaborate a little more on the heap memory usage during a scan? Is
 there any reference for that?
 
 Jianshi
 
 
 
 On Sun, Sep 7, 2014 at 1:20 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 If you use monotonically increasing rowkeys, separating