I'm running an HBase data import on 0.1.3. After 42 million rows, the import 
fails with an RPC timeout exception. I've tried twice, once on a 2-node cluster 
and once on a 10-node cluster (EC2, with the same configuration), and it failed 
both times in the same spot, somewhere between 42 and 43 million rows. Where 
should I look to debug this?

From the hbase shell, I can query the table and see that the rows have been 
inserted, but when I run 'hadoop dfs -ls' I don't see the /hbase dir I 
specified. I suspect the data is not being stored in DFS, and I'm unsure 
where it is being stored instead.

Last log entries from the HBase master (HMaster):
2008-07-25 13:46:10,196 INFO org.apache.hadoop.hbase.HMaster: 
HMaster.rootScanner scanning meta region {regionname: -ROOT-,,0, startKey: <>, 
server: 10.254.171.22:60020}
2008-07-25 13:46:10,213 DEBUG org.apache.hadoop.hbase.HMaster: 
HMaster.rootScanner regioninfo: {regionname: .META.,,1, startKey: <>, endKey: 
<>, encodedName: 1028785192, tableDesc: {name: .META., families: {info:={name: 
info, max versions: 1, compression: NONE, in memory: false, max length: 
2147483647, bloom filter: none}}}}, server: 10.254.243.146:60020, startCode: 
1216947114706
2008-07-25 13:46:10,214 INFO org.apache.hadoop.hbase.HMaster: 
HMaster.rootScanner scan of meta region {regionname: -ROOT-,,0, startKey: <>, 
server: 10.254.171.22:60020} complete

Last log entries from one of the region servers:
2008-07-25 13:44:28,190 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache 
flush for region relations,,1216948402123. Current region memcache size 0.0
2008-07-25 13:44:28,190 DEBUG org.apache.hadoop.hbase.HRegion: Finished 
memcache flush for region relations,,1216948402123 in 0ms, sequence id=32
2008-07-25 13:44:28,190 DEBUG org.apache.hadoop.hbase.HRegionServer: Compaction 
requested for region: relations,,1216948402123
2008-07-25 13:44:28,190 INFO org.apache.hadoop.hbase.HRegion: checking 
compaction on region relations,,1216948402123
2008-07-25 13:44:28,192 INFO org.apache.hadoop.hbase.HRegion: checking 
compaction completed on region relations,,1216948402123; status: false; 0sec

Last log lines from one of the data nodes:
2008-07-25 10:10:33,398 INFO org.apache.hadoop.dfs.DataNode: BlockReport of 28 
blocks got processed in 3 msecs
2008-07-25 11:08:15,040 INFO org.apache.hadoop.dfs.DataNode: BlockReport of 28 
blocks got processed in 2 msecs
2008-07-25 12:05:56,871 INFO org.apache.hadoop.dfs.DataNode: BlockReport of 28 
blocks got processed in 2 msecs
2008-07-25 13:03:38,503 INFO org.apache.hadoop.dfs.DataNode: BlockReport of 28 
blocks got processed in 2 msecs

The relevant portion of my hbase-site.xml:
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://domU-12-31-39-00-E9-23:50001/hbase</value>
    <description>The directory shared by region servers.
    </description>
  </property>
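
One way to double-check which filesystem that setting actually points at is to read the property back and list the directory by its fully qualified URI. What follows is only a minimal sketch in Python, not anything HBase ships: the wrapping <configuration> element and the helper names get_property and describe_rootdir are assumptions made for illustration, and the fragment is pasted inline rather than read from the real file.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: the <property> element sits inside a <configuration> root,
# as in a typical hbase-site.xml. The fragment is inlined for the sketch.
HBASE_SITE_FRAGMENT = """
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://domU-12-31-39-00-E9-23:50001/hbase</value>
    <description>The directory shared by region servers.
    </description>
  </property>
</configuration>
"""


def get_property(xml_text, name):
    """Return the value of the named property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None


def describe_rootdir(xml_text):
    """Split hbase.rootdir into its parts and build a listing command."""
    value = get_property(xml_text, "hbase.rootdir")
    if value is None:
        return None
    uri = urlparse(value)
    return {
        "scheme": uri.scheme,      # "hdfs" means the data should live in DFS
        "namenode": uri.netloc,    # host:port the region servers write to
        "path": uri.path,          # directory inside that filesystem
        # Listing by full URI avoids depending on the shell's default filesystem.
        "ls_command": "hadoop dfs -ls " + value,
    }


if __name__ == "__main__":
    info = describe_rootdir(HBASE_SITE_FRAGMENT)
    for key, val in info.items():
        print(key + ": " + val)
```

Listing with the full hdfs:// URI may help rule out the case where a bare 'hadoop dfs -ls' is looking at my home directory, or at a different default filesystem than the one named in hbase.rootdir.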


Any ideas on where I can look to find an error message to help make sense of 
this?


      
