Has anyone encountered this problem? Is it a known issue or a common mistake?
I'm running a Pig script that uses HBase storage. It's a 3-node cluster
(0.94.7), and I'm launching the script from the same machine that runs the
HBase master. The M/R job hangs in preparation state (4). A thread dump of the
solitary map process shows that it is hanging indefinitely here:
java.lang.Thread.State: TIMED_WAITING (sleeping)
....
at org.apache.hadoop.hbase.util.RetryCounter.sleepUntilNextRetry(RetryCounter.java:54)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:304)
....
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:124)
- locked <0x00000007b237bd78> (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker)
at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:83)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:989)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1102)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1000)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1102)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1004)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:961)
at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:227)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:170)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:129)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:201)
....
No compactions are running, and no other M/R jobs are running. The HBase
shell works from that node. The destination table is new, empty, and
plain-jane.
The worst part is that the task tracker isn't smart enough to kill the process
after I kill the M/R job, so I have to chase the task around the nodes: kill,
kill, kill, ...
regards,
-Jess