Hi all,
I'm using HBase 0.1.3 and I have a pretty simple TableMap that randomly
hangs at OutputCollector.collect. Eventually the task gets killed because
it stops reporting back. There are no error messages in the log, and the
task's CPU usage is at 100%. I've included a thread dump below. Any ideas?

Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b22 mixed mode):

"SortSpillThread" daemon prio=10 tid=0x00002aaad40d4c00 nid=0x6729 runnable
[0x000000004173f000..0x000000004173fd80]
   java.lang.Thread.State: RUNNABLE
        at com.rexee.bandito.hadoop.logprocessing.CountFaillures$Reduce.reduce(CountFaillures.java:106)
        at com.rexee.bandito.hadoop.logprocessing.CountFaillures$Reduce.reduce(CountFaillures.java:102)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:522)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:493)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$200(MapTask.java:264)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$1.run(MapTask.java:439)
        - locked <0x00002aaab3a874e8> (a java.lang.Object)

"org.apache.hadoop.hbase.io.HbaseObjectWritable Connection Culler" daemon
prio=10 tid=0x00002aaad40bf400 nid=0x64a0 waiting on condition
[0x000000004143c000..0x000000004143ca80]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.ipc.Client$ConnectionCuller.run(Client.java:423)

"Comm thread for task_200807162013_0006_m_000001_0" daemon prio=10
tid=0x00002aaad4107400 nid=0x649f waiting on condition
[0x000000004133b000..0x000000004133ba00]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.Task$1.run(Task.java:282)
        at java.lang.Thread.run(Thread.java:619)

"[EMAIL PROTECTED]" daemon prio=10
tid=0x00002aaad4107000 nid=0x649e waiting on condition
[0x000000004123a000..0x000000004123ad80]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.dfs.DFSClient$LeaseChecker.run(DFSClient.java:605)
        at java.lang.Thread.run(Thread.java:619)

"IPC Client connection to domU-12-31-38-00-D4-21/10.252.219.207:9000" daemon
prio=10 tid=0x00002aaad4129800 nid=0x649d in Object.wait()
[0x0000000041139000..0x0000000041139d00]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aaab3acad18> (a org.apache.hadoop.ipc.Client$Connection)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:234)
        - locked <0x00002aaab3acad18> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:273)

"IPC Client connection to /127.0.0.1:55679" daemon prio=10
tid=0x00002aaad40aa000 nid=0x6499 in Object.wait()
[0x0000000041038000..0x0000000041038c80]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aaab3aa0780> (a org.apache.hadoop.ipc.Client$Connection)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:234)
        - locked <0x00002aaab3aa0780> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:273)

"org.apache.hadoop.io.ObjectWritable Connection Culler" daemon prio=10
tid=0x00002aaad40f2c00 nid=0x6498 waiting on condition
[0x0000000040f37000..0x0000000040f37c00]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.ipc.Client$ConnectionCuller.run(Client.java:423)

"Low Memory Detector" daemon prio=10 tid=0x00002aaad330ec00 nid=0x6494
runnable [0x0000000000000000..0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x00002aaad330c400 nid=0x6493 waiting
on condition [0x0000000000000000..0x0000000040c33320]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x00002aaad3308c00 nid=0x6492 waiting
on condition [0x0000000000000000..0x0000000040b322b0]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00002aaad3307800 nid=0x6491
runnable [0x0000000000000000..0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00002aaad32dd000 nid=0x6490 in
Object.wait() [0x0000000040931000..0x0000000040931d00]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aaab3ad4ad8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
        - locked <0x00002aaab3ad4ad8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x00002aaad312cc00 nid=0x648f in
Object.wait() [0x0000000040830000..0x0000000040830c80]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aaab3af78c8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x00002aaab3af78c8> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x0000000040113400 nid=0x6489 waiting for monitor entry
[0x000000004022a000..0x000000004022aec0]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:407)
        - waiting to lock <0x00002aaab3a874e8> (a java.lang.Object)
        - locked <0x00002aaab3af7928> (a org.apache.hadoop.mapred.MapTask$MapOutputBuffer)
        at com.rexee.bandito.hadoop.logprocessing.CountFaillures$Map.map(CountFaillures.java:84)
        at com.rexee.bandito.hadoop.logprocessing.CountFaillures$Map.map(CountFaillures.java:58)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)

"VM Thread" prio=10 tid=0x00002aaad312a800 nid=0x648e runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000004011e000 nid=0x648a
runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x000000004011f400 nid=0x648b
runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000040120800 nid=0x648c
runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000040121800 nid=0x648d
runnable

"VM Periodic Task Thread" prio=10 tid=0x00002aaad3310800 nid=0x6495 waiting
on condition

JNI global references: 762

Heap
 PSYoungGen      total 170560K, used 68698K [0x00002aaac8770000, 0x00002aaad2e10000, 0x00002aaad2e10000)
  eden space 170496K, 40% used [0x00002aaac8770000,0x00002aaacca7e9b8,0x00002aaad2df0000)
  from space 64K, 50% used [0x00002aaad2df0000,0x00002aaad2df8000,0x00002aaad2e00000)
  to   space 64K, 0% used [0x00002aaad2e00000,0x00002aaad2e00000,0x00002aaad2e10000)
 PSOldGen        total 236416K, used 195186K [0x00002aaab3a10000, 0x00002aaac20f0000, 0x00002aaac8770000)
  object space 236416K, 82% used [0x00002aaab3a10000,0x00002aaabf8ac9a0,0x00002aaac20f0000)
 PSPermGen       total 21248K, used 9686K [0x00002aaaae610000, 0x00002aaaafad0000, 0x00002aaab3a10000)
  object space 21248K, 45% used [0x00002aaaae610000,0x00002aaaaef85b08,0x00002aaaafad0000)
