Hi.
I've been trying to get the Nutch fetcher to work, but it always hangs in
one of the reduce tasks, and the job fails. I am using 160 map tasks and
16 reduce tasks for the fetch on an 8-machine cluster. The fetch job only
fetches; it does not parse content.

Everything works fine, except that one reduce task out of N fails in the last step.

I don't understand what's going on. Why would a reduce task fail when it is
just a simple identity reduce?

I changed the OS from Linux to FreeBSD one month ago. Could this be an OS problem?

I'd appreciate it if anyone could share their experience.

Environment:
FreeBSD 6.2-RELEASE 64-bit
java version "1.5.0"
Java(TM) 2 Runtime Environment, Standard Edition (build diablo-1.5.0-b01)
Java HotSpot(TM) 64-Bit Server VM (build diablo-1.5.0_07-b01, mixed mode)
Hadoop-0.11.2

Configuration:
160 maps
16 reduce
8 node cluster

Jobtracker-webui:
Kind     % Complete  Num Tasks  Pending  Running  Complete  Killed  Failures
map      100.00%     281        0        0        281       0       0
reduce   93.89%      16         0        0        15        1       4

Failed-webui page:
Attempt Task Machine Error Logs
task_0010_r_000000_0 tip_0010_r_000000 task_node2
Task failed to report status for 603 seconds. Killing.

task_0010_r_000000_1 tip_0010_r_000000 task_node7
Task failed to report status for 609 seconds. Killing.

task_0010_r_000000_2 tip_0010_r_000000 task_node5
Task failed to report status for 601 seconds. Killing.

task_0010_r_000000_3 tip_0010_r_000000 task_node4
Task failed to report status for 602 seconds. Killing.
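
As a rough model of what these messages imply: each attempt seems to be killed once it has gone roughly 600 seconds without reporting status (the reported values are 601 to 609 seconds). The sketch below is a toy watchdog in plain Java, not Hadoop's actual TaskTracker code. The class name TaskWatchdog, its methods, the 600-second threshold, and the second attempt id are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a status-report watchdog; NOT Hadoop's real implementation. */
public class TaskWatchdog {
    private final long timeoutMillis;
    // Task attempt id -> time (ms) of its most recent status report.
    private final Map<String, Long> lastReport = new LinkedHashMap<String, Long>();

    public TaskWatchdog(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Record that a task attempt reported status at the given time. */
    public void report(String attemptId, long nowMillis) {
        lastReport.put(attemptId, nowMillis);
    }

    /** Return kill messages for attempts silent longer than the timeout, and forget them. */
    public List<String> markUnresponsive(long nowMillis) {
        List<String> killed = new ArrayList<String>();
        Iterator<Map.Entry<String, Long>> it = lastReport.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            long silentMillis = nowMillis - entry.getValue();
            if (silentMillis > timeoutMillis) {
                killed.add(entry.getKey() + ": Task failed to report status for "
                        + (silentMillis / 1000) + " seconds. Killing.");
                it.remove();
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        // Assumed threshold of 600 s, inferred from the 601-609 s values above.
        TaskWatchdog watchdog = new TaskWatchdog(600000L);
        watchdog.report("task_0010_r_000000_0", 0L);
        // Hypothetical second attempt that keeps reporting and so survives.
        watchdog.report("task_0010_r_000001_0", 0L);
        watchdog.report("task_0010_r_000001_0", 500000L);
        for (String line : watchdog.markUnresponsive(603000L)) {
            System.out.println(line);
        }
    }
}
```

In this model an attempt survives only as long as something keeps calling report() for it, so a reduce that is stuck (or busy for a long time without reporting) is indistinguishable from a dead one.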


Jobtracker-log:
mapred.TaskInProgress - Error from task_0010_r_000000_0: Task failed to report status for 603 seconds. Killing.
mapred.TaskInProgress - Task 'task_0010_r_000000_0' has been lost.
mapred.JobTracker - Removed completed task 'task_0010_r_000000_0' from 'tracker_task_node2:50050'


Tasktracker-log:
Task failed to report status for 603 seconds. Killing.
Process Thread Dump: lost task
17 active threads
Thread 13014 (IPC Client connection to job_node/10.8.50.31:9001):
 State: WAITING
 Blocked count: 1
 Waited count: 1
 Waiting on [EMAIL PROTECTED]
 Stack:
   java.lang.Object.wait(Native Method)
   java.lang.Object.wait(Object.java:474)
   org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:209)
   org.apache.hadoop.ipc.Client$Connection.run(Client.java:248)
Thread 12538 (Thread-11192):
 State: RUNNABLE
 Blocked count: 0
 Waited count: 0
 Stack:
   java.io.FileInputStream.readBytes(Native Method)
   java.io.FileInputStream.read(FileInputStream.java:194)
   org.apache.hadoop.mapred.TaskRunner.logStream(TaskRunner.java:363)
   org.apache.hadoop.mapred.TaskRunner.access$100(TaskRunner.java:33)
   org.apache.hadoop.mapred.TaskRunner$1.run(TaskRunner.java:326)
Thread 12537 (process reaper):
 State: RUNNABLE
 Blocked count: 0
 Waited count: 0
 Stack:
   java.lang.UNIXProcess.waitForProcessExit(Native Method)
   java.lang.UNIXProcess.access$900(UNIXProcess.java:20)
   java.lang.UNIXProcess$1$1.run(UNIXProcess.java:132)
Thread 10924 (SocketListener0-9):
 State: TIMED_WAITING
 Blocked count: 2
 Waited count: 734
 Stack:
   java.lang.Object.wait(Native Method)
   org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:522)
Thread 5608 (Thread-4636):
 State: RUNNABLE
 Blocked count: 763
 Waited count: 4356
 Stack:
   java.io.FileInputStream.readBytes(Native Method)
   java.io.FileInputStream.read(FileInputStream.java:194)
   java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
   java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
   java.io.BufferedInputStream.read(BufferedInputStream.java:313)
   org.apache.hadoop.mapred.TaskRunner.logStream(TaskRunner.java:363)
   org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:330)
   org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:220)
Thread 20 ([EMAIL PROTECTED]):
 State: TIMED_WAITING
 Blocked count: 0
 Waited count: 0
 Stack:
   java.lang.Thread.sleep(Native Method)
   org.apache.hadoop.dfs.DFSClient$LeaseChecker.run(DFSClient.java:465)
   java.lang.Thread.run(Thread.java:595)
Thread 16 (org.apache.hadoop.io.ObjectWritable Connection Culler):
 State: TIMED_WAITING
 Blocked count: 14
 Waited count: 0
 Stack:
   java.lang.Thread.sleep(Native Method)
   org.apache.hadoop.ipc.Client$ConnectionCuller.run(Client.java:397)
Thread 15 (IPC Server handler 1 on 50050):
 State: TIMED_WAITING
 Blocked count: 1132
 Waited count: 50998
 Stack:
   java.lang.Object.wait(Native Method)
   org.apache.hadoop.ipc.Server$Handler.run(Server.java:510)
Thread 14 (IPC Server handler 0 on 50050):
 State: BLOCKED
 Blocked count: 1173
 Waited count: 50996
 Blocked on [EMAIL PROTECTED]
 Blocked by 1 (main)
 Stack:
   org.apache.hadoop.mapred.TaskTracker.ping(TaskTracker.java:1261)
   sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
   sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   java.lang.reflect.Method.invoke(Method.java:585)
   org.apache.hadoop.ipc.RPC$Server.call(RPC.java:337)
   org.apache.hadoop.ipc.Server$Handler.run(Server.java:538)
Thread 13 (IPC Server listener on 50050):
 State: RUNNABLE
 Blocked count: 31
 Waited count: 0
 Stack:
   sun.nio.ch.PollArrayWrapper.poll0(Native Method)
   sun.nio.ch.PollArrayWrapper.poll(PollArrayWrapper.java:100)
   sun.nio.ch.PollSelectorImpl.doSelect(PollSelectorImpl.java:56)
   sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
   sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
   sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
   org.apache.hadoop.ipc.Server$Listener.run(Server.java:230)
Thread 11 (Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=50060]):
 State: RUNNABLE
 Blocked count: 4
 Waited count: 0
 Stack:
   java.net.PlainSocketImpl.socketAccept(Native Method)
   java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
   java.net.ServerSocket.implAccept(ServerSocket.java:450)
   java.net.ServerSocket.accept(ServerSocket.java:421)
   org.mortbay.util.ThreadedServer.acceptSocket(ThreadedServer.java:432)
   org.mortbay.util.ThreadedServer$Acceptor.run(ThreadedServer.java:631)
Thread 10 (SessionScavenger):
 State: TIMED_WAITING
 Blocked count: 0
 Waited count: 0
 Stack:
   java.lang.Thread.sleep(Native Method)
   org.mortbay.jetty.servlet.AbstractSessionManager$SessionScavenger.run(AbstractSessionManager.java:587)
Thread 9 (taskCleanup):
 State: WAITING
 Blocked count: 3
 Waited count: 1024
 Waiting on null
 Stack:
   sun.misc.Unsafe.park(Native Method)
   java.util.concurrent.locks.LockSupport.park(LockSupport.java:118)
   java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1767)
   java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:359)
   org.apache.hadoop.mapred.TaskTracker$1.run(TaskTracker.java:160)
   java.lang.Thread.run(Thread.java:595)
Thread 4 (Signal Dispatcher):
 State: RUNNABLE
 Blocked count: 0
 Waited count: 0
 Stack:
Thread 3 (Finalizer):
 State: WAITING
 Blocked count: 431
 Waited count: 1030
 Waiting on [EMAIL PROTECTED]
 Stack:
   java.lang.Object.wait(Native Method)
   java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
   java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
   java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
Thread 2 (Reference Handler):
 State: WAITING
 Blocked count: 909
 Waited count: 968
 Waiting on [EMAIL PROTECTED]
 Stack:
   java.lang.Object.wait(Native Method)
   java.lang.Object.wait(Object.java:474)
   java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
Thread 1 (main):
 State: RUNNABLE
 Blocked count: 92
 Waited count: 13432
 Stack:
   sun.management.ThreadImpl.getThreadInfo0(Native Method)
   sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:144)
   sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:120)
   org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:102)
   org.apache.hadoop.util.ReflectionUtils.logThreadInfo(ReflectionUtils.java:150)
   org.apache.hadoop.mapred.TaskTracker.markUnresponsiveTasks(TaskTracker.java:655)
   org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:517)
   org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:857)
   org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1499)
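
A dump of this size is easier to scan by thread state. Below is a minimal sketch, assuming the "Thread <id> (<name>):" / "State: <STATE>" layout shown above, that lists the threads in a given state. The class ThreadDumpSummary and its method are hypothetical helpers, not part of Hadoop, and a thread header that has been wrapped across two lines will not be recognized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading a pasted thread dump; not part of Hadoop. */
public class ThreadDumpSummary {
    // Matches header lines such as "Thread 14 (IPC Server handler 0 on 50050):".
    private static final Pattern HEADER = Pattern.compile("^Thread (\\d+) \\((.*)\\):$");

    /** Returns "<id> <name>" for every thread whose State line equals wanted. */
    public static List<String> threadsInState(String dump, String wanted) {
        List<String> result = new ArrayList<String>();
        String current = null; // header of the thread whose State line we expect next
        for (String raw : dump.split("\n")) {
            String line = raw.trim();
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                current = m.group(1) + " " + m.group(2);
            } else if (current != null && line.startsWith("State: ")) {
                if (line.substring("State: ".length()).equals(wanted)) {
                    result.add(current);
                }
                current = null;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two entries copied from the dump above, abbreviated to header and state.
        String dump = "Thread 14 (IPC Server handler 0 on 50050):\n"
                + " State: BLOCKED\n"
                + " Blocked by 1 (main)\n"
                + "Thread 1 (main):\n"
                + " State: RUNNABLE\n";
        for (String thread : threadsInState(dump, "BLOCKED")) {
            System.out.println(thread);
        }
    }
}
```

Applied to the full dump, the only BLOCKED thread is Thread 14, which is blocked by main while main is itself printing the thread info.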

Thanks.
