I ran into the following problem while running a Hadoop job written in Pig.
Please help me figure out what caused it. As far as I can tell, the
JobTracker/TaskTracker failed for some reason, but the NameNode and DataNodes
are still functioning.

The job seems to make no progress at all (no output, no log), although a
couple of other Hadoop jobs ran successfully before this one. "hadoop fs -ls"
can still list files. But when I ran "hadoop job -list", it took a long time
and then failed with the following error message:

Exception in thread "main" java.io.IOException: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:435)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:429)
    at org.apache.hadoop.mapred.JobClient.run(JobClient.java:1512)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1727)
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:271)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at java.io.DataInputStream.readInt(DataInputStream.java:370)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:493)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)

The web interface to job trac...@50030 did not respond at all. In netstat,
port 50030 sometimes shows up and sometimes does not. The connections and
ports for the data nodes were shown there.
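
For anyone who wants to repeat the port check without watching netstat, here
is a minimal Python sketch that tries to open a TCP connection to the two
ports mentioned above (50002 from the error message and 50030 for the web
interface). The host name "localhost" and the helper name port_open are
placeholders of mine, not anything from my setup. Keep in mind that a
successful connect only shows that something is listening; the "Connection
reset by peer" above happens after the connection is made, so this cannot
confirm that the service is healthy.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and connects with a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    host = "localhost"  # assumption: replace with the JobTracker host
    for name, port in (("RPC port", 50002), ("web interface", 50030)):
        state = "accepting connections" if port_open(host, port) else "not reachable"
        print(f"{name} {host}:{port} is {state}")
```

Running it a few times in a row would show whether 50030 really comes and
goes, as it appeared to in netstat.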
Then, when I ran another Pig job, it failed with the following error:

Error before Pig is launched
----------------------------
ERROR 6009: Failed to create job client:Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer

org.apache.pig.backend.executionengine.ExecException: ERROR 6009: Failed to create job client:Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:217)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:137)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:199)
    at org.apache.pig.PigServer.<init>(PigServer.java:169)
    at org.apache.pig.PigServer.<init>(PigServer.java:158)
    at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
    at org.apache.pig.Main.main(Main.java:395)
Caused by: java.io.IOException: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:435)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:429)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:398)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:212)
    ... 6 more
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:271)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at java.io.DataInputStream.readInt(DataInputStream.java:370)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:493)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)
================================================================================

Thanks,
Michael