[ https://issues.apache.org/jira/browse/TEZ-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224733#comment-15224733 ]

Kurt Muehlner commented on TEZ-3187:
------------------------------------

Sorry.  Here you go.  Again, I don't see this when using pig-0.15.0-h2.jar.

{code}
2016-04-04 10:05:44,579 [PigTezLauncher-0] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=FAILED, progress=TotalTasks: 87 Succeeded: 0 Running: 0 Failed: 0 Killed: 0, diagnostics=Vertex failed, vertexName=scope-192, vertexId=vertex_1456477667034_4440_1_01, diagnostics=[Vertex init failed : org.apache.tez.dag.api.TezUncheckedException: java.lang.RuntimeException: java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
        at org.apache.tez.mapreduce.committer.MROutputCommitter.getOutputCommitter(MROutputCommitter.java:139)
        at org.apache.tez.mapreduce.committer.MROutputCommitter.initialize(MROutputCommitter.java:81)
        at org.apache.tez.dag.app.dag.impl.VertexImpl$2.run(VertexImpl.java:2227)
        at org.apache.tez.dag.app.dag.impl.VertexImpl$2.run(VertexImpl.java:2206)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.initializeCommitters(VertexImpl.java:2206)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.initializeVertex(VertexImpl.java:2263)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.access$3500(VertexImpl.java:204)
        at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:2782)
        at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2675)
        at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2657)
        at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1799)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:203)
        at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2180)
        at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2166)
        at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
        at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
        at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(FileOutputFormat.java:164)
        at org.apache.pig.builtin.PigStorage.setStoreLocation(PigStorage.java:455)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setLocation(PigOutputFormat.java:175)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.setUpContext(PigOutputCommitter.java:116)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:90)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:71)
        at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigOutputFormatTez$PigOutputCommitterTez.<init>(PigOutputFormatTez.java:80)
        at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigOutputFormatTez.getOutputCommitter(PigOutputFormatTez.java:55)
        at org.apache.tez.mapreduce.committer.MROutputCommitter.getOutputCommitter(MROutputCommitter.java:137)
        ... 24 more
Caused by: java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
        at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:530)
        at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:172)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:665)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:601)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:354)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
        at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(FileOutputFormat.java:160)
        ... 32 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:513)
        ... 45 more
Caused by: java.lang.RuntimeException: Could not find any configured addresses for URI hdfs://stage-stage-hdfs-nameservice
        at org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.<init>(ConfiguredFailoverProxyProvider.java:93)
        ... 50 more
{code}
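
For context on the final cause ({{Could not find any configured addresses for URI hdfs://stage-stage-hdfs-nameservice}}): ConfiguredFailoverProxyProvider builds its list of NameNode addresses from the HDFS HA client keys in the Configuration visible to the vertex, and it throws exactly this RuntimeException when no rpc-addresses are defined for the nameservice. A rough sketch of the keys it expects, reusing the nameservice name from the log above and placeholder hostnames/ports (not taken from this cluster), would be:

{code:xml}
<!-- Sketch only: nameservice name copied from the log; hostnames and ports below are placeholders. -->
<property>
  <name>dfs.nameservices</name>
  <value>stage-stage-hdfs-nameservice</value>
</property>
<property>
  <name>dfs.ha.namenodes.stage-stage-hdfs-nameservice</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.stage-stage-hdfs-nameservice.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.stage-stage-hdfs-nameservice.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.stage-stage-hdfs-nameservice</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
{code}

So it may be worth checking whether the configuration actually shipped to the Tez AM/containers for this run (as opposed to the client that submitted the job) still contains those keys.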

> Pig on tez hang with java.io.IOException: Connection reset by peer
> ------------------------------------------------------------------
>
>                 Key: TEZ-3187
>                 URL: https://issues.apache.org/jira/browse/TEZ-3187
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.2
>         Environment: Hadoop 2.5.0
> Pig 0.15.0
> Tez 0.8.2
>            Reporter: Kurt Muehlner
>         Attachments: 10.102.173.86.logs.gz, TEZ-3187.incomplete-tasks.txt, 
> dag_1437886552023_169758_3.dot, stack.application_1437886552023_171131.out, 
> syslog_dag_1437886552023_169758_3.gz, task_attempts.tar.gz
>
>
> We are experiencing occasional application hangs when testing an existing Pig MapReduce script executing on Tez. When this occurs, we find this in the syslog for the executing DAG:
> 2016-03-21 16:39:01,643 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e11_1437886552023_169758_01_000822, containerExpiryTime=1458603541415, idleTimeout=5000, taskRequestsCount=0, heldContainers=112, delayedContainers=27, isNew=false
> 2016-03-21 16:39:01,825 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e11_1437886552023_169758_01_000824, containerExpiryTime=1458603541692, idleTimeout=5000, taskRequestsCount=0, heldContainers=111, delayedContainers=26, isNew=false
> 2016-03-21 16:39:01,990 [INFO] [Socket Reader #1 for port 53324] |ipc.Server|: Socket Reader #1 for port 53324: readAndProcess from client 10.102.173.86 threw exception [java.io.IOException: Connection reset by peer]
> java.io.IOException: Connection reset by peer
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>         at org.apache.hadoop.ipc.Server.channelRead(Server.java:2593)
>         at org.apache.hadoop.ipc.Server.access$2800(Server.java:135)
>         at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1471)
>         at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:762)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:636)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:607)
> 2016-03-21 16:39:02,032 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e11_1437886552023_169758_01_000811, containerExpiryTime=1458603541828, idleTimeout=5000, taskRequestsCount=0, heldContainers=110, delayedContainers=25, isNew=false
> In all cases I have been able to analyze so far, this also correlates with a warning on the node identified in the IOException:
> 2016-03-21 16:36:13,641 [WARN] [I/O Setup 2 Initialize: {scope-178}] |retry.RetryInvocationHandler|: A failover has occurred since the start of this method invocation attempt.
> However, it does not appear that any namenode failover has actually occurred (the most recent failover we see in the logs is from 2015).
> Attached:
> syslog_dag_1437886552023_169758_3.gz: syslog for the DAG which hangs
> 10.102.173.86.logs.gz: aggregated logs from the host identified in the IOException


