[ 
https://issues.apache.org/jira/browse/GIRAPH-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479676#comment-13479676
 ] 

Hudson commented on GIRAPH-381:
-------------------------------

Integrated in Giraph-trunk-Commit #248 (See 
[https://builds.apache.org/job/Giraph-trunk-Commit/248/])
    GIRAPH-381: Ensure we get the original exception from
GraphMapper#run(). (aching) (Revision 1399984)

     Result = SUCCESS
aching : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1399984
Files : 
* /giraph/trunk/CHANGELOG
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java

                
> Ensure we get the original exception from GraphMapper#run()
> -----------------------------------------------------------
>
>                 Key: GIRAPH-381
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-381
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Avery Ching
>            Assignee: Avery Ching
>         Attachments: GIRAPH-381.patch
>
>
> We can lose the original exception if failureCleanup() fails, because the
> exception thrown during cleanup is reported in its place. For example:
> INFO    2012-10-18 14:23:25,417 [main] 
> org.apache.giraph.graph.WorkerAggregatorHandler  - marshalAggregatorValues: 
> Finished assembling aggregator values
> INFO    2012-10-18 14:23:25,451 [main-SendThread(xxx.machine.xxx:22181)] 
> org.apache.zookeeper.ClientCnxn  - Unable to read additional data from server 
> sessionid 0x13a75baca440014, likely server has closed socket, closing socket 
> connection and attempting reconnect
> ERROR   2012-10-18 14:23:25,552 [main] 
> org.apache.giraph.graph.BspServiceWorker  - unregisterHealth: Got failure, 
> unregistering health on 
> /_hadoopBsp/job_201209271814.8652_0001/_applicationAttemptsDir/0/_superstepDir/1/_workerHealthyDir/xxx.machine.xxx_9 
> on superstep 1
> WARN    2012-10-18 14:23:25,554 [main-EventThread] 
> org.apache.giraph.graph.BspService  - process: Disconnected from ZooKeeper 
> (will automatically try to recover) WatchedEvent state:Disconnected type:None 
> path:null
> INFO    2012-10-18 14:23:26,916 [main-SendThread(xxx.machine.xxx:22181)] 
> org.apache.zookeeper.ClientCnxn  - Opening socket connection to server 
> xxx.machine.xxx/10.174.108.77:22181
> INFO    2012-10-18 14:23:26,917 [main-SendThread(xxx.machine.xxx:22181)] 
> org.apache.zookeeper.ClientCnxn  - Socket connection established to 
> xxx.machine.xxx/10.174.108.77:22181, initiating session
> WARN    2012-10-18 14:23:26,977 [main-SendThread(xxx.machine.xxx:22181)] 
> org.apache.zookeeper.ClientCnxn  - Session 0x13a75baca440014 for server 
> xxx.machine.xxx/10.174.108.77:22181, unexpected error, closing socket 
> connection and attempting reconnect
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:218)
> at sun.nio.ch.IOUtil.read(IOUtil.java:186)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
> at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:858)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130)
> WARN    2012-10-18 14:23:27,082 [main] org.apache.hadoop.mapred.Child  - 
> Error running child
> java.lang.IllegalStateException: unregisterHealth: KeeperException - Couldn't 
> delete 
> /_hadoopBsp/job_201209271814.8652_0001/_applicationAttemptsDir/0/_superstepDir/1/_workerHealthyDir/xxx.machine.xxx_9
> at 
> org.apache.giraph.graph.BspServiceWorker.unregisterHealth(BspServiceWorker.java:582)
> at 
> org.apache.giraph.graph.BspServiceWorker.failureCleanup(BspServiceWorker.java:590)
> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:608)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:632)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> at org.apache.hadoop.mapred.Child.main(Child.java:171)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
> KeeperErrorCode = ConnectionLoss for 
> /_hadoopBsp/job_201209271814.8652_0001/_applicationAttemptsDir/0/_superstepDir/1/_workerHealthyDir/xxx.machine.xxx_9
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
> at 
> org.apache.giraph.graph.BspServiceWorker.unregisterHealth(BspServiceWorker.java:576)
> ... 5 more
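
The trace above shows only the IllegalStateException raised inside failureCleanup(); whatever failed first in GraphMapper#run() is gone. The following is a minimal sketch of one way to keep the first failure, not the committed GIRAPH-381 patch. The class PreserveOriginalException and the Work and Cleanup interfaces are hypothetical stand-ins for the real Giraph types, and Throwable#addSuppressed assumes Java 7 or later.

```java
import java.io.IOException;

public class PreserveOriginalException {
    /** Hypothetical stand-in for the main body of GraphMapper#run(). */
    public interface Work {
        void execute() throws IOException;
    }

    /** Hypothetical stand-in for the worker's failureCleanup() hook. */
    public interface Cleanup {
        void failureCleanup();
    }

    /**
     * Runs the work. If it fails, attempts cleanup, but always rethrows
     * the original failure rather than any exception raised by cleanup.
     */
    public static void run(Work work, Cleanup cleanup) throws IOException {
        try {
            work.execute();
        } catch (IOException | RuntimeException original) {
            try {
                cleanup.failureCleanup();
            } catch (RuntimeException cleanupFailure) {
                // Keep the cleanup failure visible without letting it
                // replace the exception that caused the cleanup.
                original.addSuppressed(cleanupFailure);
            }
            throw original;
        }
    }
}
```

With this shape, the caller sees the first exception at the top of the stack trace, and a cleanup failure such as the ConnectionLoss case above is listed under it as a suppressed exception. Logging the cleanup failure instead of attaching it would serve the same purpose.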

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
