Hi,
I am able to run Apache Giraph successfully on around 500M pairs to
find connected components. It works well, but not every time; the problem
appears to be a ZooKeeper session timeout. Some of the clients (around 5-10
out of 100) produce the error below, and the master then fails because of
it. Do you have any suggestions for this error? Any suggestions will be
appreciated.

2013-10-02 01:20:43,651 WARN org.apache.giraph.bsp.BspService:
process: Disconnected from ZooKeeper (will automatically try to
recover) WatchedEvent state:Disconnected type:None path:null
2013-10-02 01:20:44,035 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server had22.rsk.admobius.com/10.240.51.32:2181.
Will not attempt to authenticate using SASL (Unable to locate a login
configuration)
2013-10-02 01:20:44,035 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to had22.rsk.admobius.com/10.240.51.32:2181,
initiating session
2013-10-02 01:20:44,037 INFO org.apache.zookeeper.ClientCnxn: Unable
to reconnect to ZooKeeper service, session 0x441604c97412331 has
expired, closing socket connection
2013-10-02 01:20:44,037 WARN org.apache.giraph.bsp.BspService:
process: Got unknown null path event WatchedEvent state:Expired
type:None path:null
2013-10-02 01:20:44,038 INFO org.apache.zookeeper.ClientCnxn:
EventThread shut down
2013-10-02 01:21:20,046 INFO
org.apache.giraph.worker.VertexInputSplitsCallable:
readVertexInputSplit: Loaded 250000 vertices at 1827.2925619484213
vertices/sec 1728790 edges at 12636.730317550928 edges/sec Memory
(free/total/max) = 1745.60M / 2262.19M / 2730.69M
2013-10-02 01:21:24,788 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201309260044_1132/_vertexInputSplitDir/601 (v=261131,
e=1808572)
2013-10-02 01:21:24,789 ERROR
org.apache.giraph.utils.LogStacktraceCallable: Execution of callable
failed
java.lang.IllegalStateException: markInputSplitPathFinished: KeeperException on /_hadoopBsp/job_201309260044_1132/_vertexInputSplitDir/601/_vertexInputSplitFinished
        at org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplitsHandler.java:168)
        at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:226)
        at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:161)
        at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:58)
        at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /_hadoopBsp/job_201309260044_1132/_vertexInputSplitDir/601/_vertexInputSplitFinished
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
        at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:152)
        at org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplitsHandler.java:159)
        ... 9 more
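
One thing that may be worth checking before raising the timeout on the Giraph side: as far as I know, the requested session timeout (the giraph.zkSessionMsecTimeout option, if I have the name right) is only a request, and the ZooKeeper server clamps it to the range between minSessionTimeout and maxSessionTimeout, which default to 2 x tickTime and 20 x tickTime. The sketch below only illustrates that arithmetic. The class and method names are made up for illustration, and the tickTime of 2000 ms and the requested 600000 ms are assumed values, not settings taken from this cluster.

```java
// Hypothetical helper, not part of Giraph or ZooKeeper: it mirrors the
// documented default clamping of a requested session timeout on the server.
final class SessionTimeoutSketch {

    // Returns the timeout the server would grant when minSessionTimeout and
    // maxSessionTimeout are left at their defaults (2x and 20x tickTime).
    static int negotiatedTimeoutMs(int requestedMs, int tickTimeMs) {
        int minMs = 2 * tickTimeMs;
        int maxMs = 20 * tickTimeMs;
        return Math.max(minMs, Math.min(maxMs, requestedMs));
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000;    // assumed value; check tickTime in zoo.cfg
        int requestedMs = 600000; // hypothetical value requested by the job
        System.out.println("requested=" + requestedMs
                + " ms, granted=" + negotiatedTimeoutMs(requestedMs, tickTimeMs) + " ms");
    }
}
```

Under those assumptions a 10 minute request would be cut down to 40 seconds, so raising the client-side value alone might have no effect unless maxSessionTimeout is also raised in zoo.cfg on the ZooKeeper servers.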


-- 
Best Regards,
Jyotirmoy Sundi
Admobius

San Francisco, CA 94158


On Thu, Sep 26, 2013 at 6:08 PM, Jyotirmoy Sundi <[email protected]> wrote:

> Hi,
>
>    I got connected components working for 1B nodes, but when I run the job 
> again, it fails with the error below. Apart from this, the data in the 
> ZooKeeper data directory is not cleared. For successful jobs, the Giraph 
> data in ZooKeeper is cleared.
>
> The following errors seem to occur because the node tries to connect to 
> ZooKeeper with a session ID that has already been cleared, as seen in
>
> "Client session timed out, have not heard from server in 68845ms for 
> sessionid 0x3415cc6ce930059, closing socket connection and attempting 
> reconnect". Any idea whether increasing the session timeout would help?
>
> 2013-09-27 00:57:11,748 WARN org.apache.giraph.bsp.BspService: process: Got 
> unknown null path event WatchedEvent state:Expired type:None path:null
> 2013-09-27 00:57:11,748 INFO org.apache.zookeeper.ClientCnxn: Unable to 
> reconnect to ZooKeeper service, session 0x3415cc6ce930059 has expired, 
> closing socket connection
> 2013-09-27 00:57:11,748 WARN org.apache.giraph.worker.InputSplitsHandler: 
> process: Problem with zookeeper, got event with path null, state Expired, 
> event type None
> 2013-09-27 00:57:11,748 INFO org.apache.zookeeper.ClientCnxn: EventThread 
> shut down
> 2013-09-27 00:57:11,925 INFO org.apache.giraph.worker.InputSplitsCallable: 
> loadFromInputSplit: Finished loading 
> /_hadoopBsp/job_201309260044_0116/_vertexInputSplitDir/89 (v=258127, 
> e=1792906)
> 2013-09-27 00:57:11,926 ERROR org.apache.giraph.utils.LogStacktraceCallable: 
> Execution of callable failed
> java.lang.IllegalStateException: markInputSplitPathFinished: KeeperException on /_hadoopBsp/job_201309260044_0116/_vertexInputSplitDir/89/_vertexInputSplitFinished
>       at org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplitsHandler.java:168)
>       at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:226)
>       at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:161)
>       at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:58)
>       at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>       at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /_hadoopBsp/job_201309260044_0116/_vertexInputSplitDir/89/_vertexInputSplitFinished
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>       at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
>       at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:152)
>       at org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplitsHandler.java:159)
>       ... 9 more
>
>
> --
> Best Regards,
> Jyotirmoy Sundi
> Data Engineer,
> Admobius
>
> San Francisco, CA 94158
>


