I think you should look for the actual problem in the logs of the other workers: the master log only shows that node7 and node9 went missing on superstep 3, not why they died.
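
Also, if I read the trace right, the ArrayIndexOutOfBoundsException: -1 from getLastGoodCheckpoint is most likely a symptom rather than the cause: once the two workers disappeared, the master tried to fall back to the last good checkpoint and apparently found none. If you want the job to survive a lost worker, enabling checkpointing should help. A minimal sketch, assuming the usual giraph.checkpointFrequency / giraph.checkpointDirectory options (the directory path is just an example):

  import org.apache.giraph.conf.GiraphConfiguration;

  /** Sketch: enable checkpointing so the master has something to
   *  roll back to when a worker dies. */
  public class EnableCheckpointing {
      public static void main(String[] args) {
          GiraphConfiguration conf = new GiraphConfiguration();

          // Write a checkpoint every 2 supersteps (a frequency of 0
          // disables checkpointing entirely).
          conf.setInt("giraph.checkpointFrequency", 2);

          // HDFS directory for the checkpoint files (path is made up;
          // use anything your job can write to).
          conf.set("giraph.checkpointDirectory", "/user/bu/giraph-checkpoints");

          // ... set your computation class, input/output formats and
          // worker count here as usual, then submit the job ...
          System.out.println("checkpointing every "
              + conf.getInt("giraph.checkpointFrequency", 0) + " supersteps");
      }
  }

The same options should also work on the GiraphRunner command line, e.g. -ca giraph.checkpointFrequency=2.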

On Fri, Sep 6, 2013 at 8:06 PM, Bu Xiao <[email protected]> wrote:

> Thanks Claudio and Gustavo for your answers. I have another question. I run
> my algorithm on a cluster of 20 nodes. When I specify the number of
> workers to be 10 (or more), the algorithm works well and produces the
> expected output. But if the number of workers is less than 10, I get the
> following failure (this is the log of the master/ZooKeeper task).
> 2013-09-06 10:39:04,313 INFO org.apache.giraph.comm.netty.NettyClient:
> connectAllAddresses: Successfully added 0 connections, (0 total connected)
> 0 failed, 0 failures total.
> 2013-09-06 10:39:04,313 INFO
> org.apache.giraph.partition.PartitionBalancer:
> balancePartitionsAcrossWorkers: Using algorithm static
> 2013-09-06 10:39:04,314 INFO org.apache.giraph.partition.PartitionUtils:
> analyzePartitionStats: Vertices - Mean: 200000, Min: Worker(hostname=
> node1.cluster.net, MRtaskID=5, port=30005) - 200000, Max: Worker(hostname=
> node7.cluster.net, MRtaskID=1, port=30001) - 200000
> 2013-09-06 10:39:04,314 INFO org.apache.giraph.partition.PartitionUtils:
> analyzePartitionStats: Edges - Mean: 10019985, Min: Worker(hostname=
> node9.cluster.net, MRtaskID=4, port=30004) - 10000354, Max:
> Worker(hostname=node5.cluster.net, MRtaskID=2, port=30002) - 10088901
> 2013-09-06 10:39:04,339 INFO org.apache.giraph.master.BspServiceMaster:
> barrierOnWorkerList: 0 out of 5 workers finished on superstep 2 on path
> /_hadoopBsp/job_201309060934_0013/_applicationAttemptsDir/0/_superstepDir/2/_workerFinishedDir
> 2013-09-06 10:39:04,340 INFO org.apache.giraph.master.BspServiceMaster:
> barrierOnWorkerList: Waiting on [node8.cluster.net_3, node1.cluster.net_5,
> node9.cluster.net_4, node5.cluster.net_2, node7.cluster.net_1]
>  2013-09-06 10:40:15,255 INFO
> org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server window
> metrics MBytes/sec sent = 0, MBytes/sec received = 0, MBytesSent = 0,
> MBytesReceived = 0, ave sent req MBytes = 0, ave received req MBytes = 0,
> secs waited = 71.241
> 2013-09-06 10:40:15,291 INFO org.apache.giraph.master.BspServiceMaster:
> barrierOnWorkerList: 3 out of 5 workers finished on superstep 2 on path
> /_hadoopBsp/job_201309060934_0013/_applicationAttemptsDir/0/_superstepDir/2/_workerFinishedDir
> 2013-09-06 10:40:15,291 INFO org.apache.giraph.master.BspServiceMaster:
> barrierOnWorkerList: Waiting on [node1.cluster.net_5, node7.cluster.net_1]
> 2013-09-06 10:40:15,388 INFO org.apache.giraph.master.BspServiceMaster:
> aggregateWorkerStats: Aggregation found
> (vtx=1000000,finVtx=0,edges=50099927,msgCount=0,msgBytesCount=0,haltComputation=false)
> on superstep = 2
> 2013-09-06 10:40:15,394 INFO org.apache.giraph.master.BspServiceMaster:
> coordinateSuperstep: Cleaning up old Superstep
> /_hadoopBsp/job_201309060934_0013/_applicationAttemptsDir/0/_superstepDir/1
> 2013-09-06 10:40:15,531 INFO org.apache.giraph.master.MasterThread:
> masterThread: Coordination of superstep 2 took 71.313 seconds ended with
> state THIS_SUPERSTEP_DONE and is now on superstep 3
> 2013-09-06 10:40:15,563 INFO org.apache.giraph.comm.netty.NettyClient:
> connectAllAddresses: Successfully added 0 connections, (0 total connected)
> 0 failed, 0 failures total.
> 2013-09-06 10:40:15,563 INFO
> org.apache.giraph.partition.PartitionBalancer:
> balancePartitionsAcrossWorkers: Using algorithm static
> 2013-09-06 10:40:15,564 INFO org.apache.giraph.partition.PartitionUtils:
> analyzePartitionStats: Vertices - Mean: 200000, Min: Worker(hostname=
> node1.cluster.net, MRtaskID=5, port=30005) - 200000, Max: Worker(hostname=
> node7.cluster.net, MRtaskID=1, port=30001) - 200000
> 2013-09-06 10:40:15,564 INFO org.apache.giraph.partition.PartitionUtils:
> analyzePartitionStats: Edges - Mean: 10019985, Min: Worker(hostname=
> node9.cluster.net, MRtaskID=4, port=30004) - 10000354, Max:
> Worker(hostname=node5.cluster.net, MRtaskID=2, port=30002) - 10088901
> 2013-09-06 10:40:15,587 INFO org.apache.giraph.master.BspServiceMaster:
> barrierOnWorkerList: 0 out of 5 workers finished on superstep 3 on path
> /_hadoopBsp/job_201309060934_0013/_applicationAttemptsDir/0/_superstepDir/3/_workerFinishedDir
> 2013-09-06 10:40:15,587 INFO org.apache.giraph.master.BspServiceMaster:
> barrierOnWorkerList: Waiting on [node8.cluster.net_3, node1.cluster.net_5,
> node9.cluster.net_4, node5.cluster.net_2, node7.cluster.net_1]
>  2013-09-06 10:50:18,111 ERROR org.apache.giraph.master.BspServiceMaster:
> superstepChosenWorkerAlive: Missing chosen worker Worker(hostname=
> node7.cluster.net, MRtaskID=1, port=30001) on superstep 3
> 2013-09-06 10:50:18,111 ERROR org.apache.giraph.master.BspServiceMaster:
> superstepChosenWorkerAlive: Missing chosen worker Worker(hostname=
> node9.cluster.net, MRtaskID=4, port=30004) on superstep 3
> 2013-09-06 10:50:18,111 INFO org.apache.giraph.master.MasterThread:
> masterThread: Coordination of superstep 3 took 602.58 seconds ended with
> state WORKER_FAILURE and is now on superstep 3
> 2013-09-06 10:50:18,118 ERROR org.apache.giraph.master.MasterThread:
> masterThread: Master algorithm failed with ArrayIndexOutOfBoundsException
> java.lang.ArrayIndexOutOfBoundsException: -1
>         at
> org.apache.giraph.master.BspServiceMaster.getLastGoodCheckpoint(BspServiceMaster.java:1272)
>         at org.apache.giraph.master.MasterThread.run(MasterThread.java:139)
> 2013-09-06 10:50:18,119 FATAL org.apache.giraph.graph.GraphMapper:
> uncaughtException: OverrideExceptionHandler on thread
> org.apache.giraph.master.MasterThread, msg =
> java.lang.ArrayIndexOutOfBoundsException: -1, exiting...
> java.lang.IllegalStateException: java.lang.ArrayIndexOutOfBoundsException:
> -1
>         at org.apache.giraph.master.MasterThread.run(MasterThread.java:185)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
>         at
> org.apache.giraph.master.BspServiceMaster.getLastGoodCheckpoint(BspServiceMaster.java:1272)
>         at org.apache.giraph.master.MasterThread.run(MasterThread.java:139)
> 2013-09-06 10:50:18,122 INFO org.apache.giraph.zk.ZooKeeperManager: run:
> Shutdown hook started.
> 2013-09-06 10:50:18,122 WARN org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper
> process.
> 2013-09-06 10:50:18,495 INFO org.apache.zookeeper.ClientCnxn: Unable to
> read additional data from server sessionid 0x140f459adcd0000, likely server
> has closed socket, closing socket connection and attempting reconnect
> 2013-09-06 10:50:18,496 INFO org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143
> typically means killed).
>
> Thank you.
>
>
> On Fri, Sep 6, 2013 at 3:51 AM, Gustavo Enrique Salazar Torres <
> [email protected]> wrote:
>
>> Hi Bu:
>> Until the interface with Gora is available, you could use Apache Sqoop to
>> import your MySQL table into HDFS and then run your Giraph job.
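
To make that concrete, here is a rough sketch of driving the import programmatically, assuming Sqoop's Sqoop.runTool() entry point (connection string, credentials, table and target directory are made up; the plain sqoop CLI with the same arguments works just as well):

  import org.apache.sqoop.Sqoop;

  /** Sketch: dump a MySQL table to HDFS as tab-separated text files,
   *  which a Giraph text input format can then read. */
  public class ImportVerticesWithSqoop {
      public static void main(String[] args) {
          String[] sqoopArgs = {
              "import",
              "--connect", "jdbc:mysql://dbhost/mygraph",  // made-up DB
              "--username", "giraph",
              "--password", "secret",
              "--table", "vertices",                       // made-up table
              "--target-dir", "/user/bu/vertices",         // HDFS output dir
              "--fields-terminated-by", "\t"
          };
          // Returns the sqoop tool's exit code (0 on success).
          System.exit(Sqoop.runTool(sqoopArgs));
      }
  }
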
>>
>> Cheers
>> Gustavo
>> On 06/09/2013 04:43, "Claudio Martella" <[email protected]>
>> wrote:
>>
>>> Hi Bu,
>>>
>>> no, currently we do not have a DBInputFormat. We have an open issue with
>>> a Google Summer of Code student working on a GoraInputFormat, which will
>>> also support reading from RDBMSs through Gora. However, if/when it lands,
>>> it will not provide semantics as rich as DBInputFormat's: you will only be
>>> able to run scan-like/range queries, not arbitrary queries as with
>>> DBInputFormat.
>>>
>>> I think that creating a DB[Vertex|Edge]InputFormat starting from the
>>> Hadoop DBInputFormat should not be too hard and could prove to be a very
>>> useful contribution. If you consider providing an implementation, I can
>>> offer guidance.
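
To give an idea of what that wrapping would involve: the Giraph VertexInputFormat API has moved between releases, so below is only the Hadoop side that such a DB[Vertex|Edge]InputFormat would delegate to, i.e. a DBWritable row class plus the DBInputFormat/DBConfiguration setup. Table, column and connection names are made up, and the Giraph-specific reader on top of this is left out.

  import java.io.DataInput;
  import java.io.DataOutput;
  import java.io.IOException;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;
  import java.sql.SQLException;

  import org.apache.hadoop.io.Writable;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
  import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
  import org.apache.hadoop.mapreduce.lib.db.DBWritable;

  /** One row of a hypothetical "edges" table (src_id, dst_id, weight). */
  public class EdgeRow implements Writable, DBWritable {
      public long srcId;
      public long dstId;
      public double weight;

      // How DBInputFormat reads a row from the JDBC ResultSet.
      @Override
      public void readFields(ResultSet rs) throws SQLException {
          srcId = rs.getLong("src_id");
          dstId = rs.getLong("dst_id");
          weight = rs.getDouble("weight");
      }

      // Only needed when writing back to the DB; unused for import.
      @Override
      public void write(PreparedStatement ps) throws SQLException {
          ps.setLong(1, srcId);
          ps.setLong(2, dstId);
          ps.setDouble(3, weight);
      }

      // Hadoop serialization, so the row can travel between tasks.
      @Override
      public void readFields(DataInput in) throws IOException {
          srcId = in.readLong();
          dstId = in.readLong();
          weight = in.readDouble();
      }

      @Override
      public void write(DataOutput out) throws IOException {
          out.writeLong(srcId);
          out.writeLong(dstId);
          out.writeDouble(weight);
      }

      // Wires Hadoop's DBInputFormat to the table; a Giraph
      // DB[Vertex|Edge]InputFormat would reuse its splits and records.
      public static void configure(Job job) {
          DBConfiguration.configureDB(job.getConfiguration(),
              "com.mysql.jdbc.Driver",
              "jdbc:mysql://dbhost/mygraph", "giraph", "secret");
          DBInputFormat.setInput(job, EdgeRow.class,
              "edges",            // table
              null,               // WHERE conditions (none)
              "src_id",           // ORDER BY column used for splitting
              "src_id", "dst_id", "weight");
      }
  }
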
>>>
>>> Best,
>>> Claudio
>>>
>>>
>>> On Fri, Sep 6, 2013 at 1:45 AM, Bu Xiao <[email protected]> wrote:
>>>
>>>> Hi Giraphers,
>>>>
>>>> I am currently working on an algorithm that requires reading the
>>>> vertices from a MySQL table rather than from HDFS. I assumed there had to
>>>> be a way of reading data from a SQL table, since Giraph is built on top of
>>>> Hadoop, but I cannot seem to figure this part out. Do you have a class
>>>> similar to the DBInputFormat in Hadoop? Thank you very much for your help.
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>    Claudio Martella
>>>    [email protected]
>>>
>>
>


-- 
   Claudio Martella
   [email protected]
