Hi Avery,
thanks. It worked at least once now :).
Aapo
On Oct 1, 2011, at 2:27 AM, Avery Ching wrote:
> Hi Aapo,
>
> Thanks for the error report. I think you found a bug. Can you try the
> included patch and see if the problem goes away? I got it to pass local
> and MR unittests.
>
> Avery
>
> On 9/30/11 4:08 PM, Aapo Kyrola wrote:
>>
>> Hi,
>>
>> occasionally (maybe one time in four), my Giraph run fails because of the
>> RuntimeException below.
>> According to the code, it should never happen:
>>
>> if (msgMap == null) { // should never happen after constructor
>>     throw new RuntimeException(
>>         "sendMessage: msgMap did not exist for " + addr +
>>         " for vertex " + destVertex);
>> }
>>
>>
>>
>> This happens during superstep 1 (the second superstep). My application
>> actually *adds* edges on superstep 1 (to make every out-edge also an
>> in-edge of the destination), but since I am running on only 3 workers,
>> I would be surprised if every worker had not been registered in the
>> RPC layer initially.
>>
>>
>>
>> One hypothesis is that Hadoop does something funny, because one of
>> my servers was under heavy load. Maybe Hadoop launched another worker
>> to replace a slow worker? Can that happen?
>>
>>
>>
>> java.lang.RuntimeException: sendMessage: msgMap did not exist for [hostname].ml.cmu.edu:30003 for vertex 875713
>> at org.apache.giraph.comm.BasicRPCCommunications.sendMessageReq(BasicRPCCommunications.java:825)
>> at org.apache.giraph.graph.BasicVertex.sendMsg(BasicVertex.java:179)
>> at edu.cmu.selectlab.BP.BinaryBPVertex.compute(BinaryBPVertex.java:94)
>> at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:624)
>> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>> at org.apache.hadoop.mapred.Child.main(Child.java:253)
>>
>>
>> Aapo Kyrola
>> Ph.D. student, http://www.cs.cmu.edu/~akyrola
>>
>
> <diff.txt>
Aapo Kyrola
Ph.D. student, http://www.cs.cmu.edu/~akyrola