Re: java.lang.RuntimeException [...] msgMap did not exist [...]

2012-04-17 Thread Etienne Dumoulin
Avery,

I attach the file; indeed, it looks more interesting than the others. There
is a null pointer exception:
 MapAttempt TASK_TYPE=MAP TASKID=task_201204121825_0001_m_02
 TASK_ATTEMPT_ID=attempt_201204121825_0001_m_02_0 TASK_STATUS=FAILED
 FINISH_TIME=13342517  07662 HOSTNAME=nantes
 ERROR=java\.lang\.NullPointerException
     at org\.apache\.giraph\.graph\.GraphMapper\.run(GraphMapper\.java:639)
     at org\.apache\.hadoop\.mapred\.MapTask\.runNewMapper(MapTask\.java:763)
     at org\.apache\.hadoop\.mapred\.MapTask\.run(MapTask\.java:369)
     at org\.apache\.hadoop\.mapred\.Child$4\.run(Child\.java:259)
     at java\.security\.AccessController\.doPrivileged(Native Method)
     at javax\.security\.auth\.Subject\.doAs(Subject\.java:396)
     at org\.apache\.hadoop\.security\.UserGroupInformation\.doAs(UserGroupInformation\.java:1059)
     at org\.apache\.hadoop\.mapred\.Child\.main(Child\.java:253)

Also, I found this file in
logs/history/done/version-1/rennes.local.net_1334252188432_/2012/04/13/00/job_201204121836_0003_1334307958403_hadoop_org.apache.giraph.examples.SimpleShortestPathsVert.
I ran the job on the 13th at 10 am local time; however, the date in these
logs is 20120412. In addition, the logs directory contains no job conf dated
the 13th. Does Hadoop not use the local time to name the files?
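
On the naming question, here is a small sketch that may help. It rests on two assumptions that nothing in this thread confirms: that the 12 digits in a job ID are the JobTracker start time (yyyyMMddHHmm) rather than the submission date, and that the long numbers in the history path are epoch milliseconds. The function names are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Names copied from the history path; the last one is truncated there too.
TRACKER_DIR = "rennes.local.net_1334252188432_"
HISTORY_NAME = ("job_201204121836_0003_1334307958403_hadoop_"
                "org.apache.giraph.examples.SimpleShortestPathsVert")


def parse_job_id(job_id):
    """Split 'job_<yyyyMMddHHmm>_<seq>' into (timestamp, sequence number).

    Assumption: the timestamp is when the JobTracker started, in the
    JobTracker's local time, so it is returned as a naive datetime.
    """
    match = re.fullmatch(r"job_(\d{12})_(\d+)", job_id)
    if match is None:
        raise ValueError("unexpected job ID: %r" % job_id)
    return datetime.strptime(match.group(1), "%Y%m%d%H%M"), int(match.group(2))


def parse_history_name(name):
    """Return (job ID, millisecond stamp) from a history file name."""
    match = re.match(r"(job_\d{12}_\d+)_(\d+)_", name)
    if match is None:
        raise ValueError("unexpected history file name: %r" % name)
    return match.group(1), int(match.group(2))


def millis_to_utc(millis):
    """Convert epoch milliseconds to a UTC datetime (sub-second part dropped)."""
    return datetime.fromtimestamp(int(millis) // 1000, tz=timezone.utc)


job_id, stamp = parse_history_name(HISTORY_NAME)
tracker_started, sequence = parse_job_id(job_id)
tracker_millis = TRACKER_DIR.rstrip("_").rsplit("_", 1)[1]

print(tracker_started)                # 2012-04-12 18:36:00
print(sequence)                       # 3
print(millis_to_utc(stamp))           # 2012-04-13 09:05:58+00:00
print(millis_to_utc(tracker_millis))  # 2012-04-12 17:36:28+00:00
```

If those assumptions hold, 20120412 is the day the JobTracker was started, not the day the job ran, and 1334307958403 is 09:05:58 UTC on the 13th, which lines up with the 10:05:58 console timestamp on a machine set to UTC+1.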

Thanks,

Étienne


On 16 April 2012 19:45, Avery Ching ach...@apache.org wrote:

  Etienne, the task tracker logs are not what I meant, sorry for the
 confusion.  Every task produces its own output and error log.  That is
 likely where we can find the issue.  Likely a task failed, and the task
 logs should say why.

 Avery


 On 4/16/12 3:00 AM, Etienne Dumoulin wrote:

 Hi Avery,

 Thanks for your fast reply. I attach the forgotten file.

 Regards,

 Étienne

 On 13 April 2012 17:40, Avery Ching ach...@apache.org wrote:

 Hi Etienne,

 Thanks for your questions.  Giraph uses map tasks to run its master and
 workers.  Can you provide the task output logs?  It looks like your workers
 failed to report status for some reason and we need to find out why.  The
 datanode logs can't help us here.

 Avery


 On 4/13/12 3:35 AM, Etienne Dumoulin wrote:

 Hi Guys,

 I tried out Giraph yesterday and I have a problem running the shortest
 path example.

 I am working on a toy heterogeneous cluster of 3 datanodes and 1
 namenode/jobtracker, with hadoop 0.20.203.0.
 One of the datanodes is a small quad-core server with 16 GB of RAM; the
 others are small single-core PCs with 1 GB of RAM. All run the same OS:
 ubuntu-server 10.04.

 I first ran into an issue with the 0.1 version, the same one described here:
 https://issues.apache.org/jira/browse/GIRAPH-114.
 Before I found the patch I tried different configurations:
 It works in a standalone environment, with the namenode and the server,
 and with the namenode and the two small PCs.
 It does not work with the entire cluster, or with one small PC and the
 server as datanodes.

 Then today I downloaded the svn version. No luck: it shows the same
 behaviour as the 0.1 version (goes to 100%, then back to 0%), but not
 the same info logs.
 Below is the svn version console log; nantes is the name of the big
 datanode, rennes the namenode/jobtracker:

 hadoop@rennes:~/test$ hadoop jar \
     ~/project/giraph/trunk_2012_04_13/target/giraph-0.2-SNAPSHOT-jar-with-dependencies.jar \
     org.apache.giraph.examples.SimpleShortestPathsVertex \
     shortestPathsInputGraph shortestPathsOutputGraph 0 3
 12/04/13 10:05:58 INFO mapred.JobClient: Running job: job_201204121836_0003
 12/04/13 10:05:59 INFO mapred.JobClient:  map 0% reduce 0%
 12/04/13 10:06:18 INFO mapred.JobClient:  map 25% reduce 0%
 12/04/13 10:08:55 INFO mapred.JobClient:  map 100% reduce 0%
 12/04/13 10:21:28 INFO mapred.JobClient:  map 75% reduce 0%
 12/04/13 10:21:33 INFO mapred.JobClient: Task Id : attempt_201204121836_0003_m_02_0, Status : FAILED
 Task attempt_201204121836_0003_m_02_0 failed to report status for 600 seconds. Killing!
 12/04/13 10:23:57 INFO mapred.JobClient: Task Id : attempt_201204121836_0003_m_01_0, Status : FAILED
 java.lang.RuntimeException: sendMessage: msgMap did not exist for nantes:30002 for vertex 2
     at org.apache.giraph.comm.BasicRPCCommunications.sendMessageReq(BasicRPCCommunications.java:993)
     at org.apache.giraph.graph.BasicVertex.sendMsg(BasicVertex.java:168)
     at org.apache.giraph.examples.SimpleShortestPathsVertex.compute(SimpleShortestPathsVertex.java:104)
     at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:593)
     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:648)
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
     at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at
 

Re: java.lang.RuntimeException [...] msgMap did not exist [...]

2012-04-17 Thread Avery Ching

Etienne,

There should be one task log per task.  Do you have all the task logs?
It looks like this one failed because another one failed.
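
In case it helps to go through them in one pass, here is a rough sketch; it is not something shipped with Giraph or Hadoop. It assumes the task logs sit under a userlogs directory with one attempt_* subdirectory per task attempt, each holding stdout, stderr and syslog files. The exact layout differs between Hadoop versions, so adjust the path. find_task_errors is a made-up helper name.

```python
import os

# File names assumed to exist in each task attempt's log directory.
LOG_NAMES = ("stdout", "stderr", "syslog")


def find_task_errors(userlogs_dir, needle="Exception"):
    """Map each attempt directory name to its log lines containing `needle`.

    Only directories whose name starts with 'attempt_' are inspected, and
    only the files listed in LOG_NAMES are read.
    """
    hits = {}
    for root, _dirs, files in os.walk(userlogs_dir):
        attempt = os.path.basename(root)
        if not attempt.startswith("attempt_"):
            continue
        for name in sorted(files):
            if name not in LOG_NAMES:
                continue
            path = os.path.join(root, name)
            with open(path, errors="replace") as log:
                for line in log:
                    if needle in line:
                        hits.setdefault(attempt, []).append(
                            (name, line.rstrip("\n")))
    return hits


def report(userlogs_dir):
    """Print one section per attempt that logged a matching line."""
    for attempt, lines in sorted(find_task_errors(userlogs_dir).items()):
        print(attempt)
        for name, line in lines:
            print("  [%s] %s" % (name, line))
```

Calling report() on the userlogs directory of each node should show which attempt hit an exception first; the attempts that only failed as a consequence usually show up with later timestamps.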


Avery
