Re: java.lang.RuntimeException [...] msgMap did not exist [...]
Avery,

There are no other logs than that, at least to my knowledge. Does Hadoop remove logs automatically? I ask because I have the same type of logs as you mentioned, but only for a job run on the 16th of April (the one in question ran on the 13th).

I can no longer launch the process: I had to change the Hadoop version, and I now get a different error...

Thanks for your time,
Étienne
Re: java.lang.RuntimeException [...] msgMap did not exist [...]
Avery,

I attach the file; indeed, it looks more interesting than the others. There is a null pointer exception:

MapAttempt TASK_TYPE=MAP TASKID=task_201204121825_0001_m_02 TASK_ATTEMPT_ID=attempt_201204121825_0001_m_02_0 TASK_STATUS=FAILED FINISH_TIME=1334251707662 HOSTNAME=nantes ERROR=java\.lang\.NullPointerException
    at org\.apache\.giraph\.graph\.GraphMapper\.run(GraphMapper\.java:639)
    at org\.apache\.hadoop\.mapred\.MapTask\.runNewMapper(MapTask\.java:763)
    at org\.apache\.hadoop\.mapred\.MapTask\.run(MapTask\.java:369)
    at org\.apache\.hadoop\.mapred\.Child$4\.run(Child\.java:259)
    at java\.security\.AccessController\.doPrivileged(Native Method)
    at javax\.security\.auth\.Subject\.doAs(Subject\.java:396)
    at org\.apache\.hadoop\.security\.UserGroupInformation\.doAs(UserGroupInformation\.java:1059)
    at org\.apache\.hadoop\.mapred\.Child\.main(Child\.java:253)

I also found this file in logs/history/done/version-1/rennes.local.net_1334252188432_/2012/04/13/00/job_201204121836_0003_1334307958403_hadoop_org.apache.giraph.examples.SimpleShortestPathsVert. I ran the job on the 13th at 10am local time, but in these logs the date is 20120412. In addition, the logs directory contains no job conf dated the 13th. Does Hadoop not use the local time to name the files?

Thanks,
Étienne

On 16 April 2012 19:45, Avery Ching ach...@apache.org wrote:
> Etienne, the task tracker logs are not what I meant, sorry for the confusion. Every task produces its own output and error log. That is likely where we can find the issue. Likely a task failed, and the task logs should say why.
>
> Avery
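On the date question in this message: as far as I know, the timestamp embedded in a Hadoop job ID identifies when the JobTracker was started, not when the job was submitted, which would explain a 20120412 prefix on a job run on the 13th. The following is a minimal sketch in Python that decodes the identifiers quoted in this thread. The helper names are hypothetical, and reading the 13-digit numbers in the history path as epoch milliseconds is an assumption, not something the thread confirms.

```python
from datetime import datetime, timezone


def parse_job_id(job_id):
    """Split 'job_<yyyyMMddHHmm>_<seq>' into (timestamp, sequence number).

    Assumption: the timestamp is the JobTracker start time, expressed in
    the JobTracker's local time zone, not the job submission time.
    """
    prefix, stamp, seq = job_id.split("_")
    if prefix != "job":
        raise ValueError("not a job ID: " + job_id)
    return datetime.strptime(stamp, "%Y%m%d%H%M"), int(seq)


def epoch_millis_to_utc(millis):
    """Interpret a number as milliseconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(millis // 1000, tz=timezone.utc)


started, seq = parse_job_id("job_201204121836_0003")
print(started, seq)
# Number in the history directory name (assumed: JobTracker start).
print(epoch_millis_to_utc(1334252188432))
# Number in the history file name (assumed: job submission time).
print(epoch_millis_to_utc(1334307958403))
```

If the assumption holds, the first two values should fall on the evening of 12 April and the last on the morning of 13 April, close to the 10:05:58 submission shown in the console log once the cluster's time-zone offset is taken into account.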
Re: java.lang.RuntimeException [...] msgMap did not exist [...]
Etienne,

There should be one task log per task. Do you have all the task logs? It looks like this one failed because another one failed.

Avery
Re: java.lang.RuntimeException [...] msgMap did not exist [...]
Hi Avery,

Thanks for your fast reply. I attach the forgotten file.

Regards,
Étienne

hadoop-hadoop-tasktracker-nantes.log
Description: Binary data
Re: java.lang.RuntimeException [...] msgMap did not exist [...]
Hi Etienne,

Thanks for your questions. Giraph uses map tasks to run its master and workers. Can you provide the task output logs? It looks like your workers failed to report status for some reason and we need to find out why. The datanode logs can't help us here.

Avery

On 4/13/12 3:35 AM, Etienne Dumoulin wrote:
> Hi guys,
>
> I tried out Giraph yesterday and I have an issue running the shortest paths example.
>
> I am working on a toy heterogeneous cluster of 3 datanodes and 1 namenode/jobtracker, with Hadoop 0.20.203.0. One of the datanodes is a small quad-core server with 16 GB of RAM; the others are small PCs with 1 core and 1 GB of RAM. All run the same OS: ubuntu-server 10.04.
>
> I first ran into an issue with the 0.1 version, the same one described here: https://issues.apache.org/jira/browse/GIRAPH-114. Before I found the patch I tried different configurations. It works in a standalone environment, with the namenode and the server, and with the namenode and the two small PCs. It does not work with the entire cluster, or with one small PC and the server as datanodes.
>
> Then today I downloaded the svn version. No luck: it has the same behaviour as the 0.1 version (it goes to 100%, then back to 0%), but not the same info logs.
>
> Below is the console log of the svn version; nantes is the name of the big datanode, rennes the namenode/jobtracker:
>
> hadoop@rennes:~/test$ hadoop jar ~/project/giraph/trunk_2012_04_13/target/giraph-0.2-SNAPSHOT-jar-with-dependencies.jar org.apache.giraph.examples.SimpleShortestPathsVertex shortestPathsInputGraph shortestPathsOutputGraph 0 3
> 12/04/13 10:05:58 INFO mapred.JobClient: Running job: job_201204121836_0003
> 12/04/13 10:05:59 INFO mapred.JobClient: map 0% reduce 0%
> 12/04/13 10:06:18 INFO mapred.JobClient: map 25% reduce 0%
> 12/04/13 10:08:55 INFO mapred.JobClient: map 100% reduce 0%
> 12/04/13 10:21:28 INFO mapred.JobClient: map 75% reduce 0%
> 12/04/13 10:21:33 INFO mapred.JobClient: Task Id : attempt_201204121836_0003_m_02_0, Status : FAILED
> Task attempt_201204121836_0003_m_02_0 failed to report status for 600 seconds. Killing!
> 12/04/13 10:23:57 INFO mapred.JobClient: Task Id : attempt_201204121836_0003_m_01_0, Status : FAILED
> java.lang.RuntimeException: sendMessage: msgMap did not exist for nantes:30002 for vertex 2
>     at org.apache.giraph.comm.BasicRPCCommunications.sendMessageReq(BasicRPCCommunications.java:993)
>     at org.apache.giraph.graph.BasicVertex.sendMsg(BasicVertex.java:168)
>     at org.apache.giraph.examples.SimpleShortestPathsVertex.compute(SimpleShortestPathsVertex.java:104)
>     at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:593)
>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:648)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>     at org.apache.hadoop.mapred.Child.main(Child.java:253)
> Task attempt_201204121836_0003_m_01_0 failed to report status for 601 seconds. Killing!
> 12/04/13 10:23:58 INFO mapred.JobClient: map 50% reduce 0%
> 12/04/13 10:24:01 INFO mapred.JobClient: map 25% reduce 0%
> 12/04/13 10:24:06 INFO mapred.JobClient: Task Id : attempt_201204121836_0003_m_03_0, Status : FAILED
> Task attempt_201204121836_0003_m_03_0 failed to report status for 602 seconds. Killing!
>
> I attached the Hadoop logs for rennes (namenode and jobtracker) and for nantes (the big datanode).
>
> Has anyone already hit this error or found a fix?
>
> Thanks for your time,
> Étienne
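Since the question in this thread is which task failed first, the console output quoted in this message can be scanned for failed attempts in order. The sketch below is in Python and only illustrates the idea: the regular expression is written against the JobClient lines exactly as they appear in this message, the function name is hypothetical, and other Hadoop versions may format these lines differently.

```python
import re

# Matches lines such as:
# 12/04/13 10:21:33 INFO mapred.JobClient: Task Id : attempt_..._m_02_0, Status : FAILED
TASK_STATUS = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO mapred\.JobClient: "
    r"Task Id : (\S+), Status : (\w+)"
)


def failed_attempts(console_text):
    """Return (timestamp, attempt ID) pairs for FAILED attempts, in log order."""
    return [
        (stamp, attempt)
        for stamp, attempt, status in TASK_STATUS.findall(console_text)
        if status == "FAILED"
    ]


# Excerpt copied from the console log in this thread.
console = """
12/04/13 10:21:28 INFO mapred.JobClient: map 75% reduce 0%
12/04/13 10:21:33 INFO mapred.JobClient: Task Id : attempt_201204121836_0003_m_02_0, Status : FAILED
Task attempt_201204121836_0003_m_02_0 failed to report status for 600 seconds. Killing!
12/04/13 10:23:57 INFO mapred.JobClient: Task Id : attempt_201204121836_0003_m_01_0, Status : FAILED
java.lang.RuntimeException: sendMessage: msgMap did not exist for nantes:30002 for vertex 2
12/04/13 10:24:06 INFO mapred.JobClient: Task Id : attempt_201204121836_0003_m_03_0, Status : FAILED
"""

for stamp, attempt in failed_attempts(console):
    print(stamp, attempt)
```

The first entry is the attempt whose own task log is the most useful one to look at, because later failures may only be a consequence of it.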