Hi Devs, I am sending this mail to the dev list since I think Giraph developers might have experienced the issue I am facing.
I am working on extending Giraph to support a programming model somewhat similar to Giraph++. I have an initial proof-of-concept (POC) version running on my local machine in pseudo-distributed mode. But when I run it with large graphs on a cluster, the MapReduce job is suddenly killed because it receives a kill signal. I am still not sure what the root cause is. My hunch is that it has something to do with progress reporting from the mappers. I am attaching part of the log that might be helpful. It would be great if you could give me some insights based on your experience.

Giraph version: 1.1.0
Hadoop version: 2.2.0
Application type: MapReduce

Thanks,
Charith

--
Charith Dhanushka Wickramaarachchi
Tel +1 213 447 4253
Web http://apache.org/~charith <http://www-scf.usc.edu/~cwickram/> <http://charith.wickramaarachchi.org/>
Blog http://charith.wickramaarachchi.org/ <http://charithwiki.blogspot.com/>
Twitter @charithwiki <https://twitter.com/charithwiki>

2014-11-11 10:06:54,901 INFO [IPC Server handler 19 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000008_0
2014-11-11 10:06:54,901 INFO [IPC Server handler 19 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000008_0 is : 0.0
2014-11-11 10:06:54,963 INFO [IPC Server handler 20 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:06:54,963 INFO [IPC Server handler 20 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000005_0 is : 0.0
2014-11-11 10:06:54,974 INFO [IPC Server handler 21 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000001_0
2014-11-11 10:06:54,974 INFO [IPC Server handler 21 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000001_0 is : 0.0
2014-11-11 10:06:54,990 INFO [IPC Server handler 22 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000003_0
2014-11-11 10:06:54,991 INFO [IPC Server handler 22 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000003_0 is : 0.0
2014-11-11 10:06:54,997 INFO [IPC Server handler 23 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000004_0
2014-11-11 10:06:54,998 INFO [IPC Server handler 23 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000004_0 is : 0.0
2014-11-11 10:06:55,048 INFO [IPC Server handler 24 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000002_0
2014-11-11 10:06:55,048 INFO [IPC Server handler 24 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000002_0 is : 0.0
2014-11-11 10:06:55,299 INFO [IPC Server handler 25 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000000_0
2014-11-11 10:06:55,300 INFO [IPC Server handler 25 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000000_0 is : 0.0
2014-11-11 10:06:55,329 INFO [IPC Server handler 26 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000007_0
2014-11-11 10:06:55,330 INFO [IPC Server handler 26 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000007_0 is : 0.0
2014-11-11 10:06:57,440 INFO [IPC Server handler 27 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000006_0
2014-11-11 10:06:57,907 INFO [IPC Server handler 28 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000008_0
2014-11-11 10:06:57,968 INFO [IPC Server handler 29 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:06:57,979 INFO [IPC Server handler 0 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000001_0
2014-11-11 10:06:57,995 INFO [IPC Server handler 1 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000003_0
2014-11-11 10:06:58,003 INFO [IPC Server handler 2 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000004_0
2014-11-11 10:06:58,054 INFO [IPC Server handler 3 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000002_0
2014-11-11 10:06:58,305 INFO [IPC Server handler 4 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000000_0
2014-11-11 10:06:58,335 INFO [IPC Server handler 5 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000007_0
2014-11-11 10:07:00,443 INFO [IPC Server handler 6 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000006_0
2014-11-11 10:07:00,909 INFO [IPC Server handler 7 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000008_0
2014-11-11 10:07:00,981 INFO [IPC Server handler 8 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000001_0
2014-11-11 10:07:01,027 INFO [IPC Server handler 9 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:01,027 INFO [IPC Server handler 9 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000005_0 is : 1.0
2014-11-11 10:07:01,044 INFO [IPC Server handler 10 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000003_0
2014-11-11 10:07:01,044 INFO [IPC Server handler 10 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000003_0 is : 1.0
2014-11-11 10:07:01,055 INFO [IPC Server handler 11 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000004_0
2014-11-11 10:07:01,057 INFO [IPC Server handler 11 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000004_0 is : 1.0
2014-11-11 10:07:01,103 INFO [IPC Server handler 12 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000002_0
2014-11-11 10:07:01,104 INFO [IPC Server handler 12 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000002_0 is : 1.0
2014-11-11 10:07:01,358 INFO [IPC Server handler 13 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000000_0
2014-11-11 10:07:01,358 INFO [IPC Server handler 13 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000000_0 is : 1.0
2014-11-11 10:07:01,387 INFO [IPC Server handler 14 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000007_0
2014-11-11 10:07:01,388 INFO [IPC Server handler 14 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000007_0 is : 1.0
2014-11-11 10:07:03,531 INFO [IPC Server handler 15 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000006_0
2014-11-11 10:07:03,531 INFO [IPC Server handler 15 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000006_0 is : 1.0
2014-11-11 10:07:03,963 INFO [IPC Server handler 16 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000008_0
2014-11-11 10:07:03,963 INFO [IPC Server handler 16 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000008_0 is : 1.0
2014-11-11 10:07:04,082 INFO [IPC Server handler 17 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000001_0
2014-11-11 10:07:04,082 INFO [IPC Server handler 17 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000001_0 is : 1.0
2014-11-11 10:07:04,084 INFO [IPC Server handler 18 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:04,085 INFO [IPC Server handler 18 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000005_0 is : 1.0
2014-11-11 10:07:04,094 INFO [IPC Server handler 19 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000003_0
2014-11-11 10:07:04,095 INFO [IPC Server handler 19 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000003_0 is : 1.0
2014-11-11 10:07:04,109 INFO [IPC Server handler 20 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000004_0
2014-11-11 10:07:04,109 INFO [IPC Server handler 20 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000004_0 is : 1.0
2014-11-11 10:07:04,151 INFO [IPC Server handler 21 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000002_0
2014-11-11 10:07:04,152 INFO [IPC Server handler 21 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000002_0 is : 1.0
2014-11-11 10:07:04,409 INFO [IPC Server handler 22 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000000_0
2014-11-11 10:07:04,409 INFO [IPC Server handler 22 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000000_0 is : 1.0
2014-11-11 10:07:04,439 INFO [IPC Server handler 23 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000007_0
2014-11-11 10:07:04,440 INFO [IPC Server handler 23 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000007_0 is : 1.0
2014-11-11 10:07:06,536 INFO [IPC Server handler 24 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000006_0
2014-11-11 10:07:07,012 INFO [IPC Server handler 25 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000008_0
2014-11-11 10:07:07,012 INFO [IPC Server handler 25 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000008_0 is : 1.0
2014-11-11 10:07:07,087 INFO [IPC Server handler 26 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000001_0
2014-11-11 10:07:07,089 INFO [IPC Server handler 27 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:07,099 INFO [IPC Server handler 28 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000003_0
2014-11-11 10:07:07,113 INFO [IPC Server handler 29 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000004_0
2014-11-11 10:07:07,156 INFO [IPC Server handler 0 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000002_0
2014-11-11 10:07:07,414 INFO [IPC Server handler 1 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000000_0
2014-11-11 10:07:07,444 INFO [IPC Server handler 2 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000007_0
2014-11-11 10:07:09,539 INFO [IPC Server handler 3 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000006_0
2014-11-11 10:07:10,059 INFO [IPC Server handler 4 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000008_0
2014-11-11 10:07:10,059 INFO [IPC Server handler 4 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000008_0 is : 1.0
2014-11-11 10:07:10,089 INFO [IPC Server handler 5 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000001_0
2014-11-11 10:07:10,091 INFO [IPC Server handler 6 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:10,101 INFO [IPC Server handler 7 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000003_0
2014-11-11 10:07:10,195 INFO [IPC Server handler 8 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000002_0
2014-11-11 10:07:10,230 INFO [IPC Server handler 9 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000004_0
2014-11-11 10:07:10,416 INFO [IPC Server handler 10 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000000_0
2014-11-11 10:07:10,446 INFO [IPC Server handler 11 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000007_0
2014-11-11 10:07:13,141 INFO [IPC Server handler 12 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:13,141 INFO [IPC Server handler 12 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000005_0 is : 1.0
2014-11-11 10:07:16,190 INFO [IPC Server handler 13 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:16,191 INFO [IPC Server handler 13 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000005_0 is : 1.0
2014-11-11 10:07:19,196 INFO [IPC Server handler 14 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:22,199 INFO [IPC Server handler 15 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:25,265 INFO [IPC Server handler 16 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:25,265 INFO [IPC Server handler 16 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000005_0 is : 1.0
2014-11-11 10:07:28,270 INFO [IPC Server handler 17 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:31,273 INFO [IPC Server handler 18 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:34,317 INFO [IPC Server handler 19 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:34,317 INFO [IPC Server handler 19 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1415143619219_0007_m_000005_0 is : 1.0
2014-11-11 10:07:37,323 INFO [IPC Server handler 20 on 33264] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from attempt_1415143619219_0007_m_000005_0
2014-11-11 10:07:37,969 INFO [IPC Server handler 0 on 46781] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill job job_1415143619219_0007 received from hadoop (auth:SIMPLE) at 10.0.0.3
2014-11-11 10:07:37,970 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1415143619219_0007Job Transitioned from RUNNING to KILL_WAIT
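
For anyone reading the excerpt above, here is a minimal Python sketch (not part of the job itself) that summarizes such a log: it records the last time each task attempt contacted the application master and collects any "Kill job" requests along with who sent them. The function names `parse_entries` and `summarize` are hypothetical helpers made up for this illustration, and the regular expressions assume the log4j layout seen in this excerpt only. It may help to see whether attempts were still pinging when the kill request arrived.

```python
import re

# Header of one log entry: timestamp, level, [thread], logger class, colon.
# \s+ and [^\]]+ tolerate entries that were wrapped across physical lines.
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+) \[([^\]]+)\]\s+(\S+):\s+"
)
# Task attempt IDs as they appear in the excerpt (map or reduce attempts).
ATTEMPT = re.compile(r"attempt_\d+_\d+_[mr]_\d+_\d+")


def parse_entries(text):
    """Return (timestamp, logger, message) tuples from a possibly wrapped log."""
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs until the next entry header (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = " ".join(text[m.end():end].split())
        entries.append((m.group(1), m.group(4), message))
    return entries


def summarize(text):
    """Return (last contact time per attempt, list of kill requests)."""
    last_contact = {}
    kill_requests = []
    for timestamp, _logger, message in parse_entries(text):
        attempt = ATTEMPT.search(message)
        if attempt:
            last_contact[attempt.group(0)] = timestamp
        if message.startswith("Kill job"):
            kill_requests.append((timestamp, message))
    return last_contact, kill_requests


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py < syslog
    last_contact, kill_requests = summarize(sys.stdin.read())
    for attempt, timestamp in sorted(last_contact.items()):
        print(attempt, "last heard from at", timestamp)
    for timestamp, message in kill_requests:
        print("kill request at", timestamp, "->", message)
```

Because the entry header is matched anywhere in the text rather than at line starts, the sketch works whether the log has one entry per line or has been re-wrapped by a mail client.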
