Re: Java Heap space error
I'm curious whether you have been able to track down the cause of the error. We've seen similar problems when loading data, and I've discovered that if I presort my data before the load, things go a LOT smoother. When running queries against our data, we've sometimes seen the jobtracker just freeze, and I've seen heap out-of-memory errors when I cranked jobtracker logging up to debug. Still working on figuring this one out; should be an interesting ride :D

On 03/06/2012 11:10 AM, Mohit Anchlia wrote:
I am still trying to see how to narrow this down. Is it possible to set HeapDumpOnOutOfMemoryError on these individual tasks?
Re: Java Heap space error
I am still trying to see how to narrow this down. Is it possible to set HeapDumpOnOutOfMemoryError on these individual tasks?

On Mon, Mar 5, 2012 at 5:49 PM, Mohit Anchlia mohitanch...@gmail.com wrote: [...]
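[Editor's note: the heap-dump flag asked about above can indeed be passed to each task's child JVM through mapred.child.java.opts. A minimal sketch for 0.20-era Hadoop in mapred-site.xml follows; the dump path is illustrative, not from the thread:]

```xml
<!-- mapred-site.xml: make every failing task JVM write a heap dump.
     The path /tmp/task_dumps is an illustrative example. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/task_dumps</value>
</property>
```

A task that dies with OutOfMemoryError then leaves an .hprof file under that directory on the tasktracker node, which can be inspected with jhat or a heap analyzer such as Eclipse MAT.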
Re: Java Heap space error
All I see in the logs is:

2012-03-05 17:26:36,889 FATAL org.apache.hadoop.mapred.TaskTracker: Task: attempt_201203051722_0001_m_30_1 - Killed : Java heap space

It looks like the task tracker is killing the tasks, and I'm not sure why. I increased the heap from 512 to 1G and it still fails.

On Mon, Mar 5, 2012 at 5:03 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
I currently have mapred.child.java.opts set to 512MB and I am getting heap space errors. How should I go about debugging heap space issues?
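[Editor's note: in this era of Hadoop, "increased heap from 512 to 1G" for tasks would be done through mapred.child.java.opts, which applies per task JVM (so every concurrently running task gets its own heap of that size); the TaskTracker daemon's own heap, set via HADOOP_HEAPSIZE in hadoop-env.sh, is separate. A minimal sketch:]

```xml
<!-- mapred-site.xml: per-task child JVM heap; each concurrent
     map or reduce task on a node gets its own 1 GB heap. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```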
Re: Java Heap space error
Sorry for multiple emails. I did find:

2012-03-05 17:26:35,636 INFO org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call - Usage threshold init = 715849728(699072K) used = 575921696(562423K) committed = 715849728(699072K) max = 715849728(699072K)
2012-03-05 17:26:35,719 INFO org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate of 7816154 bytes from 1 objects. init = 715849728(699072K) used = 575921696(562423K) committed = 715849728(699072K) max = 715849728(699072K)
2012-03-05 17:26:36,881 INFO org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call - Collection threshold init = 715849728(699072K) used = 358720384(350312K) committed = 715849728(699072K) max = 715849728(699072K)
2012-03-05 17:26:36,885 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-03-05 17:26:36,888 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:39)
    at java.nio.CharBuffer.allocate(CharBuffer.java:312)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:760)
    at org.apache.hadoop.io.Text.decode(Text.java:350)
    at org.apache.hadoop.io.Text.decode(Text.java:327)
    at org.apache.hadoop.io.Text.toString(Text.java:254)
    at org.apache.pig.piggybank.storage.SequenceFileLoader.translateWritableToPigDataType(SequenceFileLoader.java:105)
    at org.apache.pig.piggybank.storage.SequenceFileLoader.getNext(SequenceFileLoader.java:139)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)

On Mon, Mar 5, 2012 at 5:46 PM, Mohit Anchlia mohitanch...@gmail.com wrote: [...]
Java heap space error on PFPGrowth
I am trying to run PFPGrowth, but I keep receiving this Java heap space error at the end of the first step / beginning of the second step. I am using the following parameters: -method mapreduce -regex [\\t] -s 5 -g 55000

Output:
..
10/11/11 08:12:56 INFO mapred.JobClient: map 100% reduce 85%
10/11/11 08:12:59 INFO mapred.JobClient: map 100% reduce 90%
10/11/11 08:13:02 INFO mapred.JobClient: map 100% reduce 94%
10/11/11 08:13:09 INFO mapred.JobClient: map 100% reduce 100%
10/11/11 08:13:11 INFO mapred.JobClient: Job complete: job_201011101701_0005
10/11/11 08:13:11 INFO mapred.JobClient: Counters: 17
10/11/11 08:13:11 INFO mapred.JobClient: Job Counters
10/11/11 08:13:11 INFO mapred.JobClient: Launched reduce tasks=1
10/11/11 08:13:11 INFO mapred.JobClient: Launched map tasks=8
10/11/11 08:13:11 INFO mapred.JobClient: Data-local map tasks=8
10/11/11 08:13:11 INFO mapred.JobClient: FileSystemCounters
10/11/11 08:13:11 INFO mapred.JobClient: FILE_BYTES_READ=146083205
10/11/11 08:13:11 INFO mapred.JobClient: HDFS_BYTES_READ=411751517
10/11/11 08:13:11 INFO mapred.JobClient: FILE_BYTES_WRITTEN=177276794
10/11/11 08:13:11 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=82352630
10/11/11 08:13:11 INFO mapred.JobClient: Map-Reduce Framework
10/11/11 08:13:11 INFO mapred.JobClient: Reduce input groups=3146378
10/11/11 08:13:11 INFO mapred.JobClient: Combine output records=30759042
10/11/11 08:13:11 INFO mapred.JobClient: Map input records=6049220
10/11/11 08:13:11 INFO mapred.JobClient: Reduce shuffle bytes=26239336
10/11/11 08:13:11 INFO mapred.JobClient: Reduce output records=3146378
10/11/11 08:13:11 INFO mapred.JobClient: Spilled Records=54248354
10/11/11 08:13:11 INFO mapred.JobClient: Map output bytes=743485927
10/11/11 08:13:11 INFO mapred.JobClient: Combine input records=63744687
10/11/11 08:13:11 INFO mapred.JobClient: Map output records=41469874
10/11/11 08:13:11 INFO mapred.JobClient: Reduce input records=8484229
10/11/11 08:13:26 INFO pfpgrowth.PFPGrowth: No of Features: 1087215
10/11/11 08:13:40 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/11/11 08:13:40 INFO input.FileInputFormat: Total input paths to process : 1
10/11/11 08:13:44 INFO mapred.JobClient: Running job: job_201011101701_0006
10/11/11 08:13:45 INFO mapred.JobClient: map 0% reduce 0%
10/11/11 08:14:16 INFO mapred.JobClient: Task Id : attempt_201011101701_0006_m_00_0, Status : FAILED
Error: Java heap space

Is there anything I can do to alleviate this problem? FYI: I'm running a 4-node cluster with 12GB of RAM in each machine. Thanks
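[Editor's note: 12 GB of RAM per node does not help a task whose JVM still runs with the default child heap (-Xmx200m in 0.20-era Hadoop). Since the log warns that the job does not go through GenericOptionsParser, per-job -D overrides may not be picked up, so one hedged option is raising the task heap cluster-wide; the 2 GB value below is only an illustrative starting point, not a recommendation from the thread:]

```xml
<!-- mapred-site.xml: raise the per-task child heap for the
     second (parallel FP-growth) step; 2g is illustrative. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>
```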
TaskTracker: Java heap space error
Dear All, I am running a Hadoop job processing data. The output of the map is really large, and it spills 15 times, so I tried setting io.sort.mb = 256 instead of 100 and left everything else at the default. I am using version 0.20.2. When I run the job, I get the following errors:

2010-03-11 11:09:37,581 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2010-03-11 11:09:38,073 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2010-03-11 11:09:38,086 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 256
2010-03-11 11:09:38,326 FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:781)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

I can't figure out why; could anyone please give me a hint? Any help will be appreciated! Thanks a lot!

Sincerely, Boyu
Re: TaskTracker: Java heap space error
Moving to mapreduce-user@, bcc: common-user

Have you tried bumping up the heap for the map task? Since you are setting io.sort.mb to 256M, please set the heap size to at least 512M, if not more:

mapred.child.java.opts = -Xmx512m (or -Xmx1024m)

Arun

On Mar 11, 2010, at 8:24 AM, Boyu Zhang wrote: [...]
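[Editor's note: Arun's numbers follow from where the OOM occurs. MapOutputBuffer allocates the io.sort.mb sort buffer inside the child JVM's heap, so the heap must comfortably exceed the buffer size; the 0.20 default heap of -Xmx200m is smaller than the requested 256 MB buffer, which is why the allocation fails in MapOutputBuffer's constructor. A sketch of the two settings together:]

```xml
<!-- mapred-site.xml: the io.sort.mb buffer is carved out of the
     task heap, so -Xmx must exceed it with room to spare. -->
<property>
  <name>io.sort.mb</name>
  <value>256</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```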
Re: TaskTracker: Java heap space error
Hi Arun, I did what you said, and it seems to work. Thanks a lot! I guess I don't completely understand how the tuning parameters affect each other. Thanks!

Boyu

On Thu, Mar 11, 2010 at 12:27 PM, Arun C Murthy a...@yahoo-inc.com wrote: [...]