Joey, that did the trick!
Actually, I am writing to the log with System.out.println() statements, and I write about 12,000 lines, would that be a problem? I don't really need this output, so if you think it's inadvisable, I will remove that. Also, I hope that if I have not 6,000 maps but 12,000 or even 30,000, it will still work. Well, I will see pretty soon, I guess, with more data. Again, thank you. Sincerely, Mark On Wed, May 23, 2012 at 9:43 PM, Joey Krabacher <jkrabac...@gmail.com>wrote: > Mark, > > Have you tried tweaking the mapred.child.java.opts property in your > mapred-site.xml? > > <property> > <name>mapred.child.java.opts</name> > <value>-Xmx2048m</value> > </property> > > This might help. > It looks like the fatal error came right after the log truncater fired off. > Are you outputting anything to the logs manually, or have you looked at the > user logs to see if there is anything taking up lots of room? > > / * Joey */ > > > On Wed, May 23, 2012 at 9:35 PM, Mark Kerzner <mark.kerz...@shmsoft.com > >wrote: > > > Joey, > > > > my errors closely resembles this > > one< > > > http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201006.mbox/%3caanlktikr3df4ce-tgiphv9_-evfoed_5-t684nf4y...@mail.gmail.com%3E > > >in > > the archives. I can now be much more specific with the errors message, > > and it is quoted below. I tried -Xmx3096. But I got the same error. > > > > Thank you, > > Mark > > > > > > syslog logs > > 2012-05-23 20:04:52,349 WARN org.apache.hadoop.util.NativeCodeLoader: > > Unable to load native-hadoop library for your platform... using > > builtin-java classes where applicable > > 2012-05-23 20:04:52,519 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: > > Initializing JVM Metrics with processName=MAP, sessionId= > > 2012-05-23 20:04:52,695 INFO org.apache.hadoop.util.ProcessTree: setsid > > exited with exit code 0 > > 2012-05-23 20:04:52,699 INFO org.apache.hadoop.mapred.Task: Using > > ResourceCalculatorPlugin : > > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@d56b37 > > 2012-05-23 20:04:52,813 INFO org.apache.hadoop.mapred.MapTask: > io.sort.mb = > > 100 > > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: data > buffer > > = 79691776/99614720 > > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: record > > buffer = 262144/327680 > > 2012-05-23 20:04:53,010 WARN > > org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library > not > > loaded > > 2012-05-23 20:12:29,120 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 0; bufend = 79542629; bufvoid = 99614720 > > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: kvstart = > 0; > > kvend = 228; length = 327680 > > 2012-05-23 20:12:31,248 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 0 > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 79542629; bufend = 53863940; bufvoid = 99614720 > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 228; kvend = 431; length = 327680 > > 2012-05-23 20:13:03,294 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 1 > > 2012-05-23 20:13:48,121 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 53863940; bufend = 31696780; bufvoid = 99614720 > > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 431; kvend = 861; length = 327680 > > 2012-05-23 20:13:49,818 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 2 > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 31696780; bufend = 10267329; bufvoid = 99614720 > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 861; kvend = 1462; length = 327680 > > 2012-05-23 20:15:27,068 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 3 > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 10267329; bufend = 85241086; bufvoid = 99614720 > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 1462; kvend = 1642; length = 327680 > > 2012-05-23 20:15:54,760 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 4 > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 85241086; bufend = 51305930; bufvoid = 99614720 > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 1642; kvend = 1946; length = 327680 > > 2012-05-23 20:16:27,566 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 5 > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 51305930; bufend = 31353466; bufvoid = 99614720 > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 1946; kvend = 2263; length = 327680 > > 2012-05-23 20:16:58,076 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 6 > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 31353466; bufend = 10945750; bufvoid = 99614720 > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 2263; kvend = 2755; length = 327680 > > 2012-05-23 20:17:53,939 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 7 > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 10945750; bufend = 81838103; bufvoid = 99614720 > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 2755; kvend = 2967; length = 327680 > > 2012-05-23 20:18:21,145 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 8 > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 81838103; bufend = 61751422; bufvoid = 99614720 > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 2967; kvend = 3202; length = 327680 > > 2012-05-23 20:18:53,485 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 9 > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 61751422; bufend = 38440188; bufvoid = 99614720 > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 3202; kvend = 3628; length = 327680 > > 2012-05-23 20:19:44,346 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 10 > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 38440188; bufend = 18376423; bufvoid = 99614720 > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 3628; kvend = 4252; length = 327680 > > 2012-05-23 20:20:48,767 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 11 > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 18376423; bufend = 97890160; bufvoid = 99614720 > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 4252; kvend = 4488; length = 327680 > > 2012-05-23 20:21:20,176 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 12 > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 97890160; bufend = 77234938; bufvoid = 99614720 > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 4488; kvend = 4771; length = 327680 > > 2012-05-23 20:21:48,383 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 13 > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: Spilling > map > > output: buffer full= true > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: bufstart = > > 77234938; bufend = 52962105; bufvoid = 99614720 > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: kvstart = > > 4771; kvend = 5203; length = 327680 > > 2012-05-23 20:22:21,641 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 14 > > 2012-05-23 20:22:46,559 INFO org.apache.hadoop.mapred.MapTask: Starting > > flush of map output > > 2012-05-23 20:22:46,615 INFO org.apache.hadoop.mapred.MapTask: Finished > > spill 15 > > 2012-05-23 20:22:47,606 INFO org.apache.hadoop.mapred.Merger: Merging 16 > > sorted segments > > 2012-05-23 20:22:47,696 INFO org.apache.hadoop.mapred.Merger: Merging 7 > > intermediate segments out of a total of 16 > > 2012-05-23 20:22:58,083 INFO org.apache.hadoop.mapred.TaskLogsTruncater: > > Initializing logs' truncater with mapRetainSize=-1 and > reduceRetainSize=-1 > > 2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error > running > > child : java.lang.OutOfMemoryError: Java heap space > > at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355) > > at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:417) > > at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220) > > at > > > > > org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330) > > at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350) > > at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156) > > at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:499) > > at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381) > > at org.apache.hadoop.mapred.Merger.merge(Merger.java:77) > > at > > > > > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548) > > at > > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180) > > at > > > org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582) > > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649) > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323) > > at org.apache.hadoop.mapred.Child$4.run(Child.java:270) > > at java.security.AccessController.doPrivileged(Native Method) > > at javax.security.auth.Subject.doAs(Subject.java:396) > > at > > > > > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177) > > at org.apache.hadoop.mapred.Child.main(Child.java:264) > > > > On Wed, May 23, 2012 at 2:47 PM, Joey Krabacher <jkrabac...@gmail.com > > >wrote: > > > > > My experience with this sort of problem tells me one of two things and > > > maybe both: > > > > > > 1. there are some optimizations to the code that can be made (variable > > > re-creation inside of loops, etc.) > > > 2. something has gone horribly wrong with the logic in the mapper. > > > > > > To troubleshoot I would output some log entries at specific points in > the > > > mapper (be careful not to log every execution of the mapper because > this > > > could cause major issues with the disk filling up and that sort of > > thing.) > > > > > > Hope that helps. > > > > > > /* Joey */ > > > > > > On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner < > mark.kerz...@shmsoft.com > > > >wrote: > > > > > > > Hi, all, > > > > > > > > I got the exception below in the mapper. I already have my global > > Hadoop > > > > heap at 5 GB, but is there a specific other setting? Or maybe I > should > > > > troubleshoot for memory? > > > > > > > > But the same application works in the IDE. > > > > > > > > Thank you! > > > > > > > > Mark > > > > > > > > *stderr logs* > > > > > > > > Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError: > > > > Java heap space > > > > at > > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76) > > > > at > > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59) > > > > at > > > > org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292) > > > > at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365) > > > > at org.apache.hadoop.mapred.Child$3.run(Child.java:157) > > > > Exception in thread "communication thread" > java.lang.OutOfMemoryError: > > > > Java heap space > > > > > > > > Exception: java.lang.OutOfMemoryError thrown from the > > > > UncaughtExceptionHandler in thread "communication thread" > > > > > > > > > >