Joey,

that did the trick!

Actually, I am writing to the log with System.out.println() statements, and
I write about 12,000 lines, would that be a problem? I don't really need
this output, so if you think it's inadvisable, I will remove that.

Also, I hope that if I have not 6,000 maps but 12,000 or even 30,000, it
will still work.

Well, I will see pretty soon, I guess, with more data.

Again, thank you.

Sincerely,
Mark

On Wed, May 23, 2012 at 9:43 PM, Joey Krabacher <jkrabac...@gmail.com>wrote:

> Mark,
>
> Have you tried tweaking the mapred.child.java.opts property in your
> mapred-site.xml?
>
> <property>
>    <name>mapred.child.java.opts</name>
>    <value>-Xmx2048m</value>
>  </property>
>
> This might help.
> It looks like the fatal error came right after the log truncater fired off.
> Are you outputting anything to the logs manually, or have you looked at the
> user logs to see if there is anything taking up lots of room?
>
> / * Joey */
>
>
> On Wed, May 23, 2012 at 9:35 PM, Mark Kerzner <mark.kerz...@shmsoft.com
> >wrote:
>
> > Joey,
> >
> > my errors closely resembles this
> > one<
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201006.mbox/%3caanlktikr3df4ce-tgiphv9_-evfoed_5-t684nf4y...@mail.gmail.com%3E
> > >in
> > the archives. I can now be much more specific with the errors message,
> > and it is quoted below. I tried -Xmx3096. But I got the same error.
> >
> > Thank you,
> > Mark
> >
> >
> > syslog logs
> > 2012-05-23 20:04:52,349 WARN org.apache.hadoop.util.NativeCodeLoader:
> > Unable to load native-hadoop library for your platform... using
> > builtin-java classes where applicable
> > 2012-05-23 20:04:52,519 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > Initializing JVM Metrics with processName=MAP, sessionId=
> > 2012-05-23 20:04:52,695 INFO org.apache.hadoop.util.ProcessTree: setsid
> > exited with exit code 0
> > 2012-05-23 20:04:52,699 INFO org.apache.hadoop.mapred.Task:  Using
> > ResourceCalculatorPlugin :
> > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@d56b37
> > 2012-05-23 20:04:52,813 INFO org.apache.hadoop.mapred.MapTask:
> io.sort.mb =
> > 100
> > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: data
> buffer
> > = 79691776/99614720
> > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: record
> > buffer = 262144/327680
> > 2012-05-23 20:04:53,010 WARN
> > org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
> not
> > loaded
> > 2012-05-23 20:12:29,120 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 0; bufend = 79542629; bufvoid = 99614720
> > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 0;
> > kvend = 228; length = 327680
> > 2012-05-23 20:12:31,248 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 0
> > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 79542629; bufend = 53863940; bufvoid = 99614720
> > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 228; kvend = 431; length = 327680
> > 2012-05-23 20:13:03,294 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 1
> > 2012-05-23 20:13:48,121 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 53863940; bufend = 31696780; bufvoid = 99614720
> > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 431; kvend = 861; length = 327680
> > 2012-05-23 20:13:49,818 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 2
> > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 31696780; bufend = 10267329; bufvoid = 99614720
> > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 861; kvend = 1462; length = 327680
> > 2012-05-23 20:15:27,068 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 3
> > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 10267329; bufend = 85241086; bufvoid = 99614720
> > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 1462; kvend = 1642; length = 327680
> > 2012-05-23 20:15:54,760 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 4
> > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 85241086; bufend = 51305930; bufvoid = 99614720
> > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 1642; kvend = 1946; length = 327680
> > 2012-05-23 20:16:27,566 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 5
> > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 51305930; bufend = 31353466; bufvoid = 99614720
> > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 1946; kvend = 2263; length = 327680
> > 2012-05-23 20:16:58,076 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 6
> > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 31353466; bufend = 10945750; bufvoid = 99614720
> > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 2263; kvend = 2755; length = 327680
> > 2012-05-23 20:17:53,939 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 7
> > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 10945750; bufend = 81838103; bufvoid = 99614720
> > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 2755; kvend = 2967; length = 327680
> > 2012-05-23 20:18:21,145 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 8
> > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 81838103; bufend = 61751422; bufvoid = 99614720
> > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 2967; kvend = 3202; length = 327680
> > 2012-05-23 20:18:53,485 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 9
> > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 61751422; bufend = 38440188; bufvoid = 99614720
> > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 3202; kvend = 3628; length = 327680
> > 2012-05-23 20:19:44,346 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 10
> > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 38440188; bufend = 18376423; bufvoid = 99614720
> > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 3628; kvend = 4252; length = 327680
> > 2012-05-23 20:20:48,767 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 11
> > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 18376423; bufend = 97890160; bufvoid = 99614720
> > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 4252; kvend = 4488; length = 327680
> > 2012-05-23 20:21:20,176 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 12
> > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 97890160; bufend = 77234938; bufvoid = 99614720
> > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 4488; kvend = 4771; length = 327680
> > 2012-05-23 20:21:48,383 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 13
> > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 77234938; bufend = 52962105; bufvoid = 99614720
> > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 4771; kvend = 5203; length = 327680
> > 2012-05-23 20:22:21,641 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 14
> > 2012-05-23 20:22:46,559 INFO org.apache.hadoop.mapred.MapTask: Starting
> > flush of map output
> > 2012-05-23 20:22:46,615 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 15
> > 2012-05-23 20:22:47,606 INFO org.apache.hadoop.mapred.Merger: Merging 16
> > sorted segments
> > 2012-05-23 20:22:47,696 INFO org.apache.hadoop.mapred.Merger: Merging 7
> > intermediate segments out of a total of 16
> > 2012-05-23 20:22:58,083 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
> > Initializing logs' truncater with mapRetainSize=-1 and
> reduceRetainSize=-1
> > 2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error
> running
> > child : java.lang.OutOfMemoryError: Java heap space
> >    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)
> >    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:417)
> >    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
> >    at
> >
> >
> org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
> >    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
> >    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
> >    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:499)
> >    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
> >    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
> >    at
> >
> >
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
> >    at
> > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
> >    at
> >
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
> >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
> >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >    at java.security.AccessController.doPrivileged(Native Method)
> >    at javax.security.auth.Subject.doAs(Subject.java:396)
> >    at
> >
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
> >    at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >
> > On Wed, May 23, 2012 at 2:47 PM, Joey Krabacher <jkrabac...@gmail.com
> > >wrote:
> >
> > > My experience with this sort of problem tells me one of two things and
> > > maybe both:
> > >
> > > 1. there are some optimizations to the code that can be made (variable
> > > re-creation inside of loops, etc.)
> > > 2. something has gone horribly wrong with the logic in the mapper.
> > >
> > > To troubleshoot I would output some log entries at specific points in
> the
> > > mapper (be careful not to log every execution of the mapper because
> this
> > > could cause major issues with the disk filling up and that sort of
> > thing.)
> > >
> > > Hope that helps.
> > >
> > > /* Joey */
> > >
> > > On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner <
> mark.kerz...@shmsoft.com
> > > >wrote:
> > >
> > > > Hi, all,
> > > >
> > > > I got the exception below in the mapper. I already have my global
> > Hadoop
> > > > heap at 5 GB, but is there a specific other setting? Or maybe I
> should
> > > > troubleshoot for memory?
> > > >
> > > > But the same application works in the IDE.
> > > >
> > > > Thank you!
> > > >
> > > > Mark
> > > >
> > > > *stderr logs*
> > > >
> > > > Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> > > > Java heap space
> > > >        at
> > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> > > >        at
> > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> > > >        at
> > > > org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> > > >        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> > > >        at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> > > > Exception in thread "communication thread"
> java.lang.OutOfMemoryError:
> > > > Java heap space
> > > >
> > > > Exception: java.lang.OutOfMemoryError thrown from the
> > > > UncaughtExceptionHandler in thread "communication thread"
> > > >
> > >
> >
>

Reply via email to