Hello,
I reset the cluster to use:
io.sort.mb to 1400MB (was previously the default 100MB)
fs.inmemory.size.mb to 1400MB (was previously the default)
io.sort.factor to 1000 (was previously the default 10)
Heap size: 2048MB
Each node has 32GB of RAM and a 256MB block size.
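For reference, this is roughly how those settings look as Hadoop XML
properties (a sketch rather than a paste of my actual files; I'm reading
"heap size" as the per-task child heap, i.e. mapred.child.java.opts):

  <property>
    <name>io.sort.mb</name>
    <value>1400</value>
  </property>
  <property>
    <name>io.sort.factor</name>
    <value>1000</value>
  </property>
  <property>
    <name>fs.inmemory.size.mb</name>
    <value>1400</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>268435456</value> <!-- 256MB -->
  </property>
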
Testing a word count on a single 230MB LZO file, which decompresses to a
little over 1.1GB, causes a Spill Failed error. Only one mapper runs on
the 230MB LZO file since my block size is set to 256MB.
The error I get is:
java.io.IOException: Spill Failed
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
...
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
...
In the log files, right before the Spill Failed message, I see the following:
INFO org.apache.hadoop.mapred.MapTask: Finished Spill 26
INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
INFO org.apache.hadoop.mapred.MapTask: bufstart = 1064643112; bufend = 1104103650; bufvoid = 1394606080
INFO org.apache.hadoop.mapred.MapTask: kvstart = 2752491; kvend = 1834986; length = 4587520
WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: Spill failed
...
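(For what it's worth, those buffer numbers line up with io.sort.mb = 1400
and the stock io.sort.record.percent of 0.05, assuming I'm reading the
0.20-era MapOutputBuffer correctly: 1400 * 2^20 = 1,468,006,400 bytes, of
which 5% = 73,400,320 bytes are reserved for record metadata, leaving
bufvoid = 1,394,606,080 bytes for the serialized records; 73,400,320 / 16
bytes of metadata per record gives the reported length of 4,587,520.)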
1) Does anyone know what causes the spill to fail? My understanding was
that once the buffer is 80% full (the default threshold) it simply starts
spilling to disk; it shouldn't fail just because it has to spill a lot
(there is plenty of disk space, and swap space is never touched). I only
noticed the spill failure after changing io.sort.mb, io.sort.factor and
fs.inmemory.size.mb. When I set those values back to their defaults the
job completes without a spill failure (although it does spill a lot: 309
spills).
2) A related question: should I be aiming for the number of "Spilled
records" to equal the number of "Map output records" for optimal performance?
Thank you for your assistance!
~Ed
On Mon, Sep 27, 2010 at 12:56 PM, Srigurunath Chakravarthi <
[email protected]> wrote:
> Ed,
> Your math is right - 1400 MB would be a good setting for io.sort.mb.
>
> fs.inmemorysize.mb - what Hadoop version are you using? I suspect that
> this may be deprecated. If it is supported, you can set it to 1400 MB too. I
> know it is recognized by 0.21 (and older versions).
>
> Increasing io.sort.factor to 100 or even 1000 is advisable.
>
> On the reduce side, to see if you are already hitting this limit, you can
> check the reduce task logs. The spill messages will tell you whether spills
> are occurring every 10 files rather than when the collected map output
> size reaches fs.inmemorysize.mb.
> On the map side I don't know how io.sort.factor gets used. It probably
> imposes a limit on the number of spills (and forces additional merge steps
> if spills exceed that limit).
>
> In any case setting it to a higher value such as 100 blindly is ok since
> it won't degrade performance.
>
> Hope this helps,
> Sriguru
>
>
> ----- Original Message -----
> From: pig <[email protected]>
> To: [email protected] <[email protected]>
> Sent: Mon Sep 27 07:15:29 2010
> Subject: Re: Proper blocksize and io.sort.mb setting when using compressed
> LZO files
>
> Hi Sriguru,
>
> Thank you for the tips. Just to clarify a few things.
>
> Our machines have 32 GB of RAM.
>
> I'm planning on setting each machine to run 12 mappers and 2 reducers with
> the heap size set to 2048MB, so total heap usage will be 28GB.
>
> If this is the case should io.sort.mb be set to 70% of 2048MB (so ~1400
> MB)?
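> (That is, (12 + 2) tasks x 2048MB = 28,672MB of heap on a 32GB node, and
> 0.7 x 2048MB is roughly 1434MB, which is where the ~1400MB figure comes
> from.)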
>
> Also, I did not see a fs.inmemorysize.mb setting in any of the hadoop
> configuration files. Is that the correct setting I should be looking for?
> Should this also be set to 70% of the heap size, or does it need to share
> that space with the io.sort.mb setting?
>
> I assume if I'm bumping up io.sort.mb that much I also need to increase
> io.sort.factor from the default of 10. Is there a recommended relation
> between these two?
>
> Thank you for your help!
>
> ~Ed
>
> On Sun, Sep 26, 2010 at 3:05 AM, Srigurunath Chakravarthi <
> [email protected]> wrote:
>
> > Ed,
> > Tuning io.sort.mb will be certainly worthwhile if you have enough RAM to
> > allow for a higher Java heap per map task without risking swapping.
> >
> > Similarly, you can decrease spills on the reduce side using
> > fs.inmemorysize.mb.
> >
> > You can use the following thumb rules for tuning those two:
> >
> > - Set these to ~70% of Java heap size. Pick heap sizes to utilize ~80% of
> > RAM across all processes (maps, reducers, TT, DN, other)
> > - Set it small enough to avoid swap activity, but
> > - Set it large enough to minimize disk spills.
> > - Ensure that io.sort.factor is set large enough to allow full use of
> > buffer space.
> > - Balance space for output records (default 95%) & record meta-data (5%).
> > Use io.sort.spill.percent and io.sort.record.percent (see the sketch below).
> >
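> > For example, as a sketch (0.80 and 0.05 are the stock defaults; adjust
> > from there):
> >
> >   <property>
> >     <name>io.sort.spill.percent</name>
> >     <value>0.80</value> <!-- start spilling when the buffer is 80% full -->
> >   </property>
> >   <property>
> >     <name>io.sort.record.percent</name>
> >     <value>0.05</value> <!-- 5% of io.sort.mb reserved for record metadata -->
> >   </property>
> >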
> > Your mileage may vary. We've seen job exec time improvements worth 1-3%
> > via spill-avoidance for miscellaneous applications.
> >
> > Your other option of running a map per 32MB or 64MB of input should give
> > you better performance if your map task execution time is significant
> > (i.e., much larger than a few seconds) compared to the overhead of
> > launching map tasks and reading input.
> >
> > Regards,
> > Sriguru
> >
> > >-----Original Message-----
> > >From: pig [mailto:[email protected]]
> > >Sent: Saturday, September 25, 2010 2:36 AM
> > >To: [email protected]
> > >Subject: Proper blocksize and io.sort.mb setting when using compressed
> > >LZO files
> > >
> > >Hello,
> > >
> > >We just recently switched to using LZO compressed file input for our
> > >hadoop cluster using Kevin Weil's lzo library. The files are pretty
> > >uniform in size at around 200MB compressed. Our block size is 256MB.
> > >Decompressed, the average LZO input file is around 1.0GB. I noticed lots
> > >of our jobs are now spilling lots of data to disk. We have almost 3x
> > >more spilled records than map input records, for example. I'm guessing
> > >this is because each mapper is getting a 200MB LZO file which
> > >decompresses into 1GB of data per mapper.
> > >
> > >Would you recommend solving this by reducing the block size to 64MB, or
> > >even 32MB, and then using the LZO indexer so that a single 200MB LZO
> > >file is actually split among 3 or 4 mappers? Would it be better to play
> > >with the io.sort.mb value? Or would it be best to play with both? Right
> > >now the io.sort.mb value is the default 100MB. Have other LZO users had
> > >to adjust their block size to compensate for the "expansion" of the data
> > >after decompression?
> > >
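> > >(By "the LZO indexer" I mean the indexer class that ships with the
> > >hadoop-lzo library; as far as I know it is run roughly like this, where
> > >the jar location and input path are just placeholders:
> > >
> > >  hadoop jar /path/to/hadoop-lzo.jar \
> > >      com.hadoop.compression.lzo.LzoIndexer /path/to/files.lzo
> > >
> > >It writes a .index file next to each .lzo file so that the file becomes
> > >splittable.)
> > >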
> > >Thank you for any help!
> > >
> > >~Ed
> >
>