For Dmitriy and anyone else who has seen this error, I just committed a fix
to my GitHub repository:

http://github.com/toddlipcon/hadoop-lzo/commit/f3bc3f8d003bb8e24f254b25bca2053f731cdd58

The problem turned out to be an assumption that InputStream.read() would
return all the bytes that were asked for. That is almost always true on
local filesystems, but on HDFS it is not when a read crosses a block
boundary. So, every couple of TB of LZO-compressed data, one might see this
error.
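
For reference, the fix follows the standard read-fully loop. The sketch
below is illustrative rather than the literal patch (see the commit above
for that); the class and helper names here are mine:

    import java.io.EOFException;
    import java.io.IOException;
    import java.io.InputStream;

    class ReadFullySketch {

        // Buggy pattern: assumes a single read() fills the buffer. That
        // almost always holds on a local filesystem, but an HDFS read that
        // crosses a block boundary can legally return fewer bytes.
        static void readBuggy(InputStream in, byte[] buf) throws IOException {
            in.read(buf, 0, buf.length); // may return < buf.length bytes
        }

        // Fixed pattern: loop until 'len' bytes have been read, or fail
        // loudly if the stream ends early.
        static void readFully(InputStream in, byte[] buf, int off, int len)
                throws IOException {
            while (len > 0) {
                int n = in.read(buf, off, len);
                if (n < 0) {
                    throw new EOFException("premature EOF in compressed stream");
                }
                off += n;
                len -= n;
            }
        }
    }

Anywhere the decompressor needs exactly N bytes of compressed input, the
looping version has to be used, since read() only guarantees at least one
byte per call (or -1 at EOF).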

Big thanks to Alex Roetter, who was able to provide a file that exhibited
the bug!

Thanks
-Todd


On Tue, Apr 6, 2010 at 10:35 AM, Todd Lipcon <[email protected]> wrote:

> Hi Alex,
> Unfortunately I wasn't able to reproduce it, and the data Dmitriy is
> working with is sensitive.
> Do you have some data you could upload (or send me off list) that
> exhibits the issue?
> -Todd
>
> On Tue, Apr 6, 2010 at 9:50 AM, Alex Roetter <[email protected]> wrote:
> >
> > Todd Lipcon <t...@...> writes:
> >
> > >
> > > Hey Dmitriy,
> > >
> > > This is very interesting (and worrisome in a way!) I'll try to take a
> > > look this afternoon.
> > >
> > > -Todd
> > >
> >
> > Hi Todd,
> >
> > I wanted to see if you made any progress on this front. I'm seeing a very
> > similar error trying to run an MR job (Hadoop 0.20.1) over a bunch of
> > LZOP-compressed/indexed files (using Kevin Weil's package), and I have one
> > map task that always fails in what looks like the same place as described
> > in the previous post. I haven't yet done the experimentation mentioned
> > above (isolating the input file corresponding to the failed map task,
> > decompressing/recompressing it, testing it operating directly on local
> > disk instead of HDFS, etc.).
> >
> > However, since I am crashing in exactly the same place, it seems likely
> > this is related, and I thought I'd check on your progress in the meantime.
> >
> > FYI, my stack trace is below:
> >
> > 2010-04-05 18:15:16,895 FATAL org.apache.hadoop.mapred.TaskTracker: Error
> > running child : java.lang.InternalError: lzo1x_decompress_safe returned:
> >        at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
> >        at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:303)
> >        at com.hadoop.compression.lzo.LzopDecompressor.decompress(LzopDecompressor.java:104)
> >        at com.hadoop.compression.lzo.LzopInputStream.decompress(LzopInputStream.java:223)
> >        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
> >        at java.io.InputStream.read(InputStream.java:85)
> >        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
> >        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:187)
> >        at com.hadoop.mapreduce.LzoLineRecordReader.nextKeyValue(LzoLineRecordReader.java:126)
> >        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
> >        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
> >        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
> >        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >
> >
> > Any update much appreciated,
> > Alex
> >
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera
