[ 
https://issues.apache.org/jira/browse/HADOOP-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275771#comment-17275771
 ] 

Claus Stadler commented on HADOOP-17453:
----------------------------------------

I am seeing IndexOutOfBoundsExceptions with my custom RecordReader on Hadoop 
Common 2.8.5, and I am also inclined to think this line is a major bug.
If I understand this codec stuff correctly (not claiming I do), reading decoded 
data (e.g. text) in READ_MODE.BYBLOCK mode from an encoded stream (e.g. bzip2) 
should make a read on the decoded stream return when the backing stream hits 
its set boundary (typically the split end); this mechanism is referred to as 
"advertise".

But before my repeated reads actually hit the split boundary, I get an 
IndexOutOfBoundsException, apparently because my buffer's length is at some 
point less than 2 * offset + 1 (the call passes off + 1 as the length, so the 
bounds check becomes off + (off + 1) > length) - huh?
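
The arithmetic can be reproduced outside Hadoop with a minimal sketch. It 
assumes java.io.ByteArrayInputStream as a stand-in for the bzip2 input stream, 
since both reject a read where off + len exceeds the buffer length. The class 
and method names are made up for illustration, and readOneByte is only a guess 
at the intended call, not a confirmed fix.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class AdvertiseReadDemo {

    // Mirrors the call quoted in the issue: the third argument is a length,
    // but it is computed as off + 1.
    static int readAsReported(InputStream in, byte[] b, int off) throws IOException {
        return in.read(b, off, off + 1);
    }

    // Hypothetical correction (an assumption): request exactly one byte at off.
    static int readOneByte(InputStream in, byte[] b, int off) throws IOException {
        return in.read(b, off, 1);
    }

    // Returns true if the call as reported is rejected by the bounds check.
    static boolean failsAsReported(int bufferLength, int off) throws IOException {
        // Stand-in source stream with plenty of data available.
        InputStream in = new ByteArrayInputStream(new byte[64]);
        try {
            readAsReported(in, new byte[bufferLength], off);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        int off = 5;
        // 2 * off + 1 = 11 is the smallest buffer length that passes the check.
        System.out.println(failsAsReported(10, off)); // true
        System.out.println(failsAsReported(11, off)); // false

        // With a length of 1 the same 10-byte buffer is accepted.
        InputStream in = new ByteArrayInputStream(new byte[64]);
        System.out.println(readOneByte(in, new byte[10], off)); // 1
    }
}
```

With off = 5, a 10-byte buffer fails and an 11-byte buffer passes, which 
matches the 2 * offset + 1 threshold above.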


> BZip2Codec incorrectly throws IndexOutOfBoundsException: offs(X) + len(X+1) > 
> dest.length(Y).
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17453
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17453
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common
>    Affects Versions: 3.1.2
>            Reporter: Christian Asmussen
>            Priority: Major
>
> The code in org.apache.hadoop.io.compress.BZip2Codec$BZip2CompressionInputStream,
> around line 496, seems to mistakenly add the offset to the length.
> {noformat}
>       if (this.posSM == POS_ADVERTISEMENT_STATE_MACHINE.ADVERTISE) {
>         result = this.input.read(b, off, off + 1); // <-- HERE
> {noformat}
> Here's a reference 
> [BZip2Codec.java:L496|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/BZip2Codec.java#L496]



