[ 
https://issues.apache.org/jira/browse/TEZ-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204003#comment-17204003
 ] 

László Bodor commented on TEZ-4234:
-----------------------------------

[~jeagles]: could you please take a look at [^TEZ-4234.03.patch]? It now 
seems to be a regression from TEZ-4135 (cc: [~rajesh.balamohan]).

1. The actual fix is one line; please see my explanation in the code comment 
there. Basically, it sets the codec's buffer size config back to its original 
value, so that compressor/decompressor instances no longer end up with a 
small, problematic buffer size:
{code}
configurableCodec.getConf().setInt(bufferSizeProp, originalSize);
{code}

2. getDecompressedInputStreamWithBufferSize was moved to CodecUtils, together 
with the fix.

3. CodecUtils.getCodec(conf) was introduced to get rid of repeated logic.

4. I left some unit tests in TestIFile that describe the issue (they depend on 
the native library and are expected to fail). They don't prove the fix 
directly; they just show the root cause (the scenario that [~gopalv] 
described), for which setting the originalSize back could be a solution.
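
To illustrate item 1, here is a minimal, self-contained sketch of the 
save-and-restore pattern. It is not the patch code: FakeConf is a hypothetical 
stand-in for the codec's configuration object, createStreamWithBufferSize is a 
hypothetical helper name, and the property name and default value are assumed 
for the Lz4 case. Wrapping the restore in try/finally is a choice made for this 
sketch only.

```java
import java.util.HashMap;
import java.util.Map;

class BufferSizeRestoreSketch {

    // Hypothetical stand-in for the configuration object shared by a codec
    // and every compressor/decompressor it creates later.
    static final class FakeConf {
        private final Map<String, Integer> values = new HashMap<>();

        int getInt(String key, int defaultValue) {
            return values.getOrDefault(key, defaultValue);
        }

        void setInt(String key, int value) {
            values.put(key, value);
        }
    }

    // Assumed property name and default, used here for illustration only.
    static final String BUFFER_SIZE_PROP = "io.compression.codec.lz4.buffersize";
    static final int DEFAULT_BUFFER_SIZE = 256 * 1024;

    // Temporarily lowers the buffer size for one stream, then puts the
    // original value back so later users of the same config are unaffected.
    // Returns the buffer size that was in effect while the stream was created.
    static int createStreamWithBufferSize(FakeConf codecConf, int requestedSize) {
        int originalSize = codecConf.getInt(BUFFER_SIZE_PROP, DEFAULT_BUFFER_SIZE);
        try {
            codecConf.setInt(BUFFER_SIZE_PROP, requestedSize);
            // Stand-in for creating the stream: just read the effective size.
            return codecConf.getInt(BUFFER_SIZE_PROP, DEFAULT_BUFFER_SIZE);
        } finally {
            // The restore step: without it, the small size would leak into
            // every compressor/decompressor created from this config later.
            codecConf.setInt(BUFFER_SIZE_PROP, originalSize);
        }
    }

    public static void main(String[] args) {
        FakeConf conf = new FakeConf();
        int seen = createStreamWithBufferSize(conf, 512);
        System.out.println("size seen during call: " + seen);
        System.out.println("size after call: "
            + conf.getInt(BUFFER_SIZE_PROP, DEFAULT_BUFFER_SIZE));
    }
}
```

Running main prints 512 for the size seen during the call and 262144 for the 
size afterwards, i.e. the temporary value does not outlive the call.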



> Compressor can cause IllegalArgumentException in Buffer.limit where limit 
> exceeds capacity
> ------------------------------------------------------------------------------------------
>
>                 Key: TEZ-4234
>                 URL: https://issues.apache.org/jira/browse/TEZ-4234
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Turner Eagles
>            Assignee: László Bodor
>            Priority: Blocker
>              Labels: 0.10_blocker
>             Fix For: 0.10.0
>
>         Attachments: TEZ-4234.01.patch, TEZ-4234.02.patch, TEZ-4234.03.patch, 
> TEZ-4234.repro.patch, TEZ-4234.wip.patch
>
>
> {code}
> java.lang.IllegalArgumentException
>   at java.nio.Buffer.limit(Buffer.java:275)
>   at org.apache.hadoop.io.compress.lz4.Lz4Compressor.compress(Lz4Compressor.java:240)
>   at org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
>   at org.apache.hadoop.io.compress.BlockCompressorStream.write(BlockCompressorStream.java:131)
>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at org.apache.tez.runtime.library.common.sort.impl.IFile$Writer.writeKVPair(IFile.java:423)
>   at org.apache.tez.runtime.library.common.sort.impl.IFile$Writer.append(IFile.java:392)
>   at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.spill(DefaultSorter.java:927)
>   at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.sortAndSpill(DefaultSorter.java:865)
>   at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.flush(DefaultSorter.java:729)
>   at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.close(OrderedPartitionedKVOutput.java:191)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:398)
>   at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:83)
>   at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2035)
>   at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
