[ https://issues.apache.org/jira/browse/TEZ-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825666#comment-15825666 ]

Rajesh Balamohan commented on TEZ-3569:
---------------------------------------

Cancelled the patch because I hit the following exception while running a job 
(q67):

{noformat}
Caused by: java.lang.IndexOutOfBoundsException
        at org.apache.hadoop.io.compress.BlockCompressorStream.write(BlockCompressorStream.java:91)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        at org.apache.tez.runtime.library.common.sort.impl.IFile$Writer.writeKVPair(IFile.java:418)
        at org.apache.tez.runtime.library.common.sort.impl.IFile$Writer.append(IFile.java:388)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.spill(PipelinedSorter.java:586)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:329)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:412)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:438)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:385)
        at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:167)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor$TezKVOutputCollector.collect(TezProcessor.java:253)
{noformat}
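
The trace does not show which argument was out of range. As an illustration only, here is a minimal sketch of the kind of offset/length guard that throws a message-less IndexOutOfBoundsException from a write(byte[], int, int) call. It assumes the guard follows the range contract documented for java.io.OutputStream.write(byte[], int, int). WriteBoundsSketch, isValidRange and write are hypothetical names for this sketch, not Tez or Hadoop code, and the sketch says nothing about which value the patch got wrong.

```java
public class WriteBoundsSketch {

    // Hypothetical helper: true when [off, off + len) lies inside a buffer
    // of bufLen bytes. Written as len <= bufLen - off to avoid int overflow.
    static boolean isValidRange(int bufLen, int off, int len) {
        return off >= 0 && len >= 0 && off <= bufLen && len <= bufLen - off;
    }

    // Hypothetical stand-in for a stream's write(byte[], int, int):
    // rejects a bad range before touching the buffer.
    static void write(byte[] b, int off, int len) {
        if (!isValidRange(b.length, off, len)) {
            throw new IndexOutOfBoundsException();
        }
        // A real stream would compress or forward the bytes here.
    }

    public static void main(String[] args) {
        byte[] buf = new byte[16];
        write(buf, 0, 16); // whole buffer: accepted
        try {
            write(buf, 8, 16); // runs 8 bytes past the end: rejected
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: off=8 len=16 bufLen=16");
        }
    }
}
```

If the guard is of this form, the caller passed an offset or length that does not fit the backing array, so the values handed to IFile$Writer.writeKVPair during the spill would be the first thing to log.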

> Improve buffer handling in PipelinedSorter after encountering initial spill condition
> -------------------------------------------------------------------------------------
>
>                 Key: TEZ-3569
>                 URL: https://issues.apache.org/jira/browse/TEZ-3569
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>         Attachments: TEZ-3569.1.patch
>
>
> With a large number of buffers, there is a possibility that these buffers are 
> not used effectively and could end up spilling more often (after 
> encountering the first spill condition). Will debug more and share more 
> details here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
