[
https://issues.apache.org/jira/browse/TEZ-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526314#comment-14526314
]
Gopal V commented on TEZ-2405:
------------------------------
[~rajesh.balamohan]: the patch looks good, +1.
However, the confusion in the code remains: we should investigate dropping the
old MR InputBuffer impl, which we can no longer fix.
{code}
public class InputBuffer extends FilterInputStream {
  ...
  public void reset(byte[] input, int start, int length) {
    this.buf = input;
    this.count = start+length;
    this.pos = start;
    ...
  }
  public int getPosition() { return pos; }
  public int getLength() { return count; }
{code}
This makes it obvious that InputBuffer.getLength() is unlike any other
getLength() call: it returns start+length, an end offset into the backing array
rather than a byte count, so it acts as a capacity-like limit whose meaning is
unclear (i.e., the other areas of the byte[] array might be owned by other
buffers).
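To make the pitfall concrete, here is a minimal sketch. SketchBuffer is a hypothetical stand-in that only mirrors the reset/getPosition/getLength excerpt above; it is not the Hadoop or Tez class, and remaining() and recordAsString() are illustrative helpers, not existing APIs. Under those assumptions, the number of bytes that belong to a record is getLength() - getPosition(), not getLength().

```java
import java.nio.charset.StandardCharsets;

public class InputBufferLengthSketch {

    /** Hypothetical stand-in that mirrors the excerpt above; not the real class. */
    static final class SketchBuffer {
        private byte[] buf;
        private int pos;
        private int count;

        void reset(byte[] input, int start, int length) {
            this.buf = input;
            this.count = start + length; // end offset into the shared array
            this.pos = start;
        }

        int getPosition() { return pos; }
        int getLength() { return count; }
        byte[] getData() { return buf; }
    }

    /** Bytes that actually belong to this buffer's record (illustrative helper). */
    static int remaining(SketchBuffer b) {
        return b.getLength() - b.getPosition();
    }

    /** Decodes only the record's own slice of the shared array. */
    static String recordAsString(SketchBuffer b) {
        return new String(b.getData(), b.getPosition(), remaining(b),
                StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // One backing array shared by three records: "aaaa", "bbbbbb", "cc".
        byte[] shared = "aaaabbbbbbcc".getBytes(StandardCharsets.US_ASCII);

        SketchBuffer key = new SketchBuffer();
        key.reset(shared, 4, 6); // the middle record

        System.out.println(key.getLength());     // 10: end offset, not the record size
        System.out.println(remaining(key));      // 6: the record size
        System.out.println(recordAsString(key)); // bbbbbb
    }
}
```

A comparator that treats getLength() as the record size would, in this sketch, read four bytes past the end of the middle record and into a region owned by another buffer.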
Post 0.7.x, we can rewrite this codepath to avoid this particular anti-pattern,
by dropping references to the old DataInputBuffer impl.
> PipelinedSorter can throw NPE with custom comparator
> ----------------------------------------------------
>
> Key: TEZ-2405
> URL: https://issues.apache.org/jira/browse/TEZ-2405
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Critical
> Attachments: TEZ-2405.1.patch
>
>
> If custom comparators are used, PipelinedSorter can throw an NPE, depending on
> the comparator implementation.
> {noformat}
> ], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.NullPointerException
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SpanIterator.compareTo(PipelinedSorter.java:837)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SpanIterator.compareTo(PipelinedSorter.java:767)
> at java.util.PriorityQueue.siftUpComparable(PriorityQueue.java:637)
> at java.util.PriorityQueue.siftUp(PriorityQueue.java:629)
> at java.util.PriorityQueue.offer(PriorityQueue.java:329)
> at java.util.PriorityQueue.add(PriorityQueue.java:306)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SpanMerger.add(PipelinedSorter.java:996)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SpanMerger.next(PipelinedSorter.java:1065)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$PartitionFilter.next(PipelinedSorter.java:936)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.spill(PipelinedSorter.java:366)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.flush(PipelinedSorter.java:406)
> at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.close(OrderedPartitionedKVOutput.java:183)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:355)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}