[ https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482081#comment-13482081 ]
Jonathan Coveney commented on PIG-2999:
---------------------------------------

I have a theory on what it is.

{code}
            case BinInterSedes.TINYBYTEARRAY:
            case BinInterSedes.SMALLBYTEARRAY:
            case BinInterSedes.BYTEARRAY: {
                type1 = DataType.BYTEARRAY;
                type2 = getGeneralizedDataType(dt2);
                if (type1 == type2) {
                    int basz1 = readSize(bb1, dt1);
                    int basz2 = readSize(bb2, dt2);
                    rc = org.apache.hadoop.io.WritableComparator.compareBytes(
                          bb1.array(), bb1.position(), basz1,
                          bb2.array(), bb2.position(), basz2);
                }
                break;
            }
{code}

In the old code, the act of comparing would have advanced the respective pointers in the bytebuffers. Now it doesn't. So after the comparison, assuming that the two are equal, it will keep going as if the next byte were the next data type (this would explain why the specific line it fails on is the DateTime case, in code that I bet is comparing bytearrays). The solution is to skip the bytebuffers ahead after doing the comparison. That's my guess, though.

> Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2999
>                 URL: https://issues.apache.org/jira/browse/PIG-2999
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11, 0.12
>            Reporter: Koji Noguchi
>
> I think I broke the build from PIG-2975. I see a couple of tests failing at BinInterSedesTupleRawComparator.
> {noformat}
> 12/10/22 22:26:15 WARN mapred.LocalJobRunner: job_local_0022
> java.nio.BufferUnderflowException
>         at java.nio.Buffer.nextGetIndex(Buffer.java:478)
>         at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:387)
>         at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinInterSedesDatum(BinInterSedes.java:829)
>         at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinSedesTuple(BinInterSedes.java:732)
>         at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compare(BinInterSedes.java:695)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSecondaryKeyComparator.compare(PigSecondaryKeyComparator.java:78)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
>         at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:139)
>         at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>         at org.apache.hadoop.mapred.ReduceTask$4.next(ReduceTask.java:625)
>         at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:117)
>         at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>         at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
> {noformat}
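
The fix guessed at in the comment (skip both bytebuffers ahead once the bytearray payloads have been compared) might look roughly like the sketch below. This is not the actual Pig patch. The class name SkipAfterCompare and the helpers record and compareAndSkip are hypothetical, and the local compareBytes is only a plain-Java stand-in for org.apache.hadoop.io.WritableComparator.compareBytes so the example runs without Hadoop on the classpath. It assumes heap buffers whose arrayOffset() is 0, as the snippet in the comment also does.

```java
import java.nio.ByteBuffer;

final class SkipAfterCompare {

    // Stand-in for WritableComparator.compareBytes: lexicographic comparison
    // of unsigned bytes. It only sees arrays and offsets, so it cannot move
    // the position of any ByteBuffer.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return l1 - l2;
    }

    // Hypothetical helper: compare the two payloads in place, then advance
    // both buffers past them so the next read starts at the next field.
    static int compareAndSkip(ByteBuffer bb1, int size1, ByteBuffer bb2, int size2) {
        int rc = compareBytes(bb1.array(), bb1.position(), size1,
                              bb2.array(), bb2.position(), size2);
        bb1.position(bb1.position() + size1);
        bb2.position(bb2.position() + size2);
        return rc;
    }

    // Hypothetical test record: a bytearray payload followed by one long field.
    static ByteBuffer record(byte[] payload, long next) {
        ByteBuffer bb = ByteBuffer.allocate(payload.length + Long.BYTES);
        bb.put(payload);
        bb.putLong(next);
        bb.flip();
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer bb1 = record(new byte[] {1, 2, 3}, 7L);
        ByteBuffer bb2 = record(new byte[] {1, 2, 3}, 9L);

        int rc = compareAndSkip(bb1, 3, bb2, 3);
        if (rc == 0) {
            // Payloads are equal, so the comparison moves on to the next field.
            rc = Long.compare(bb1.getLong(), bb2.getLong());
        }
        System.out.println("first record sorts before second: " + (rc < 0));
    }
}
```

Without the two position(...) calls in compareAndSkip, the getLong() reads would start at the first payload byte instead of at the long field, which is the kind of misaligned read the comment describes.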