[ 
https://issues.apache.org/jira/browse/PARQUET-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621709#comment-16621709
 ] 

ASF GitHub Bot commented on PARQUET-1417:
-----------------------------------------

gszadovszky closed pull request #522: PARQUET-1417: 
BINARY_AS_SIGNED_INTEGER_COMPARATOR fails with IOBE for the same arrays with 
the different length
URL: https://github.com/apache/parquet-mr/pull/522
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/parquet-column/src/main/java/org/apache/parquet/schema/PrimitiveComparator.java
 
b/parquet-column/src/main/java/org/apache/parquet/schema/PrimitiveComparator.java
index 5e9adbcf7..d343b0ea4 100644
--- 
a/parquet-column/src/main/java/org/apache/parquet/schema/PrimitiveComparator.java
+++ 
b/parquet-column/src/main/java/org/apache/parquet/schema/PrimitiveComparator.java
@@ -238,8 +238,8 @@ int compare(ByteBuffer b1, ByteBuffer b2) {
       int p1 = b1.position();
       int p2 = b2.position();
 
-      boolean isNegative1 = l1 > 0 ? b1.get(p1) < 0 : false;
-      boolean isNegative2 = l2 > 0 ? b2.get(p2) < 0 : false;
+      boolean isNegative1 = l1 > 0 && b1.get(p1) < 0;
+      boolean isNegative2 = l2 > 0 && b2.get(p2) < 0;
       if (isNegative1 != isNegative2) {
         return isNegative1 ? -1 : 1;
       }
@@ -259,7 +259,7 @@ int compare(ByteBuffer b1, ByteBuffer b2) {
 
       // The beginning of the longer buffer equals to the padding or the 
lengths are equal
       if (result == 0) {
-        result = compare(l1, b1, p1, b2, p2);
+        result = compare(Math.min(l1, l2), b1, p1, b2, p2);
       }
       return result;
     }
diff --git 
a/parquet-column/src/test/java/org/apache/parquet/schema/TestPrimitiveComparator.java
 
b/parquet-column/src/test/java/org/apache/parquet/schema/TestPrimitiveComparator.java
index 3f9d6431b..0bf359941 100644
--- 
a/parquet-column/src/test/java/org/apache/parquet/schema/TestPrimitiveComparator.java
+++ 
b/parquet-column/src/test/java/org/apache/parquet/schema/TestPrimitiveComparator.java
@@ -23,6 +23,8 @@
 
 import java.math.BigInteger;
 import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.List;
 
 import static org.apache.parquet.schema.PrimitiveComparator.BOOLEAN_COMPARATOR;
 import static org.apache.parquet.schema.PrimitiveComparator.DOUBLE_COMPARATOR;
@@ -249,6 +251,23 @@ public void testBinaryAsSignedIntegerComparator() {
             ByteBuffer.wrap(new 
BigInteger("9999999999999999999999999999999999999999").toByteArray())));
   }
 
+  @Test
+  public void testBinaryAsSignedIntegerComparatorWithEquals() {
+    List<Binary> valuesToCompare = new ArrayList<>();
+    valuesToCompare.add(Binary.fromConstantByteBuffer(ByteBuffer.wrap(new 
byte[] { 0, 0, -108 })));
+    valuesToCompare.add(Binary.fromConstantByteBuffer(ByteBuffer.wrap(new 
byte[] { 0, 0, 0, 0, 0, -108 })));
+    valuesToCompare.add(Binary.fromConstantByteBuffer(ByteBuffer.wrap(new 
byte[] { 0, 0, 0, -108 })));
+    valuesToCompare.add(Binary.fromConstantByteBuffer(ByteBuffer.wrap(new 
byte[] { 0, 0, 0, 0, -108 })));
+    valuesToCompare.add(Binary.fromConstantByteBuffer(ByteBuffer.wrap(new 
byte[] { 0, -108 })));
+
+    for (Binary v1 : valuesToCompare) {
+      for (Binary v2 : valuesToCompare) {
+        assertEquals(String.format("Wrong result of comparison %s and %s", v1, 
v2),
+            0, BINARY_AS_SIGNED_INTEGER_COMPARATOR.compare(v1, v2));
+      }
+    }
+  }
+
   private <T> void testObjectComparator(PrimitiveComparator<T> comparator, 
T... valuesInAscendingOrder) {
     for (int i = 0; i < valuesInAscendingOrder.length; ++i) {
       for (int j = 0; j < valuesInAscendingOrder.length; ++j) {


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> BINARY_AS_SIGNED_INTEGER_COMPARATOR fails with IOBE for the same arrays with 
> the different length
> -------------------------------------------------------------------------------------------------
>
>                 Key: PARQUET-1417
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1417
>             Project: Parquet
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Volodymyr Vysotskyi
>            Assignee: Volodymyr Vysotskyi
>            Priority: Major
>              Labels: pull-request-available
>
> {{BINARY_AS_SIGNED_INTEGER_COMPARATOR}} fails when the same byte arrays but 
> with the different number leading zeros are compared:
> {code:java}
>     BINARY_AS_SIGNED_INTEGER_COMPARATOR.compare(
>         Binary.fromConstantByteBuffer(ByteBuffer.wrap(new byte[] { 0, 0, -108 
> })),
>         Binary.fromConstantByteBuffer(ByteBuffer.wrap(new byte[] { 0, -108 
> })));
> {code}
> Error is:
> {noformat}
> java.lang.IndexOutOfBoundsException
>       at java.nio.Buffer.checkIndex(Buffer.java:540)
>       at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
>       at 
> org.apache.parquet.schema.PrimitiveComparator$9.compare(PrimitiveComparator.java:280)
>       at 
> org.apache.parquet.schema.PrimitiveComparator$9.compare(PrimitiveComparator.java:262)
>       at 
> org.apache.parquet.schema.PrimitiveComparator$BinaryComparator.compareNotNulls(PrimitiveComparator.java:186)
>       at 
> org.apache.parquet.schema.PrimitiveComparator$BinaryComparator.compareNotNulls(PrimitiveComparator.java:183)
>       at 
> org.apache.parquet.schema.PrimitiveComparator.compare(PrimitiveComparator.java:63)
> {noformat}
> The problem is that {{BINARY_AS_SIGNED_INTEGER_COMPARATOR.compare(ByteBuffer 
> b1, ByteBuffer b2)}} method passes the length of the first {{ByteBuffer}}, 
> but it should pass the less length since padding was calculated and passed 
> for the {{ByteBuffer}} with greater length to the {{compare(int length, 
> ByteBuffer b1, int p1, ByteBuffer b2, int p2)}} method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to