[ 
https://issues.apache.org/jira/browse/PARQUET-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770426#comment-17770426
 ] 

ASF GitHub Bot commented on PARQUET-2357:
-----------------------------------------

fengjiajie commented on code in PR #1160:
URL: https://github.com/apache/parquet-mr/pull/1160#discussion_r1341335124


##########
parquet-encoding/src/test/java/org/apache/parquet/bytes/TestCapacityByteArrayOutputStream.java:
##########
@@ -49,6 +49,41 @@ public void testWriteArray() throws Throwable {
     validate(capacityByteArrayOutputStream, v * 3);
   }
 
+  @Test
+  public void testWriteArrayExpand() throws Throwable {
+    CapacityByteArrayOutputStream capacityByteArrayOutputStream = newCapacityBAOS(2);
+    assertEquals(0, capacityByteArrayOutputStream.getCapacity());
+
+    byte[] toWrite = {(byte) (1), (byte) (2), (byte) (3), (byte) (4)};
+    int toWriteOffset = 0;
+    int writeLength = 2;
+    // write 2 bytes array
+    capacityByteArrayOutputStream.write(toWrite, toWriteOffset, writeLength);
+    toWriteOffset += writeLength;
+    assertEquals(2, capacityByteArrayOutputStream.size());
+    assertEquals(2, capacityByteArrayOutputStream.getCapacity());
+
+    // write 1 byte array, expand capacity to 4
+    writeLength = 1;
+    capacityByteArrayOutputStream.write(toWrite, toWriteOffset, writeLength);
+    toWriteOffset += writeLength;
+    assertEquals(3, capacityByteArrayOutputStream.size());
+    assertEquals(4, capacityByteArrayOutputStream.getCapacity());
+
+    // write 1 byte array, not expand
+    capacityByteArrayOutputStream.write(toWrite, toWriteOffset, writeLength);
+    assertEquals(4, capacityByteArrayOutputStream.size());
+    assertEquals(4, capacityByteArrayOutputStream.getCapacity());

Review Comment:
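   The original version would expand the capacity to 8 here, which is unnecessary: this write exactly fills the remaining space in the current buffer, so no new buffer is needed.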
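
   The difference can be reproduced with a toy model of the boundary check. This is only a sketch, not the parquet-mr class: `SlabSketch`, its `expandWhenExactlyFull` flag, and the simplified growth rule (first buffer uses the initial size, each later buffer matches the bytes written so far) are assumptions made for illustration.

```java
import java.nio.ByteBuffer;

/** Hypothetical, simplified slab-backed stream used only to illustrate the boundary check. */
class SlabSketch {
  private final int initialSlabSize;
  // true models a `len >= remaining` check, false models `len > remaining`.
  private final boolean expandWhenExactlyFull;
  private ByteBuffer currentSlab = ByteBuffer.allocate(0);
  private int capacity = 0; // total bytes allocated across all slabs
  private int size = 0;     // total bytes written

  SlabSketch(int initialSlabSize, boolean expandWhenExactlyFull) {
    this.initialSlabSize = initialSlabSize;
    this.expandWhenExactlyFull = expandWhenExactlyFull;
  }

  // Assumed growth rule: the first slab uses the initial size, later slabs
  // are as large as everything written so far, which doubles the capacity.
  private void addSlab(int minimumSize) {
    int next = Math.max(size == 0 ? initialSlabSize : size, minimumSize);
    currentSlab = ByteBuffer.allocate(next);
    capacity += next;
  }

  void write(byte[] b, int off, int len) {
    int remaining = currentSlab.remaining();
    boolean needsNewSlab = expandWhenExactlyFull ? len >= remaining : len > remaining;
    if (needsNewSlab) {
      // Fill what is left of the current slab, then spill into a new one.
      currentSlab.put(b, off, remaining);
      size += remaining;
      addSlab(len - remaining);
      currentSlab.put(b, off + remaining, len - remaining);
      size += len - remaining;
    } else {
      currentSlab.put(b, off, len);
      size += len;
    }
  }

  int capacity() { return capacity; }

  int size() { return size; }
}
```

   Replaying the three writes from the test (2 bytes, 1 byte, 1 byte) against `new SlabSketch(2, false)` ends with a capacity of 4, while `new SlabSketch(2, true)` allocates an extra, empty buffer on the last write and ends with a capacity of 8.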





> Modest refactor of CapacityByteArrayOutputStream
> ------------------------------------------------
>
>                 Key: PARQUET-2357
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2357
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Feng Jiajie
>            Priority: Minor
>             Fix For: 1.14.0
>
>
> Optimizations for the CapacityByteArrayOutputStream class:
>  # {{currentSlabIndex}} duplicates {{currentSlab.position()}}, so there is 
> no need to maintain the {{currentSlabIndex}} variable.
>  # When writing an array whose length equals the remaining capacity of the 
> current buffer, there is no need to allocate a new buffer.
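
The first point can be checked in isolation. The sketch below is illustrative only and is not taken from the patch: `PositionSketch` and `writeAndTrack` are hypothetical names. It relies on the fact that a relative `ByteBuffer.put` advances the buffer's position, so a hand-maintained index ends up holding the same value.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper showing that a manual write index mirrors ByteBuffer.position(). */
class PositionSketch {
  // Writes every byte of `data` into a fresh buffer while also counting by hand.
  // Returns {manualIndex, buffer.position()} so the caller can compare the two.
  static int[] writeAndTrack(int slabSize, byte[] data) {
    ByteBuffer slab = ByteBuffer.allocate(slabSize);
    int manualIndex = 0;
    for (byte b : data) {
      slab.put(b);   // relative put: advances slab.position() by one
      manualIndex++; // the bookkeeping that becomes redundant
    }
    return new int[] {manualIndex, slab.position()};
  }
}
```

Calling `PositionSketch.writeAndTrack(8, new byte[] {1, 2, 3})` returns two equal values, which is why the separate counter can be dropped in favor of `position()`.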



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
