[
https://issues.apache.org/jira/browse/PARQUET-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770426#comment-17770426
]
ASF GitHub Bot commented on PARQUET-2357:
-----------------------------------------
fengjiajie commented on code in PR #1160:
URL: https://github.com/apache/parquet-mr/pull/1160#discussion_r1341335124
##########
parquet-encoding/src/test/java/org/apache/parquet/bytes/TestCapacityByteArrayOutputStream.java:
##########
@@ -49,6 +49,41 @@ public void testWriteArray() throws Throwable {
validate(capacityByteArrayOutputStream, v * 3);
}
+ @Test
+ public void testWriteArrayExpand() throws Throwable {
+ CapacityByteArrayOutputStream capacityByteArrayOutputStream = newCapacityBAOS(2);
+ assertEquals(0, capacityByteArrayOutputStream.getCapacity());
+
+ byte[] toWrite = {(byte) (1), (byte) (2), (byte) (3), (byte) (4)};
+ int toWriteOffset = 0;
+ int writeLength = 2;
+ // write 2 bytes array
+ capacityByteArrayOutputStream.write(toWrite, toWriteOffset, writeLength);
+ toWriteOffset += writeLength;
+ assertEquals(2, capacityByteArrayOutputStream.size());
+ assertEquals(2, capacityByteArrayOutputStream.getCapacity());
+
+ // write 1 byte array, expand capacity to 4
+ writeLength = 1;
+ capacityByteArrayOutputStream.write(toWrite, toWriteOffset, writeLength);
+ toWriteOffset += writeLength;
+ assertEquals(3, capacityByteArrayOutputStream.size());
+ assertEquals(4, capacityByteArrayOutputStream.getCapacity());
+
+ // write 1 byte array, not expand
+ capacityByteArrayOutputStream.write(toWrite, toWriteOffset, writeLength);
+ assertEquals(4, capacityByteArrayOutputStream.size());
+ assertEquals(4, capacityByteArrayOutputStream.getCapacity());
Review Comment:
The original version would expand the capacity to 8 here, which is unnecessary.
> Modest refactor of CapacityByteArrayOutputStream
> ------------------------------------------------
>
> Key: PARQUET-2357
> URL: https://issues.apache.org/jira/browse/PARQUET-2357
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Reporter: Feng Jiajie
> Priority: Minor
> Fix For: 1.14.0
>
>
> Optimization for the CapacityByteArrayOutputStream class:
> # The functionality of {{currentSlabIndex}} is the same as
> {{currentSlab.position()}}, so there is no need to maintain the
> {{currentSlabIndex}} variable.
> # When writing an array of length equal to the remaining capacity of the
> buffer, there is no need to expand to a new buffer.
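The two points above can be illustrated with a minimal sketch. This is not the parquet-mr implementation: the class name SlabSketch, its methods, and its fixed-size growth policy are hypothetical and chosen only for illustration. It tracks the write position through ByteBuffer.position() rather than a separate index field, and it adds a new slab only when the write is strictly larger than the remaining space.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a slab-based output stream.
// The growth policy (fixed-size slabs) is an assumption of this sketch
// and differs from the real CapacityByteArrayOutputStream.
final class SlabSketch {
    private final int slabSizeHint;
    private final List<ByteBuffer> slabs = new ArrayList<>();
    // Starts as an empty buffer so the first write allocates a slab.
    private ByteBuffer currentSlab = ByteBuffer.allocate(0);
    private int capacity = 0;
    private int size = 0;

    SlabSketch(int slabSizeHint) {
        this.slabSizeHint = slabSizeHint;
    }

    void write(byte[] b, int off, int len) {
        if (off < 0 || len < 0 || off + len > b.length) {
            throw new IndexOutOfBoundsException();
        }
        // Strictly greater: a write that exactly fills the slab needs no new slab.
        if (len > currentSlab.remaining()) {
            int first = currentSlab.remaining();
            currentSlab.put(b, off, first);        // fill what is left
            int rest = len - first;
            addSlab(rest);                         // then continue in a new slab
            currentSlab.put(b, off + first, rest);
        } else {
            currentSlab.put(b, off, len);
        }
        size += len;
    }

    private void addSlab(int minimumSize) {
        int slabSize = Math.max(slabSizeHint, minimumSize);
        currentSlab = ByteBuffer.allocate(slabSize);
        slabs.add(currentSlab);
        capacity += slabSize;
    }

    int size() { return size; }

    int capacity() { return capacity; }

    // No separate index field: the buffer already tracks the write position.
    int positionInCurrentSlab() { return currentSlab.position(); }

    public static void main(String[] args) {
        SlabSketch s = new SlabSketch(2);
        byte[] data = {1, 2, 3, 4};
        s.write(data, 0, 2);
        s.write(data, 2, 1);
        s.write(data, 3, 1); // exactly fills the second slab, no expansion
        System.out.println(s.size() + " bytes in capacity " + s.capacity());
    }
}
```

With a comparison of `>=` instead of `>`, the last write in main would allocate a further slab even though the data fits exactly, which is the unnecessary expansion the second point describes.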