[ 
https://issues.apache.org/jira/browse/PARQUET-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774303#comment-17774303
 ] 

ASF GitHub Bot commented on PARQUET-2357:
-----------------------------------------

fengjiajie commented on PR #1165:
URL: https://github.com/apache/parquet-mr/pull/1165#issuecomment-1758841809

   @wgtmac  Thank you for your review. 
   
   write() no longer checks for overflow because addSlab() already 
ensures that bytesAllocated cannot overflow.
   Every subsequent write() maintains **bytesAllocated >= bytesUsed**, so 
bytesUsed cannot overflow either.
   
   That said, I agree that it would be useful to have the check here as well, 
so I have updated the commit.
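
   To make the argument concrete, here is a minimal sketch of the accounting 
only. It is not the actual CapacityByteArrayOutputStream code: the class name 
`SlabAccounting`, the simplified `write(int len)` signature, and the sizing of 
the new slab are assumptions made for illustration, and no real buffers are 
allocated.

```java
// Hypothetical, simplified model of the byte counters only.
final class SlabAccounting {
    private int bytesAllocated = 0;
    private int bytesUsed = 0;

    // Math.addExact throws ArithmeticException instead of wrapping around,
    // so bytesAllocated can never silently overflow.
    void addSlab(int slabSize) {
        bytesAllocated = Math.addExact(bytesAllocated, slabSize);
    }

    // Assumes len >= 0. Space is reserved before it is used, which keeps
    // bytesUsed <= bytesAllocated after every successful call.
    void write(int len) {
        int free = bytesAllocated - bytesUsed;
        if (len > free) {
            addSlab(len - free); // throws before any counter is changed
        }
        // Redundant given the invariant, but cheap and explicit.
        bytesUsed = Math.addExact(bytesUsed, len);
    }

    int bytesAllocated() { return bytesAllocated; }
    int bytesUsed() { return bytesUsed; }
}
```

   In this sketch the second `Math.addExact` can never throw, because 
`addSlab` fails first. It only serves as a defensive check in case the 
invariant is broken by a later change.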




> Modest refactor of CapacityByteArrayOutputStream
> ------------------------------------------------
>
>                 Key: PARQUET-2357
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2357
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Feng Jiajie
>            Priority: Minor
>             Fix For: 1.14.0
>
>
> Optimization for the CapacityByteArrayOutputStream class:
>  # The functionality of {{currentSlabIndex}} is the same as 
> {{currentSlab.position()}}, so there is no need to maintain the 
> {{currentSlabIndex}} variable.
>  # When writing an array of length equal to the remaining capacity of the 
> buffer, there is no need to expand to a new buffer.
>  # If the {{addSlab}} operation has already implemented safeguards using 
> {{Math.addExact}} to prevent overflow of {{bytesAllocated}} and 
> {{bytesUsed}}, it is unnecessary to perform additional checks during the 
> {{write}} operation.
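
Points 1 and 2 of the description above can be illustrated with a rough 
sketch. This is not the parquet-mr implementation: the class name 
`SlabWriter`, its constructor, and the slab sizing policy are assumptions 
chosen for brevity. It uses `currentSlab.position()` in place of a separate 
index field, and a strict `>` comparison so that an exact fit stays in the 
current slab.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified slab writer for illustration only.
final class SlabWriter {
    private final List<ByteBuffer> slabs = new ArrayList<>();
    private ByteBuffer currentSlab;

    SlabWriter(int initialSlabSize) {
        currentSlab = ByteBuffer.allocate(initialSlabSize);
        slabs.add(currentSlab);
    }

    void write(byte[] b, int off, int len) {
        int remaining = currentSlab.remaining();
        if (len > remaining) { // strict: len == remaining needs no new slab
            // Fill what is left of the current slab, then spill the rest.
            currentSlab.put(b, off, remaining);
            int nextSize = Math.max(len - remaining, currentSlab.capacity());
            currentSlab = ByteBuffer.allocate(nextSize);
            slabs.add(currentSlab);
            currentSlab.put(b, off + remaining, len - remaining);
        } else {
            currentSlab.put(b, off, len);
        }
    }

    int slabCount() { return slabs.size(); }

    // The buffer's own position doubles as the write index (point 1).
    int currentPosition() { return currentSlab.position(); }
}
```

With a 4-byte initial slab, writing exactly 4 bytes leaves the writer with a 
single full slab. A new slab is allocated only when a further byte arrives.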



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
