mapleFU commented on issue #14923:
URL: https://github.com/apache/arrow/issues/14923#issuecomment-1369959539

   The code looks like this:
   
   ```java
     private void loadNewBlockToBuffer() throws IOException {
       try {
         minDeltaInCurrentBlock = BytesUtils.readZigZagVarLong(in);
       } catch (IOException e) {
         throw new ParquetDecodingException("can not read min delta in current block", e);
       }
   
       readBitWidthsForMiniBlocks();
   
       // mini block is atomic for reading, we read a mini block when there are more values left
       int i;
       for (i = 0; i < config.miniBlockNumInABlock && valuesBuffered < totalValueCount; i++) {
         BytePackerForLong packer = Packer.LITTLE_ENDIAN.newBytePackerForLong(bitWidths[i]);
         unpackMiniBlock(packer);
       }
   
       //calculate values from deltas unpacked for current block
       int valueUnpacked=i*config.miniBlockSizeInValues;
       for (int j = valuesBuffered-valueUnpacked; j < valuesBuffered; j++) {
         int index = j;
         valuesBuffer[index] += minDeltaInCurrentBlock + valuesBuffer[index - 1];
       }
     }
   
     private void readBitWidthsForMiniBlocks() {
       for (int i = 0; i < config.miniBlockNumInABlock; i++) {
         try {
           bitWidths[i] = BytesUtils.readIntLittleEndianOnOneByte(in);
         } catch (IOException e) {
           throw new ParquetDecodingException("Can not decode bitwidth in block header", e);
         }
       }
     }
   
     private int[] bitWidths;
   ```
   
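   * `readBitWidthsForMiniBlocks` reads every bit width in the block header without checking its size.
   * A bit width in `bitWidths` is used to unpack a mini block only while `valuesBuffered < totalValueCount`, so the widths of unneeded mini blocks are never used.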
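
The two points above can be sketched in isolation. This is only an illustration of the "read everything, validate on use" pattern, not the parquet-mr code and not the patch discussed in this issue. The class `LazyBitWidthCheck`, the method `usedBitWidths`, and the 64-bit limit are all hypothetical names and assumptions chosen for the example.

```java
import java.util.Arrays;

// Hypothetical, self-contained sketch: bit widths are taken from the header
// unchecked, and a width is validated only when its mini block is unpacked.
class LazyBitWidthCheck {
    // Assumption for this sketch: deltas fit in a Java long, so at most 64 bits.
    static final int MAX_BIT_WIDTH = 64;

    /**
     * Returns the bit widths of the mini blocks that would actually be unpacked.
     *
     * @param header        one bit width per mini block, as read from the block header
     * @param miniBlockSize number of values in each mini block
     * @param valuesLeft    number of values still to be decoded from this block
     */
    static int[] usedBitWidths(int[] header, int miniBlockSize, int valuesLeft) {
        int used = 0;
        int buffered = 0;
        // Same loop shape as above: stop as soon as enough values are buffered.
        for (int i = 0; i < header.length && buffered < valuesLeft; i++) {
            // Validate only the widths that are about to be used.
            if (header[i] < 0 || header[i] > MAX_BIT_WIDTH) {
                throw new IllegalArgumentException(
                    "bit width " + header[i] + " of mini block " + i + " is out of range");
            }
            buffered += miniBlockSize; // a mini block is always unpacked as a whole
            used++;
        }
        return Arrays.copyOf(header, used);
    }

    public static void main(String[] args) {
        // The last two widths are garbage, but their mini blocks are never needed.
        int[] header = {3, 5, 200, 255};
        System.out.println(Arrays.toString(usedBitWidths(header, 32, 40))); // prints [3, 5]
    }
}
```

With 40 values left and 32 values per mini block, only the first two mini blocks are unpacked, so the out-of-range widths 200 and 255 are never inspected. If an out-of-range width belongs to a mini block that is needed, the sketch rejects it at that point.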
   
   I think your fix is much better; let's get it into our codebase :)

