This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/orc.git


The following commit(s) were added to refs/heads/main by this push:
     new 8cf9057fc ORC-1384: Fix `ArrayIndexOutOfBoundsException` when reading dictionary stream bigger than dictionary
8cf9057fc is described below

commit 8cf9057fc498f977125be3b721daf2170330b3f9
Author: Zoltan Ratkai <[email protected]>
AuthorDate: Thu Mar 9 12:51:22 2023 -0800

    ORC-1384: Fix `ArrayIndexOutOfBoundsException` when reading dictionary stream bigger than dictionary
    
    ### What changes were proposed in this pull request?
    Avoid `ArrayIndexOutOfBoundsException` when the dictionary stream is bigger than the dictionary. Compare the size of the dictionary with the available input and read only the smaller of the two.
    
    ### Why are the changes needed?
    In Hive, LLAP reads data in 4 kB blocks, which leads to an `ArrayIndexOutOfBoundsException` when the dictionary is smaller than a block.
    
    ### How was this patch tested?
    Tested with Hive's qtest, because the necessary subclasses are not available in this repository.
    
    Closes #1431 from zratkai/ORC-1384.
    
    Lead-authored-by: Zoltan Ratkai <[email protected]>
    Co-authored-by: Dongjoon Hyun <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java b/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
index ecc02fb8d..2a2adf50d 100644
--- a/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
+++ b/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
@@ -2292,10 +2292,15 @@ public class TreeReaderFactory {
           int dictionaryBufferSize = dictionaryOffsets[dictionaryOffsets.length - 1];
           dictionaryBuffer = new byte[dictionaryBufferSize];
           int pos = 0;
-          int chunkSize = in.available();
-          byte[] chunkBytes = new byte[chunkSize];
+          // check if dictionary size is smaller than available stream size
+          // to avoid ArrayIndexOutOfBoundsException
+          int readSize = Math.min(in.available(), dictionaryBufferSize);
+          byte[] chunkBytes = new byte[readSize];
           while (pos < dictionaryBufferSize) {
-            int currentLength = in.read(chunkBytes, 0, chunkSize);
+            int currentLength = in.read(chunkBytes, 0, readSize);
+            // check if dictionary size is smaller than available stream size
+            // to avoid ArrayIndexOutOfBoundsException
+            currentLength = Math.min(currentLength, dictionaryBufferSize - pos);
             System.arraycopy(chunkBytes, 0, dictionaryBuffer, pos, currentLength);
             pos += currentLength;
           }
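
For readers who want to try the bounded copy outside of ORC, the loop in the hunk above can be sketched as a standalone helper. This is only an illustration, not the code in `TreeReaderFactory`: the class name `DictionaryReadSketch` and the method `readDictionary` are hypothetical. The end-of-stream check and the minimum chunk size of 1 are also assumptions added here and are not part of the commit. They guard against `in.read` returning -1 and against `in.available()` reporting 0.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical standalone helper; not part of the ORC code base.
final class DictionaryReadSketch {

  // Copies exactly dictionaryBufferSize bytes from the stream, even when the
  // stream holds (or reports as available) more bytes than the dictionary needs.
  static byte[] readDictionary(InputStream in, int dictionaryBufferSize)
      throws IOException {
    byte[] dictionaryBuffer = new byte[dictionaryBufferSize];
    // As in the patch: never read a chunk larger than the dictionary.
    // Assumption added here: use at least 1 so the loop makes progress
    // when available() reports 0.
    int readSize = Math.max(1, Math.min(in.available(), dictionaryBufferSize));
    byte[] chunkBytes = new byte[readSize];
    int pos = 0;
    while (pos < dictionaryBufferSize) {
      int currentLength = in.read(chunkBytes, 0, readSize);
      // Assumption added here: fail clearly if the stream ends early.
      if (currentLength < 0) {
        throw new EOFException("stream ended after " + pos + " of "
            + dictionaryBufferSize + " dictionary bytes");
      }
      // As in the patch: copy no more than the space left in the dictionary.
      currentLength = Math.min(currentLength, dictionaryBufferSize - pos);
      System.arraycopy(chunkBytes, 0, dictionaryBuffer, pos, currentLength);
      pos += currentLength;
    }
    return dictionaryBuffer;
  }
}
```

Calling `readDictionary` with a 4096-byte `ByteArrayInputStream` and a dictionary size of 10 should return only the first 10 bytes. That input shape corresponds to the 4 kB block case described in the commit message.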
