This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch branch-1.8
in repository https://gitbox.apache.org/repos/asf/orc.git
The following commit(s) were added to refs/heads/branch-1.8 by this push:
new 5576f18ef ORC-1384: Fix `ArrayIndexOutOfBoundsException` when reading
dictionary stream bigger then dictionary
5576f18ef is described below
commit 5576f18efcedd1136f67ac220771da25f7d018b8
Author: Zoltan Ratkai <[email protected]>
AuthorDate: Thu Mar 9 12:51:22 2023 -0800
ORC-1384: Fix `ArrayIndexOutOfBoundsException` when reading dictionary
stream bigger then dictionary
### What changes were proposed in this pull request?
Avoid `ArrayIndexOutOfBoundsException` when reading a dictionary stream
bigger than the dictionary. Compare the dictionary size with the available
input and read only the smaller of the two.
### Why are the changes needed?
In Hive, LLAP reads data in 4 kB blocks, which leads to an
`ArrayIndexOutOfBoundsException` when the dictionary is smaller than a block.
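The failure mode can be reproduced outside ORC with a small sketch. The class `OversizedChunkDemo` and the method `readUnbounded` are hypothetical names used only for illustration. The loop mirrors the pre-patch code in the diff, and the 4096-byte stream and 100-byte dictionary are assumed sizes chosen to mimic a 4 kB block feeding a smaller dictionary.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of ORC or Hive.
class OversizedChunkDemo {
  // Mirrors the pre-patch loop: the chunk is sized from in.available()
  // without regard to the size of the destination buffer.
  static byte[] readUnbounded(InputStream in, int dictionaryBufferSize)
      throws IOException {
    byte[] dictionaryBuffer = new byte[dictionaryBufferSize];
    int pos = 0;
    int chunkSize = in.available();
    byte[] chunkBytes = new byte[chunkSize];
    while (pos < dictionaryBufferSize) {
      int currentLength = in.read(chunkBytes, 0, chunkSize);
      // Fails when currentLength exceeds the space left in dictionaryBuffer.
      System.arraycopy(chunkBytes, 0, dictionaryBuffer, pos, currentLength);
      pos += currentLength;
    }
    return dictionaryBuffer;
  }

  public static void main(String[] args) throws IOException {
    // Assumed sizes: a 4096-byte stream and a 100-byte dictionary.
    InputStream in = new ByteArrayInputStream(new byte[4096]);
    try {
      readUnbounded(in, 100);
      System.out.println("no exception");
    } catch (IndexOutOfBoundsException e) {
      System.out.println("copy failed: " + e.getClass().getSimpleName());
    }
  }
}
```

The catch clause uses `IndexOutOfBoundsException`, the type documented for `System.arraycopy`, which also covers the `ArrayIndexOutOfBoundsException` subclass named in this report.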
### How was this patch tested?
Tested with Hive's qtest, since the necessary subclasses are not available
in this repository.
Closes #1431 from zratkai/ORC-1384.
Lead-authored-by: Zoltan Ratkai <[email protected]>
Co-authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 8cf9057fc498f977125be3b721daf2170330b3f9)
Signed-off-by: Dongjoon Hyun <[email protected]>
---
java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java b/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
index fff419956..f5ed69dc2 100644
--- a/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
+++ b/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
@@ -2235,10 +2235,15 @@ public class TreeReaderFactory {
int dictionaryBufferSize = dictionaryOffsets[dictionaryOffsets.length - 1];
dictionaryBuffer = new byte[dictionaryBufferSize];
int pos = 0;
- int chunkSize = in.available();
- byte[] chunkBytes = new byte[chunkSize];
+ // check if dictionary size is smaller than available stream size
+ // to avoid ArrayIndexOutOfBoundsException
+ int readSize = Math.min(in.available(), dictionaryBufferSize);
+ byte[] chunkBytes = new byte[readSize];
while (pos < dictionaryBufferSize) {
- int currentLength = in.read(chunkBytes, 0, chunkSize);
+ int currentLength = in.read(chunkBytes, 0, readSize);
+ // check if dictionary size is smaller than available stream size
+ // to avoid ArrayIndexOutOfBoundsException
+ currentLength = Math.min(currentLength, dictionaryBufferSize - pos);
System.arraycopy(chunkBytes, 0, dictionaryBuffer, pos, currentLength);
pos += currentLength;
}
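
For readers who want to try the bounded read in isolation, the following is a minimal standalone sketch of the patched loop. The class `BoundedDictionaryRead` and the method `readDictionary` are hypothetical names, not part of `TreeReaderFactory`. Two guards in the sketch are assumptions that go beyond this patch: the chunk is kept at least one byte long because `InputStream.available()` may return 0, and a negative return from `read` is turned into an `EOFException` instead of being passed to `System.arraycopy`.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class; a sketch of the patched loop, not ORC code.
class BoundedDictionaryRead {
  static byte[] readDictionary(InputStream in, int dictionaryBufferSize)
      throws IOException {
    byte[] dictionaryBuffer = new byte[dictionaryBufferSize];
    // As in the patch: never size the chunk larger than the dictionary.
    // Assumption beyond the patch: keep the chunk at least 1 byte long,
    // since available() is allowed to return 0.
    int readSize = Math.max(1, Math.min(in.available(), dictionaryBufferSize));
    byte[] chunkBytes = new byte[readSize];
    int pos = 0;
    while (pos < dictionaryBufferSize) {
      int currentLength = in.read(chunkBytes, 0, readSize);
      // Assumption beyond the patch: fail explicitly at end of stream.
      if (currentLength < 0) {
        throw new EOFException("stream ended after " + pos + " bytes");
      }
      // As in the patch: copy no more than the space that remains.
      currentLength = Math.min(currentLength, dictionaryBufferSize - pos);
      System.arraycopy(chunkBytes, 0, dictionaryBuffer, pos, currentLength);
      pos += currentLength;
    }
    return dictionaryBuffer;
  }

  public static void main(String[] args) throws IOException {
    // Assumed sizes: a 4096-byte stream feeding a 100-byte dictionary.
    byte[] stream = new byte[4096];
    for (int i = 0; i < stream.length; i++) {
      stream[i] = (byte) i;
    }
    byte[] dict = readDictionary(new ByteArrayInputStream(stream), 100);
    System.out.println("read " + dict.length + " dictionary bytes");
  }
}
```

With the assumed sizes, the chunk is capped at 100 bytes, so a single `read` fills the dictionary and the rest of the stream is left unread.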