cxzl25 opened a new pull request, #1412:
URL: https://github.com/apache/orc/pull/1412
### What changes were proposed in this pull request?
When the `chunkIndex` calculation in `DynamicByteArray` overflows, an NPE is thrown.
This PR adds a log message that tells users which ORC parameters they can
configure to avoid the problem.
### Why are the changes needed?
When the string being written is very large, the `grow` calculation may overflow.
The backing array is then not expanded, and the subsequent `System.arraycopy` throws an NPE.
org.apache.orc.impl.DynamicByteArray#add(byte[], int, int)
```java
grow((length + valueLength) / chunkSize);
```
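A minimal sketch of the overflow, using the values from the local test in this PR (1024 bytes already stored, a chunk size of 32 * 1024, and a new value of `Integer.MAX_VALUE - 16` bytes). The class `ChunkIndexOverflowDemo` and its helper methods are hypothetical and only mirror the arithmetic of the expression above; they are not part of ORC, and the `long`-based variant is one possible guard, not necessarily the fix in this patch.

```java
public class ChunkIndexOverflowDemo {

    // Hypothetical helper: same int arithmetic as the expression quoted above.
    // The sum wraps around when length + valueLength exceeds Integer.MAX_VALUE.
    static int chunkIndexInt(int length, int valueLength, int chunkSize) {
        return (length + valueLength) / chunkSize;
    }

    // Hypothetical helper: widening to long before adding avoids the wraparound.
    static long chunkIndexLong(int length, int valueLength, int chunkSize) {
        return ((long) length + valueLength) / chunkSize;
    }

    // Hypothetical helper: Math.addExact throws ArithmeticException on int
    // overflow instead of silently wrapping.
    static boolean sumOverflows(int length, int valueLength) {
        try {
            Math.addExact(length, valueLength);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int chunkSize = 32 * 1024;                // chunk size used in the local test
        int length = 1024;                        // bytes already stored
        int valueLength = Integer.MAX_VALUE - 16; // size of the big value

        System.out.println(chunkIndexInt(length, valueLength, chunkSize));  // -65535
        System.out.println(chunkIndexLong(length, valueLength, chunkSize)); // 65536
        System.out.println(sumOverflows(length, valueLength));              // true
    }
}
```

The negative result from the `int` version matches the `-65535` reported in the log output of this PR.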
#### Log
```java
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.orc.impl.DynamicByteArray.add(DynamicByteArray.java:115)
    at org.apache.orc.impl.StringRedBlackTree.addNewKey(StringRedBlackTree.java:48)
    at org.apache.orc.impl.StringRedBlackTree.add(StringRedBlackTree.java:60)
    at org.apache.orc.impl.writer.StringTreeWriter.writeBatch(StringTreeWriter.java:69)
    at org.apache.orc.impl.writer.StructTreeWriter.writeRootBatch(StructTreeWriter.java:56)
    at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:696)
```
#### Local Test
org.apache.orc.impl.TestDynamicArray#testBigByteArray
```java
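@Test
public void testBigByteArray() {
  DynamicByteArray dba = new DynamicByteArray(128, 32 * 1024);
  // Store a small value first so that length is non-zero.
  byte[] val = new byte[1024];
  dba.add(val, 0, val.length);
  // length + bigVal.length exceeds Integer.MAX_VALUE and overflows.
  byte[] bigVal = new byte[Integer.MAX_VALUE - 16];
  dba.add(bigVal, 0, bigVal.length);
}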
```
### How was this patch tested?
Local test.
Output:
```bash
2023-02-15 20:25:16,938 [main] ERROR DynamicByteArray: chunkIndex overflow:-65535. You can adjust the relevant configuration: orc.column.encoding.direct,orc.dictionary.key.threshold.
```