Hi:
I encounter 'java.lang.NegativeArraySizeException' error with carbondata
1.3.1 + spark 2.2.
When I run the compact command to compact 8 level-1 segments to a level-2
segment, the 'java.lang.NegativeArraySizeException' error occurred:
*java.lang.NegativeArraySizeException
at
org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeVariableLengthDimesionDataChunkStore.getRow(UnsafeVariableLengthDimesionDataChunkStore.java:172)
at
org.apache.carbondata.core.datastore.chunk.impl.AbstractDimensionDataChunk.getChunkData(AbstractDimensionDataChunk.java:46)
at
org.apache.carbondata.core.scan.result.AbstractScannedResult.getNoDictionaryKeyArray(AbstractScannedResult.java:431)
at
org.apache.carbondata.core.scan.result.impl.NonFilterQueryScannedResult.getNoDictionaryKeyArray(NonFilterQueryScannedResult.java:67)
at
org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.scanResultAndGetData(RawBasedResultCollector.java:83)
at
org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.collectData(RawBasedResultCollector.java:58)
at
org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:51)
at
org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32)
at
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:49)
at
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
at
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31)
at
org.apache.carbondata.core.scan.result.iterator.RawResultIterator.hasNext(RawResultIterator.java:72)
at
org.apache.carbondata.processing.merger.RowResultMergerProcessor.execute(RowResultMergerProcessor.java:131)
at
org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:228)
at
org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:84)
at
org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)*
I traced the code of 'UnsafeVariableLengthDimesionDataChunkStore.getRow',
found that the root cause is the value of length is negative when create
byte array: 'byte[] data = new byte[length];', the value of some parameters
are below when error ocurred:
when 'rowId < numberOfRows - 1':
*this.dataLength=192000
currentDataOffset=2
rowId=0
OffsetOfNextdata=-12173 (why)
length=-12177*
otherwise :
*this.dataLength=320000
currentDataOffset=263702
rowId=31999
length=-9238*
the value of (320000 - 263702) is exceed the range of short.
I patch the PR#2796(https://github.com/apache/carbondata/pull/2796), but
error still occurred.
finally, my test steps are:
for example: there are 4 level-1 compacted segments: 1.1, 2.1, 3.1, 4.1:
*1. run compact command, it failed;
2. delete 1.1 segment, run compact command again, it failed;
3. delete 2.1 segment, run compact command again, it failed;
3. delete 3.1 segment, run compact command again, it succeeded;*
So I think that one of 8 level-1 compacted segments maybe have some problem
but I don't how to find out.
--
Sent from:
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/