xuchuanyin created CARBONDATA-1839:
--------------------------------------
Summary: Data load failed when using compressed sort temp file
Key: CARBONDATA-1839
URL: https://issues.apache.org/jira/browse/CARBONDATA-1839
Project: CarbonData
Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin
CarbonData provides an option to optimize the data load process by compressing
the intermediate sort temp files.
The option is `carbon.is.sort.temp.file.compression.enabled` and its default
value is `false`. In disk-constrained scenarios, users can turn on this feature
by setting the option to `true`; the file content is then compressed before it
is written to disk.
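As a rough illustration of the option's semantics (off unless explicitly set to `true`), here is a minimal sketch using plain `java.util.Properties`. The class `SortTempCompressionOption` and the helper `isEnabled` are hypothetical names made up for this example; they are not CarbonData's own configuration API, and only the property key and its default come from the description above.

```java
import java.util.Properties;

public class SortTempCompressionOption {
    // Property key taken from the issue description.
    static final String KEY = "carbon.is.sort.temp.file.compression.enabled";

    // Hypothetical helper: reads the flag and falls back to the documented default "false".
    static boolean isEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(KEY, "false"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Nothing set yet, so the default applies.
        System.out.println("default: " + isEnabled(props));

        // Turning the feature on, as a user would for a disk-constrained load.
        props.setProperty(KEY, "true");
        System.out.println("after enabling: " + isEnabled(props));
    }
}
```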
However, I have found bugs in the related code, and the data load fails after
this feature is turned on.
The error messages are shown below:
```
17/11/29 18:04:12 ERROR SortDataRows: SortDataRowPool:test1
java.lang.ClassCastException: [B cannot be cast to [Ljava.lang.Integer;
    at org.apache.carbondata.core.util.NonDictionaryUtil.getDimension(NonDictionaryUtil.java:93)
    at org.apache.carbondata.processing.sort.sortdata.UnCompressedTempSortFileWriter.writeDataOutputStream(UnCompressedTempSortFileWriter.java:52)
    at org.apache.carbondata.processing.sort.sortdata.CompressedTempSortFileWriter.writeSortTempFile(CompressedTempSortFileWriter.java:65)
    at org.apache.carbondata.processing.sort.sortdata.SortTempFileChunkWriter.writeSortTempFile(SortTempFileChunkWriter.java:72)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeSortTempFile(SortDataRows.java:245)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataTofile(SortDataRows.java:232)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.access$300(SortDataRows.java:45)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter.run(SortDataRows.java:426)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
```
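The `ClassCastException` above says that a `byte[]` (`[B`) was cast to `Integer[]`. The following is a minimal sketch of that failure mode only. It assumes a row stored as `Object[]` whose first slot one code path fills with `Integer[]` and another with `byte[]`; the class `SortRowCastSketch`, the method names, and the row layout are hypothetical and are not taken from the CarbonData source.

```java
public class SortRowCastSketch {
    // Hypothetical reader that assumes slot 0 of the row always holds Integer[].
    static Integer[] readDimensions(Object[] row) {
        return (Integer[]) row[0];
    }

    // Returns "ok" if the cast succeeds, otherwise the simple name of the exception.
    static String describeFailure(Object[] row) {
        try {
            readDimensions(row);
            return "ok";
        } catch (ClassCastException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Layout the reader expects.
        Object[] expectedLayout = { new Integer[] {1, 2, 3} };
        // Layout that triggers the failure: the same slot holds raw bytes.
        Object[] actualLayout = { new byte[] {1, 2, 3} };

        System.out.println(describeFailure(expectedLayout)); // prints "ok"
        System.out.println(describeFailure(actualLayout));   // prints "ClassCastException"
    }
}
```

The point of the sketch is that the compiler cannot catch this mismatch: both layouts are valid `Object[]` values, so a writer and a producer that disagree about the row format only fail at run time.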
```
17/11/29 18:04:13 ERROR SortDataRows: SafeParallelSorterPool:test1 exception occurred while trying to acquire a semaphore lock: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
17/11/29 18:04:13 ERROR ParallelReadMergeSorterImpl: SafeParallelSorterPool:test1
org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException:
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:173)
    at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:169)
    ... 4 more
```
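The `RejectedExecutionException` is reported against a pool in the `Terminated` state, which is what `ThreadPoolExecutor` does by default (`AbortPolicy`) when a task is submitted after shutdown. One plausible reading, which the log alone does not confirm, is that the pool was shut down after the first write task failed and a later batch was still submitted to it. The sketch below reproduces only that generic behavior; `TerminatedPoolSketch` and `submitAfterShutdown` are hypothetical names for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

public class TerminatedPoolSketch {
    // Returns true if a task submitted after shutdown is rejected.
    static boolean submitAfterShutdown() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.execute(() -> { });                    // first task completes normally
        pool.shutdown();                            // no new tasks are accepted from here on
        pool.awaitTermination(5, TimeUnit.SECONDS); // wait for the pool to terminate
        try {
            pool.execute(() -> { });                // late submission to the terminated pool
            return false;
        } catch (RejectedExecutionException e) {
            // The default AbortPolicy throws instead of silently dropping the task.
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected after shutdown: " + submitAfterShutdown());
    }
}
```

If that reading is right, the rejection is a secondary symptom: the earlier `ClassCastException` is the failure to fix, and the rejected task follows from the pool no longer being usable.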
```
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException:
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:173)
    at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:227)
    ... 3 more
Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:169)
    ... 4 more
```