[ https://issues.apache.org/jira/browse/CARBONDATA-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15498271#comment-15498271 ]
ASF GitHub Bot commented on CARBONDATA-247:
-------------------------------------------
GitHub user dhatchayani opened a pull request:
https://github.com/apache/incubator-carbondata/pull/162
[CARBONDATA-247] Higher MAXCOLUMNS value in load DML options is leading to
out of memory error
Problem: A high MAXCOLUMNS value in the load DML options leads to an out of
memory error.
Analysis: When a very high value (for example, the Integer max value) is
configured for the maxcolumns option in the load DML and executor memory is
low, UnivocityCsvParser throws an out of memory error when it tries to create
an array whose size is the maxColumns option value.
Fix: Define a threshold for the maxColumns option that the system can support.
If the configured maxColumns value is greater than the threshold, use the
threshold value instead.
Impact: Data loading
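The fix described above amounts to clamping the user-supplied option before it
reaches the parser. The following is only a minimal sketch of that idea, not
the code from this pull request: the class name, the method name, and both
constants (a default of 2000 and a threshold of 20000) are hypothetical values
chosen for illustration.

```java
public class MaxColumnsClamp {
    // Hypothetical values, chosen only for illustration.
    static final int DEFAULT_MAX_COLUMNS = 2000;
    static final int THRESHOLD_MAX_COLUMNS = 20000;

    /**
     * Returns the maxColumns value that would be handed to the CSV parser.
     * Missing, non-numeric, or non-positive input falls back to the default;
     * anything above the threshold is reduced to the threshold.
     */
    static int effectiveMaxColumns(String optionValue) {
        if (optionValue == null || optionValue.trim().isEmpty()) {
            return DEFAULT_MAX_COLUMNS;
        }
        int requested;
        try {
            requested = Integer.parseInt(optionValue.trim());
        } catch (NumberFormatException e) {
            // Also reached for values that do not fit in an int.
            return DEFAULT_MAX_COLUMNS;
        }
        if (requested <= 0) {
            return DEFAULT_MAX_COLUMNS;
        }
        return Math.min(requested, THRESHOLD_MAX_COLUMNS);
    }

    public static void main(String[] args) {
        // The Integer max value is clamped to the threshold.
        System.out.println(effectiveMaxColumns(String.valueOf(Integer.MAX_VALUE)));
        // A value below the threshold is passed through unchanged.
        System.out.println(effectiveMaxColumns("500"));
    }
}
```

With a clamp like this, the parser never sees a column count larger than the
threshold, so the size of the array it allocates stays bounded regardless of
what the load DML specifies.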
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dhatchayani/incubator-carbondata maxColumns_issue
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-carbondata/pull/162.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #162
----
commit 25ece8bf8cf214324cc2abf2e27bbbb43a16d4a7
Author: manishgupta88 <[email protected]>
Date: 2016-09-17T05:22:27Z
Problem: A high MAXCOLUMNS value in the load DML options leads to an out of
memory error.
Analysis: When a very high value (for example, the Integer max value) is
configured for the maxcolumns option in the load DML and executor memory is
low, UnivocityCsvParser throws an out of memory error when it tries to create
an array whose size is the maxColumns option value.
Fix: Define a threshold for the maxColumns option that the system can support.
If the configured maxColumns value is greater than the threshold, use the
threshold value instead.
Impact: Data loading
----
> Higher MAXCOLUMNS value in load DML options is leading to out of memory error
> -----------------------------------------------------------------------------
>
> Key: CARBONDATA-247
> URL: https://issues.apache.org/jira/browse/CARBONDATA-247
> Project: CarbonData
> Issue Type: Bug
> Reporter: dhatchayani
> Priority: Minor
>
> When a very high value (for example, the Integer max value) is configured
> for the maxcolumns option in the load DML and executor memory is low,
> UnivocityCsvParser throws an out of memory error when it tries to create an
> array whose size is the maxColumns option value.
> java.lang.OutOfMemoryError: Java heap space
> at com.univocity.parsers.common.ParserOutput.<init>(ParserOutput.java:86)
> at com.univocity.parsers.common.AbstractParser.<init>(AbstractParser.java:66)
> at com.univocity.parsers.csv.CsvParser.<init>(CsvParser.java:50)
> at org.apache.carbondata.processing.csvreaderstep.UnivocityCsvParser.initialize(UnivocityCsvParser.java:114)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput.doProcessUnivocity(CsvInput.java:427)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput.access$100(CsvInput.java:60)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput$1.call(CsvInput.java:389)
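
To see why an array sized by the Integer max value exhausts the heap, a rough
estimate helps. The sketch below assumes the parser allocates an array of
object references and that each reference takes 4 bytes (compressed
references) or 8 bytes; both figures are assumptions about the JVM
configuration, array header overhead is ignored, and the class and method
names are hypothetical.

```java
public class MaxColumnsMemoryEstimate {
    /**
     * Approximate size in bytes of the reference slots in an object array.
     * Uses long arithmetic so that length * bytesPerReference cannot overflow.
     */
    static long referenceArrayBytes(int length, int bytesPerReference) {
        return (long) length * bytesPerReference;
    }

    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        int maxColumns = Integer.MAX_VALUE;
        // Assumed reference sizes: 4 bytes (compressed) and 8 bytes (uncompressed).
        for (int refSize : new int[] {4, 8}) {
            long bytes = referenceArrayBytes(maxColumns, refSize);
            System.out.println(String.format(
                "maxColumns=%d, %d bytes/ref: about %.2f GiB",
                maxColumns, refSize, toGiB(bytes)));
        }
    }
}
```

Under these assumptions the array alone needs just under 8 GiB or 16 GiB, per
parser instance, before any row data is read, which is more than a
modestly sized executor heap can provide.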
--
This message was sent by Atlassian JIRA (v6.3.4#6332)