[ 
https://issues.apache.org/jira/browse/CARBONDATA-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506741#comment-15506741
 ] 

ASF GitHub Bot commented on CARBONDATA-260:
-------------------------------------------

GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/180

    [CARBONDATA-260] Equal or lesser value of MAXCOLUMNS option than column 
count in CSV header results into array index of bound exception

    Problem: A MAXCOLUMNS option value equal to or less than the column count
    in the CSV header results in an array index out of bounds exception.
    
    Analysis: If the column count in the CSV header is greater than or equal to
    the MAXCOLUMNS option value, the Univocity CSV parser throws an array index
    out of bounds exception. While parsing a row, the parser stores each parsed
    value in an array and increments the column index; it then performs one more
    operation using the incremented index, which is out of bounds. The relevant
    CSV parser code snippet is shown below.
    
    public void valueParsed() {
        this.parsedValues[column++] = appender.getAndReset();
        this.appender = appenders[column];
    }
    
    e.g. In the above code, if the arrays have length 7, the valid indices are
    0-6. When column is 6, the first line stores the value at index 6 and
    increments column to 7, so the second line reads appenders[7] and throws
    ArrayIndexOutOfBoundsException.
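
The off-by-one described above can be reproduced with a small stand-alone sketch. The class and method names here (ValueParsedSketch, Output, overflows) are hypothetical and only mimic the shape of the quoted valueParsed() method; this is not the Univocity source.

```java
public class ValueParsedSketch {
    // Hypothetical stand-in for the parser's output buffer (not the Univocity class).
    static final class Output {
        final String[] parsedValues;
        final StringBuilder[] appenders;
        StringBuilder appender;
        int column = 0;

        Output(int maxColumns) {
            parsedValues = new String[maxColumns];
            appenders = new StringBuilder[maxColumns];
            for (int i = 0; i < maxColumns; i++) {
                appenders[i] = new StringBuilder();
            }
            appender = appenders[0];
        }

        // Same two-step pattern as the quoted snippet: store, increment, then
        // look up the appender for the next column.
        void valueParsed(String value) {
            appender.append(value);
            parsedValues[column++] = appender.toString();
            appender.setLength(0);
            appender = appenders[column]; // out of bounds when column == maxColumns
        }
    }

    // Returns true if parsing columnCount values overflows a buffer of size maxColumns.
    static boolean overflows(int maxColumns, int columnCount) {
        Output out = new Output(maxColumns);
        try {
            for (int i = 0; i < columnCount; i++) {
                out.valueParsed("v" + i);
            }
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows(7, 7)); // prints true
        System.out.println(overflows(8, 7)); // prints false
    }
}
```

With 7 columns and a buffer of size 7, the failure happens on the last value, after it has already been stored; one extra slot is enough to avoid it.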
    
    Fix: Whenever the column count in the CSV header is equal to or greater
    than the MAXCOLUMNS option value (or the default value), increment it by 1.
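
One possible reading of this fix, as a minimal sketch: it assumes that "it" refers to the header column count and that the incremented count is then used as the parser's column limit. The class and method names (MaxColumnsSketch, effectiveMaxColumns) are hypothetical and do not come from the CarbonData code base.

```java
public class MaxColumnsSketch {
    // Hypothetical helper: picks the column limit to hand to the CSV parser.
    // configuredMaxColumns is the MAXCOLUMNS option value (or the default).
    static int effectiveMaxColumns(int configuredMaxColumns, int headerColumnCount) {
        if (headerColumnCount >= configuredMaxColumns) {
            // Reserve one extra slot so the parser's post-increment lookup stays in bounds.
            return headerColumnCount + 1;
        }
        return configuredMaxColumns;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMaxColumns(7, 7));  // prints 8
        System.out.println(effectiveMaxColumns(5, 7));  // prints 8
        System.out.println(effectiveMaxColumns(10, 7)); // prints 10
    }
}
```

A configured value that is already larger than the header column count is left unchanged, so the adjustment only applies in the failing case.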
    
    Impact: Data load flow


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/incubator-carbondata 
maxcolumns_array_indexOfBound

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #180
    
----
commit 3f32424e55615c8e45470d5169b817f9f703dc3e
Author: manishgupta88 <tomanishgupt...@gmail.com>
Date:   2016-09-20T14:21:33Z

    Problem: A MAXCOLUMNS option value equal to or less than the column count
    in the CSV header results in an array index out of bounds exception.
    
    Analysis: If the column count in the CSV header is greater than or equal to
    the MAXCOLUMNS option value, the Univocity CSV parser throws an array index
    out of bounds exception. While parsing a row, the parser stores each parsed
    value in an array and increments the column index; it then performs one more
    operation using the incremented index, which is out of bounds. The relevant
    CSV parser code snippet is shown below.
    
    public void valueParsed() {
        this.parsedValues[column++] = appender.getAndReset();
        this.appender = appenders[column];
    }
    
    Fix: Whenever the column count in the CSV header is equal to or greater
    than the MAXCOLUMNS option value (or the default value), increment it by 1.
    
    Impact: Data load flow

----


> Equal or lesser value of MAXCOLUMNS option than column count in CSV header 
> results into array index of bound exception
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-260
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-260
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Manish Gupta
>            Assignee: Manish Gupta
>
> If the column count in the CSV header is greater than or equal to the
> MAXCOLUMNS option value, the Univocity CSV parser throws an array index out
> of bounds exception. While parsing a row, the parser stores each parsed value
> in an array and increments the column index; it then performs one more
> operation using the incremented index, which is out of bounds.
> java.lang.OutOfMemoryError: Java heap space
> at com.univocity.parsers.common.ParserOutput.<init>(ParserOutput.java:86)
> at com.univocity.parsers.common.AbstractParser.<init>(AbstractParser.java:66)
> at com.univocity.parsers.csv.CsvParser.<init>(CsvParser.java:50)
> at org.apache.carbondata.processing.csvreaderstep.UnivocityCsvParser.initialize(UnivocityCsvParser.java:114)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput.doProcessUnivocity(CsvInput.java:427)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput.access$100(CsvInput.java:60)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput$1.call(CsvInput.java:389)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
