GitHub user manishgupta88 opened a pull request:
https://github.com/apache/incubator-carbondata/pull/180
[CARBONDATA-260] MAXCOLUMNS option value equal to or less than the column
count in the CSV header results in an array index out of bounds exception
Problem: A MAXCOLUMNS option value equal to or less than the column count in
the CSV header results in an array index out of bounds exception.
Analysis: If the column count in the CSV header is greater than or equal to the
MAXCOLUMNS option value, the Univocity CSV parser throws an array index out of
bounds exception. While parsing a row, the parser stores each parsed value in an
array and increments the index; it then performs one more operation using the
incremented index, which goes out of bounds. The relevant CSV parser code is
shown below.
    public void valueParsed() {
        this.parsedValues[column++] = appender.getAndReset();
        this.appender = appenders[column];
    }
e.g. In the above code, if the column count is 7, the array indexes run from
0 to 6. When column is 6, the first line stores the value and increments column
to 7, so the second line throws an ArrayIndexOutOfBoundsException.
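The off-by-one described above can be reproduced with a minimal, self-contained sketch. It is not the Univocity source: the class and method names are hypothetical, a StringBuilder stands in for the parser's appender, and it assumes both arrays are sized to the MAXCOLUMNS value.

```java
// Hypothetical stand-in for the parser state; not the actual Univocity code.
class ValueParsedDemo {

    // Returns true if parsing a row with valuesInRow values throws
    // ArrayIndexOutOfBoundsException when both arrays have arraySize elements
    // (assumption: the real parser sizes them from MAXCOLUMNS).
    static boolean throwsOnRow(int arraySize, int valuesInRow) {
        try {
            String[] parsedValues = new String[arraySize];
            StringBuilder[] appenders = new StringBuilder[arraySize];
            for (int i = 0; i < arraySize; i++) {
                appenders[i] = new StringBuilder();
            }
            int column = 0;
            StringBuilder appender = appenders[column];
            for (int i = 0; i < valuesInRow; i++) {
                appender.append("v").append(i);
                // Same two steps as valueParsed(): store, increment, then
                // read appenders at the already incremented index.
                parsedValues[column++] = appender.toString();
                appender.setLength(0);
                appender = appenders[column];
            }
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("size 7, 7 values: " + throwsOnRow(7, 7));
        System.out.println("size 8, 7 values: " + throwsOnRow(8, 7));
    }
}
```

In this sketch the failure comes from the read of appenders[column] after the last value has been stored, which is why one extra array slot is enough to avoid it.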
Fix: Whenever the column count in the CSV header is equal to or greater than
the MAXCOLUMNS option value (or its default value), increment it by 1.
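One possible reading of this fix, sketched under assumptions: the value handed to the parser becomes the header column count plus 1 whenever the header has at least as many columns as MAXCOLUMNS. The class and method names below are hypothetical and are not taken from the patch.

```java
// Hypothetical helper illustrating one reading of the fix; not the patch code.
class MaxColumnsFix {

    // headerColumnCount: number of columns found in the CSV header.
    // maxColumns: the MAXCOLUMNS option value, or its default if unset.
    static int effectiveMaxColumns(int headerColumnCount, int maxColumns) {
        if (headerColumnCount >= maxColumns) {
            // Leave one spare slot so the read after column++ stays in range.
            return headerColumnCount + 1;
        }
        return maxColumns;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMaxColumns(7, 7));
        System.out.println(effectiveMaxColumns(7, 2000));
    }
}
```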
Impact: Data load flow
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/manishgupta88/incubator-carbondata maxcolumns_array_indexOfBound
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-carbondata/pull/180.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #180
commit 3f32424e55615c8e45470d5169b817f9f703dc3e
Author: manishgupta88
Date: 2016-09-20T14:21:33Z
Problem: A MAXCOLUMNS option value equal to or less than the column count in
the CSV header results in an array index out of bounds exception.
Analysis: If the column count in the CSV header is greater than or equal to the
MAXCOLUMNS option value, the Univocity CSV parser throws an array index out of
bounds exception. While parsing a row, the parser stores each parsed value in an
array and increments the index; it then performs one more operation using the
incremented index, which goes out of bounds. The relevant CSV parser code is
shown below.
    public void valueParsed() {
        this.parsedValues[column++] = appender.getAndReset();
        this.appender = appenders[column];
    }
Fix: Whenever the column count in the CSV header is equal to or greater than
the MAXCOLUMNS option value (or its default value), increment it by 1.
Impact: Data load flow