[
https://issues.apache.org/jira/browse/DRILL-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001908#comment-16001908
]
Paul Rogers commented on DRILL-5489:
------------------------------------
Similarly, the {{RepeatedVarCharOutput}} class does not protect itself from a
file with more than 64K fields on input:
{code}
@Override
public void startField(int index) {
  fieldIndex = index;
  collect = collectedFields[index];
  fieldOpen = true;
}
{code}
Here, the parser counts the fields and calls {{startField}} for each. If the
field index reaches 65536 or beyond, the above method still indexes into
{{collectedFields}} to determine whether the field is wanted. But that array is
hard-coded at 65536 entries, so the access fails with an
{{ArrayIndexOutOfBoundsException}}.
Instead, the code should simply discard extra fields, or throw a
{{UserException}} to report too many fields.
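A minimal sketch of both options, under stated assumptions: {{BoundedFieldOutput}}, {{MAX_FIELDS}} and {{isCollecting}} are hypothetical names used only for illustration, not Drill code, and {{IllegalArgumentException}} stands in for the {{UserException}} that the real class would presumably raise. The constructor rejects an out-of-range requested column, while {{startField}} silently discards fields past the end of the array.

```java
import java.util.List;

// Hypothetical stand-in for RepeatedVarCharOutput; only the bounds handling is modeled.
class BoundedFieldOutput {
  // Mirrors the hard-coded 64K (65536) array size described above.
  static final int MAX_FIELDS = 65536;

  private final boolean[] collectedFields = new boolean[MAX_FIELDS];
  private int fieldIndex = -1;
  private boolean collect;
  private boolean fieldOpen;

  BoundedFieldOutput(List<Integer> columnIds) {
    for (int i : columnIds) {
      // Reject requested indexes that are negative or past the end of the array.
      if (i < 0 || i >= MAX_FIELDS) {
        // Assumption: real code would report this through a UserException instead.
        throw new IllegalArgumentException(
            "columns[" + i + "] is outside the supported range 0.." + (MAX_FIELDS - 1));
      }
      collectedFields[i] = true;
    }
  }

  void startField(int index) {
    fieldIndex = index;
    // A field beyond the array can never have been requested, so discard it
    // rather than indexing out of bounds.
    collect = index >= 0 && index < MAX_FIELDS && collectedFields[index];
    fieldOpen = true;
  }

  boolean isCollecting() {
    return collect;
  }
}
```

With this shape, a query for {{columns[70000]}} would fail up front with a descriptive message, and a file with too many fields would be read with the extra fields dropped. Whether to drop or to fail on extra fields is a design choice the sketch does not settle.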
> Unprotected array access in RepeatedVarCharOutput ctor
> ------------------------------------------------------
>
> Key: DRILL-5489
> URL: https://issues.apache.org/jira/browse/DRILL-5489
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
> Priority: Minor
>
> Suppose a user runs a query of form:
> {code}
> SELECT columns[70000] FROM `dfs`.`mycsv.csv`
> {code}
> Internally, this will create a {{PathSegment}} to represent the selected
> column. This is passed into the {{RepeatedVarCharOutput}} constructor where
> it is used to set a flag in an array of 64K booleans. But, while the code is
> very diligent about making sure that the column name is "columns" and that the
> path segment is an array, it does not check the array index. Instead:
> {code}
> for (Integer i : columnIds) {
>   ...
>   fields[i] = true;
> }
> {code}
> We need to add a bounds check to reject array indexes that are not valid:
> negative, or 65536 (64K) and above. It may be that code further up the
> hierarchy already does this check. But, if so, it should do the other checks
> as well. Leaving the checks incomplete is confusing.
> The result:
> {code}
> Exception (no rows returned):
> org.apache.drill.common.exceptions.UserRemoteException:
> SYSTEM ERROR: ArrayIndexOutOfBoundsException: 70000
> {code}