Re: IndexOutOfBoundsException on selecting column from CSV

MattK Thu, 11 Aug 2016 18:06:48 -0700

The problem was trailing whitespace in the column names: https://issues.apache.org/jira/browse/DRILL-4843
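
For anyone hitting the same error, a quick way to check a file for this before querying it is to read the header line and look for padded names. This is only a rough sketch in Python, not part of Drill: the function name `find_padded_headers` is made up for illustration, and it assumes a comma-delimited file whose first line is the header, as in the setup quoted below.

```python
import csv
import gzip


def find_padded_headers(path, delimiter=","):
    """Return header names that carry leading or trailing whitespace.

    Hypothetical helper for illustration; assumes the first line of the
    file is the header and that ".gz" files are gzip-compressed text.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle, delimiter=delimiter), [])
    # A name is "padded" if stripping whitespace changes it.
    return [name for name in header if name != name.strip()]
```

Calling `find_padded_headers("data/test.csv.gz")` would return a list of the offending names exactly as they appear in the file, or an empty list if the header is clean.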


On 11 Aug 2016, at 20:06, MattK wrote:

On a MapR Community cluster with Drill v1.6, using simple comma-delimited data with a header line, gzip compressed, and the storage format configured as:
~~~
    "csv": {
      "type": "text",
      "extensions": [
        "csv",
        "gz"
      ],
      "extractHeader": true,
      "delimiter": ","
    },
~~~
Running a simple SELECT * gives me the data as expected with a column name header; however, attempting to reference any of those column names results in:
~~~
0: jdbc:drill:> select date_dt from `data/test.csv.gz` limit 10;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 32384, length: 4 (expected: range(0, 16384))
Fragment 0:0
[Error Id: 5ff884c0-5f9d-448c-8c77-b4bb4cd16541 on nfd002.sj2.hwcdn.net:31010] (state=,code=0)
~~~
As a test I tried this, with a very odd result, as these columns have values in them:
~~~
0: jdbc:drill:> with a as (select * from `data/test.csv.gz` limit 10) select date_dt from a;
+----------+
| date_dt  |
+----------+
| null     |
| null     |
| null     |
| null     |
| null     |
| null     |
| null     |
| null     |
| null     |
| null     |
+----------+
10 rows selected (0.332 seconds)
~~~

Verbose error:

~~~
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 32384, length: 4 (expected: range(0, 16384))
Fragment 0:0
[Error Id: 1e56232e-3229-44cc-a3e4-18c234a78a64 on nfd004.sj2.hwcdn.net:31010]
(java.lang.IndexOutOfBoundsException) index: 32384, length: 4 (expected: range(0, 16384))
    io.netty.buffer.DrillBuf.checkIndexD():123
    io.netty.buffer.DrillBuf.chk():147
    io.netty.buffer.DrillBuf.getInt():520
    org.apache.drill.exec.vector.UInt4Vector$Accessor.get():353
    org.apache.drill.exec.vector.VarCharVector$Mutator.setValueCount():640
    org.apache.drill.exec.physical.impl.ScanBatch.next():247
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.physical.impl.BaseRootExec.next():104
    org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
    org.apache.drill.exec.physical.impl.BaseRootExec.next():94
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1595
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745 (state=,code=0)
~~~
