The problem was trailing whitespace in the column names:
https://issues.apache.org/jira/browse/DRILL-4843
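For anyone hitting the same thing, here is a rough way to check a file's
header before querying it. This is only a sketch in plain Python, not
anything from Drill itself; the function name `padded_header_names` is
made up for illustration, and it assumes the first line of the file is
the header and that the file is UTF-8 compatible text.

```python
import csv
import gzip


def padded_header_names(path, delimiter=","):
    """Return header names that carry leading or trailing whitespace.

    Hypothetical helper: reads only the first line of a plain or
    gzip-compressed delimited file and treats it as the header.
    """
    # Pick the opener from the file extension; both accept text mode.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        # An empty file yields an empty header instead of raising.
        header = next(csv.reader(handle, delimiter=delimiter), [])
    # A name is "padded" if stripping whitespace changes it.
    return [name for name in header if name != name.strip()]
```

Calling `padded_header_names("data/test.csv.gz")` returns the offending
names with their whitespace intact, so an empty list means the header is
clean in that respect.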
On 11 Aug 2016, at 20:06, MattK wrote:
On a MapR Community cluster with Drill v1.6, I am using simple
comma-delimited data with a header line, gzip compressed, with the
storage format configured as:
~~~
"csv": {
"type": "text",
"extensions": [
"csv",
"gz"
],
"extractHeader": true,
"delimiter": ","
},
~~~
Running a simple SELECT * gives me the data as expected with a column
name header. However, attempting to reference any of those column names
results in:
~~~
0: jdbc:drill:> select date_dt from `data/test.csv.gz` limit 10;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 32384, length:
4 (expected: range(0, 16384))
Fragment 0:0
[Error Id: 5ff884c0-5f9d-448c-8c77-b4bb4cd16541 on
nfd002.sj2.hwcdn.net:31010] (state=,code=0)
~~~
As a test I tried the following, with a very odd result: the query
returns nulls even though these columns have values in them:
~~~
0: jdbc:drill:> with a as (select * from `data/test.csv.gz` limit 10)
select date_dt from a;
+----------+
| date_dt |
+----------+
| null |
| null |
| null |
| null |
| null |
| null |
| null |
| null |
| null |
| null |
+----------+
10 rows selected (0.332 seconds)
~~~
Verbose error:
~~~
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 32384, length:
4 (expected: range(0, 16384))
Fragment 0:0
[Error Id: 1e56232e-3229-44cc-a3e4-18c234a78a64 on
nfd004.sj2.hwcdn.net:31010]
(java.lang.IndexOutOfBoundsException) index: 32384, length: 4
(expected: range(0, 16384))
io.netty.buffer.DrillBuf.checkIndexD():123
io.netty.buffer.DrillBuf.chk():147
io.netty.buffer.DrillBuf.getInt():520
org.apache.drill.exec.vector.UInt4Vector$Accessor.get():353
org.apache.drill.exec.vector.VarCharVector$Mutator.setValueCount():640
org.apache.drill.exec.physical.impl.ScanBatch.next():247
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.physical.impl.BaseRootExec.next():104
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
org.apache.drill.exec.physical.impl.BaseRootExec.next():94
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1595
org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745 (state=,code=0)
~~~