Abhishek Girish created DRILL-1616:
--------------------------------------

             Summary: Drill throws "Schema is currently null" error on count(field) when field is a JSON object/array
                 Key: DRILL-1616
                 URL: https://issues.apache.org/jira/browse/DRILL-1616
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - JSON
            Reporter: Abhishek Girish
            Assignee: Jason Altekruse


count(field) throws an error when the field is an object or an array, and the 
error message is not clean: it does not indicate an error in usage. Also, count 
on objects/arrays should be supported. 
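
As a rough illustration of the expected behaviour, here is a minimal Python sketch of count(field) under standard SQL semantics, where COUNT(expr) counts the rows in which expr is not NULL, whatever its type. The rows are transcribed from the select * output in this report (field_5 is left out because it is truncated there), so the real abc.json may differ. The helper name count_field is hypothetical, and this is not a description of how Drill implements or should implement the aggregate.

```python
# Rows transcribed from the sqlline output in this report. field_5 is omitted
# because its values are truncated in that output. Assumption: a displayed
# "null" is a JSON null (it could also be a missing key in the real file).
ROWS = [
    {"field_1": ["1"], "field_2": None,
     "field_3": {"inner_3": []},
     "field_4": {"inner_1": [], "inner_3": {}}},
    {"field_1": ["5"], "field_2": 2,
     "field_3": {"inner_1": "2", "inner_3": []},
     "field_4": {"inner_1": ["1", "2", "3"], "inner_2": "3",
                 "inner_3": {"inner_object_field_1": "2"}}},
    {"field_1": ["5", "10", "15"], "field_2": "A wild string appears!",
     "field_3": {"inner_1": "5", "inner_2": "3",
                 "inner_3": [{}, {"inner_object_field_1": "10"}]},
     "field_4": {"inner_1": ["4", "5", "6"], "inner_2": "3", "inner_3": {}}},
]


def count_field(rows, name):
    """Count rows whose value for `name` is not null (standard SQL COUNT(expr)).

    Objects and arrays are treated like any other non-null value.
    """
    return sum(1 for row in rows if row.get(name) is not None)


if __name__ == "__main__":
    for field in ("field_1", "field_2", "field_3", "field_4"):
        print(field, count_field(ROWS, field))
```

Under these assumptions, the array and object fields are simply counted as non-null values rather than causing a failure.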

> select * from `abc.json`;
+------------+------------+------------+------------+------------+
|  field_1   |  field_2   |  field_3   |  field_4   |  field_5   |
+------------+------------+------------+------------+------------+
| ["1"]      | null       | {"inner_3":[]} | {"inner_1":[],"inner_3":{}} | []         |
| ["5"]      | 2          | {"inner_1":"2","inner_3":[]} | {"inner_1":["1","2","3"],"inner_2":"3","inner_3":{"inner_object_field_1":"2"}} | [{"inner_list":["1","null","6"],"inner_ |
| ["5","10","15"] | A wild string appears! | {"inner_1":"5","inner_2":"3","inner_3":[{},{"inner_object_field_1":"10"}]} | {"inner_1":["4","5","6"],"inner_2":"3","inner_3":{}} | [{ |
+------------+------------+------------+------------+------------+
3 rows selected (0.081 seconds)

> select count(field_1) from `abc.json`;
Query failed: Failure while running fragment., Schema is currently null.  You must call buildSchema(SelectionVectorMode) before this container can return a schema. [ b6f021f9-213e-475e-83f4-a6facf6fd76d on abhi7.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query. (state=,code=0)

The error is seen on field_1, field_3, field_4 and field_5. 

The issue is not seen when an array index is specified:
> select count(field_1[0]) from `abc.json`;
+------------+
|   EXPR$0   |
+------------+
| 3          |
+------------+
1 row selected (0.152 seconds)

The issue is also not seen when an element within the object is specified:
> select count(t.field_3.inner_3) from `textmode.json` as t;
+------------+
|   EXPR$0   |
+------------+
| 3          |
+------------+
1 row selected (0.155 seconds)

LOG:
2014-10-30 13:28:20,286 [a90cc246-e60b-452b-ba96-7f79709f5ffa:frag:0:0] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error bc438332-0828-4a86-8063-9dc8c5a703d9: Failure while running fragment.
java.lang.NullPointerException: Schema is currently null.  You must call buildSchema(SelectionVectorMode) before this container can return a schema.
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:208) ~[guava-14.0.1.jar:na]
        at org.apache.drill.exec.record.VectorContainer.getSchema(VectorContainer.java:273) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractRecordBatch.getSchema(AbstractRecordBatch.java:116) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.getSchema(IteratorValidatorBatchIterator.java:75) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema(ScreenCreator.java:100) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:103) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:249) [drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
