Aman Sinha created DRILL-4679:
---------------------------------

             Summary: CONVERT_FROM()  json format fails if 0 rows are received 
from upstream operator
                 Key: DRILL-4679
                 URL: https://issues.apache.org/jira/browse/DRILL-4679
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.6.0
            Reporter: Aman Sinha


CONVERT_FROM() with the json format fails as shown below if the underlying 
Filter produces 0 rows: 
{noformat}
0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'json') as x from 
cp.`tpch/region.parquet` where r_regionkey = 9999;
Error: SYSTEM ERROR: IllegalStateException: next() returned NONE without first 
returning OK_NEW_SCHEMA [#16, ProjectRecordBatch]

Fragment 0:0
{noformat}

If the conversion uses the UTF8 format instead, the same query succeeds: 
{noformat}
0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'utf8') as x from 
cp.`tpch/region.parquet` where r_regionkey = 9999;
+----+
| x  |
+----+
+----+
No rows selected (0.241 seconds)
{noformat}

The cause is the special handling of JSON in ProjectRecordBatch. The output 
schema for this conversion is not known until run time: the ComplexWriter in 
the Project relies on seeing the input data to determine the output schema, 
which could be a MapVector, a ListVector, etc. With 0 input rows it never sees 
any data, so no schema is emitted before NONE is returned.
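
To make the failure mode concrete, here is a minimal toy model in Python. It is 
not Drill code: the names Outcome, JsonProject and drain are hypothetical 
stand-ins, and RuntimeError stands in for the IllegalStateException. It only 
assumes what is described above, namely that the output kind is learned from 
the first row and that downstream expects OK_NEW_SCHEMA before NONE.

```python
import json
from enum import Enum


class Outcome(Enum):
    """Toy version of the outcomes named in the error message."""
    OK_NEW_SCHEMA = "OK_NEW_SCHEMA"
    OK = "OK"
    NONE = "NONE"


def infer_kind(value):
    """Map a parsed JSON value to a coarse vector kind (illustrative only)."""
    if isinstance(value, dict):
        return "MAP"
    if isinstance(value, list):
        return "LIST"
    return "SCALAR"


class JsonProject:
    """Hypothetical projection whose schema is learned from the input rows."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self.schema = None

    def next(self):
        row = next(self._rows, None)
        if row is None:
            # No (more) input: with 0 rows, no schema was ever established.
            return Outcome.NONE
        kind = infer_kind(json.loads(row))
        if kind != self.schema:
            self.schema = kind
            return Outcome.OK_NEW_SCHEMA
        return Outcome.OK


def drain(op):
    """Downstream consumer that enforces the schema-first contract."""
    seen_schema = False
    count = 0
    while True:
        outcome = op.next()
        if outcome is Outcome.NONE:
            if not seen_schema:
                raise RuntimeError(
                    "next() returned NONE without first returning OK_NEW_SCHEMA"
                )
            return count
        if outcome is Outcome.OK_NEW_SCHEMA:
            seen_schema = True
        count += 1
```

In this model, drain(JsonProject(['{"abc":"xyz"}'])) returns normally, while 
drain(JsonProject([])) raises, which mirrors the 0-row case in the query above.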

If the input data has 0 rows due to a filter condition, we should at least 
produce a default output schema, e.g. an empty MapVector? We need to decide on 
a good default. Note that CONVERT_FROM(x, 'json') could occur on 2 branches of 
a UNION-ALL, and if one input is empty while the other is not, the default may 
still cause a schema incompatibility.
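
The trade-off can be sketched with a small Python toy (again, not Drill code). 
DEFAULT_KIND, output_kind and union_all_compatible are hypothetical names, and 
choosing "MAP" as the default is only the assumption floated above, not a 
decision.

```python
import json

# Assumed default for empty input; the issue leaves the actual choice open.
DEFAULT_KIND = "MAP"


def output_kind(json_rows):
    """Return the kind inferred from the first row, or the default if empty."""
    for row in json_rows:
        value = json.loads(row)
        if isinstance(value, dict):
            return "MAP"
        if isinstance(value, list):
            return "LIST"
        return "SCALAR"
    return DEFAULT_KIND


def union_all_compatible(left_rows, right_rows):
    """Both UNION-ALL branches must agree on the output kind."""
    return output_kind(left_rows) == output_kind(right_rows)
```

Under these assumptions, an empty branch unions cleanly with a branch that 
yields maps, but not with one that yields lists, so a fixed default avoids the 
exception without fully resolving the UNION-ALL case.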



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
