[
https://issues.apache.org/jira/browse/DRILL-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15785593#comment-15785593
]
ASF GitHub Bot commented on DRILL-3562:
---------------------------------------
GitHub user Serhii-Harnyk opened a pull request:
https://github.com/apache/drill/pull/713
DRILL-3562: Query fails when using flatten on JSON data where some documents
have an empty array
1. Added a set that tracks ListWriters so that empty arrays are kept and can
be initialized later in the ensureAtLeastOneField method.
2. Added a check in the FlattenRecordBatch class to avoid generating a schema
with field type "Late" and mode "Optional"; the type is replaced with "Int".
3. Added unit tests covering queries that use flatten on JSON with empty
arrays.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Serhii-Harnyk/drill DRILL-3562
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/713.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #713
----
commit 815de0cc6f16b247b3a655007241a074e38394c7
Author: Serhii-Harnyk <[email protected]>
Date: 2016-12-20T16:55:41Z
DRILL-3562: Query fails when using flatten on JSON data where some
documents have an empty array
----
> Query fails when using flatten on JSON data where some documents have an
> empty array
> ------------------------------------------------------------------------------------
>
> Key: DRILL-3562
> URL: https://issues.apache.org/jira/browse/DRILL-3562
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - JSON
> Affects Versions: 1.1.0
> Reporter: Philip Deegan
> Assignee: Serhii Harnyk
> Fix For: Future
>
>
> A Drill query that uses flatten fails when some records contain an empty array:
> {noformat}
> SELECT COUNT(*) FROM (SELECT FLATTEN(t.a.b.c) AS c FROM dfs.`flat.json` t)
> flat WHERE flat.c.d.e = 'f' limit 1;
> {noformat}
> Succeeds on:
> { "a": { "b": { "c": [ { "d": { "e": "f" } } ] } } }
> Fails on:
> { "a": { "b": { "c": [] } } }
> Error:
> {noformat}
> Error: SYSTEM ERROR: ClassCastException: Cannot cast
> org.apache.drill.exec.vector.NullableIntVector to
> org.apache.drill.exec.vector.complex.RepeatedValueVector
> {noformat}
> Is it possible to ignore the empty arrays, or do they need to be populated
> with dummy data?
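
For anyone reproducing the report, the sketch below writes the two sample
documents to a file and models the result the query would be expected to
return. It is not Drill code. The helper names (write_repro_file, flatten_c,
count_matches) are hypothetical, the one-document-per-line file layout is an
assumption, and so is the idea that an empty array should contribute no rows
to the flattened output.

```python
import json

# The two documents from the report: one populated array, one empty array.
DOCS = [
    {"a": {"b": {"c": [{"d": {"e": "f"}}]}}},
    {"a": {"b": {"c": []}}},
]


def write_repro_file(path):
    """Write one JSON document per line (assumed layout for flat.json)."""
    with open(path, "w") as fh:
        for doc in DOCS:
            fh.write(json.dumps(doc) + "\n")


def flatten_c(docs):
    """Model of FLATTEN(t.a.b.c): one output row per array element.

    Assumption: a document whose array is empty contributes no rows.
    """
    return [{"c": elem} for doc in docs for elem in doc["a"]["b"]["c"]]


def count_matches(docs):
    """Model of the outer query: COUNT(*) ... WHERE flat.c.d.e = 'f'."""
    rows = flatten_c(docs)
    return sum(1 for row in rows if row["c"].get("d", {}).get("e") == "f")
```

Under these assumptions, count_matches(DOCS) counts only the row produced by
the first document, and the empty array in the second document is skipped
without needing dummy data.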
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)