[
https://issues.apache.org/jira/browse/DRILL-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304152#comment-14304152
]
Jason Altekruse commented on DRILL-2153:
----------------------------------------
Can you give a use case for including these records in the output? This is
currently considered correct behavior, as we assumed users of flatten were
interested in the relationship between members of a list, or the elements in a
list in relation to other fields in a record. For both of these purposes there
is no meaning in a list with no members.
The problem with allowing this extra record is that it creates a collision in
the semantics of the result set: we would emit a single outgoing record both
when a list has exactly one element and when the list is empty.
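
To make the collision concrete, here is a minimal Python sketch rather than
Drill code. The flatten function below only models the behavior described in
this comment, and the emit_null_for_empty switch is a hypothetical name used
to stand in for the requested change; it is not a Drill option.

```python
from collections import Counter


def flatten(records, field, emit_null_for_empty=False):
    """Yield one output record per element of record[field].

    emit_null_for_empty is a hypothetical switch modelling the requested
    behavior; it does not correspond to any real Drill setting.
    """
    for rec in records:
        values = rec[field]
        if not values and emit_null_for_empty:
            # Requested behavior: an empty list still produces one row.
            values = [None]
        for value in values:
            out = dict(rec)
            out[field] = value
            yield out


records = [
    {"id": 1, "vals": [10]},  # single-element list
    {"id": 2, "vals": []},    # empty list
]

# Behavior described in this comment: the empty list contributes no rows.
current = list(flatten(records, "vals"))

# Requested behavior: the empty list contributes one row holding a null.
proposed = list(flatten(records, "vals", emit_null_for_empty=True))

# Under the requested behavior both records yield exactly one row, so the
# row count no longer separates "one element" from "no elements".
rows_per_id = Counter(row["id"] for row in proposed)
```

With the switch on, rows_per_id reports one row for id 1 and one row for id 2,
so a consumer has to inspect the value itself to tell the two cases apart.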
This would cause problems for simple uses of flatten, such as an aggregation
over all of the flattened values. As the link below shows, aggregations become
complicated once nulls are involved, because operations between a value and
NULL produce NULL. That would take away a very useful case, aggregating across
lists, unless users either guarantee that their lists are non-empty or always
defensively wrap the column coming out of the flatten operation in coalesce.
http://stackoverflow.com/questions/23739657/calculate-average-of-some-columns-not-counting-null-values
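
The concern about defensive coalesce calls can be sketched in Python as well.
This only models the NULL-propagating arithmetic mentioned above, with None
standing in for NULL; sql_add and coalesce are illustrative helpers, and the
sketch makes no claim about how Drill's own aggregate functions treat NULL.

```python
from functools import reduce


def sql_add(a, b):
    """NULL-propagating addition: a None operand makes the result None."""
    if a is None or b is None:
        return None
    return a + b


def coalesce(value, default):
    """Return default when value is None, otherwise value."""
    return default if value is None else value


# Flattened column where the trailing None stands for the extra row that an
# empty list would contribute under the requested behavior.
flattened = [10, 20, None]

# Row-wise arithmetic: the null row stays null.
with_fee = [sql_add(v, 5) for v in flattened]

# A total built from NULL-propagating additions collapses to None.
naive_total = reduce(sql_add, flattened, 0)

# Guarding every value with coalesce restores a usable total.
guarded_total = reduce(sql_add, (coalesce(v, 0) for v in flattened), 0)
```

The guarded version works, but only because every value coming out of the
flatten step is wrapped first, which is the burden described above.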
> flatten function not handling nulls
> -----------------------------------
>
> Key: DRILL-2153
> URL: https://issues.apache.org/jira/browse/DRILL-2153
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 0.7.0
> Environment: Sandbox 4.0.2
> Reporter: Sudhakar Thota
> Assignee: Daniel Barclay (Drill/MapR)
>
> Function flatten not handling nulls resulting in eliminating relevant records
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)