[ 
https://issues.apache.org/jira/browse/DRILL-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304152#comment-14304152
 ] 

Jason Altekruse commented on DRILL-2153:
----------------------------------------

Can you give a use case for including these records in the output? This is 
currently considered correct behavior: we assumed users of flatten were 
interested in the relationship between members of a list, or between the 
elements of a list and other fields in a record. For both of these purposes, a 
list with no members carries no meaning.

The problem with emitting this extra record is that it makes the semantics of 
the result set ambiguous: a list with a single element and an empty list would 
each produce exactly one outgoing record.
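
The collision described above can be sketched with a small model. This is only 
an illustration in Python, not Drill's implementation: the flatten() helper, 
its emit_null_for_empty flag, and the sample records are all hypothetical.

```python
def flatten(records, field, emit_null_for_empty=False):
    """Hypothetical model of flatten: one output record per list element."""
    for rec in records:
        items = rec[field] or []  # treat a missing (None) list like an empty one
        if not items and emit_null_for_empty:
            # Proposed behavior: keep the record, with NULL (None) in the column.
            yield {**rec, field: None}
        for item in items:
            yield {**rec, field: item}


def rows_per_id(rows):
    """Count how many output rows each input record produced."""
    counts = {}
    for row in rows:
        counts[row["id"]] = counts.get(row["id"], 0) + 1
    return counts


records = [
    {"id": 1, "vals": [10, 20]},
    {"id": 2, "vals": [30]},  # single-element list
    {"id": 3, "vals": []},    # empty list
]

current = rows_per_id(flatten(records, "vals"))
proposed = rows_per_id(flatten(records, "vals", emit_null_for_empty=True))

print(current)   # record 3 is absent from the output
print(proposed)  # records 2 and 3 both contribute exactly one row
```

In this model, the row count alone no longer tells a single-element list apart 
from an empty one once the extra record is emitted.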

This would cause problems for simple uses of flatten, such as an aggregation 
over all of the flattened values. As the link below shows, aggregations become 
complicated once nulls are involved, because operations between a value and 
NULL produce NULL. We would lose a very useful case, aggregating across lists, 
unless users always defensively wrap the column coming out of the flatten 
operation in coalesce, or can guarantee that the lists are non-empty.

http://stackoverflow.com/questions/23739657/calculate-average-of-some-columns-not-counting-null-values
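
As a rough illustration of the NULL concern above, the following Python sketch 
models arithmetic in which any NULL operand yields NULL, and shows the 
defensive coalesce a user would need. The null_add() and coalesce() helpers and 
the sample values are hypothetical; this models expression-level arithmetic 
only and is not Drill's aggregation code.

```python
from functools import reduce


def null_add(a, b):
    """Addition with NULL propagation: any NULL (None) operand gives NULL."""
    if a is None or b is None:
        return None
    return a + b


def coalesce(value, default):
    """Return default when value is NULL (None), otherwise value."""
    return default if value is None else value


# Flattened column; the trailing None stands for a hypothetical row
# emitted for an empty list.
flattened = [10, 20, 30, None]

naive_total = reduce(null_add, flattened)
safe_total = reduce(null_add, (coalesce(v, 0) for v in flattened))

print(naive_total)  # None: the single NULL wipes out the whole sum
print(safe_total)   # 60: every value was coalesced to 0 first
```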

> flatten function not handling nulls
> -----------------------------------
>
>                 Key: DRILL-2153
>                 URL: https://issues.apache.org/jira/browse/DRILL-2153
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 0.7.0
>         Environment: Sandbox 4.0.2
>            Reporter: Sudhakar Thota
>            Assignee: Daniel Barclay (Drill/MapR)
>
> Function flatten not handling nulls resulting in eliminating relevant records 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
