Github user gvramana commented on the pull request:

    https://github.com/apache/spark/pull/4466#issuecomment-81427255
  
    We need to satisfy 3 cases:
    1) sum of all null values = zero
    2) Sum for table column with no data = Null
    3) sum of column with null and not null values = sum of not null values
    
    i.e.,
    <table border=1>
    <tr><th>Combining level</th><th>Paritition(s) level</th><th>Input 
data</th></tr>
    <tr><td>Zero </td><td> &lt;-- Zero(f1) </td><td>&lt;-- null <br/> &lt;-- 
null <br/> &lt;-- null </td></tr>
    <tr><td>null(f2)</td><td>&lt;-- null<br/> &lt;-- null <br/> &lt;-- 
null</td><td> &lt;-- No data <br/>&lt;-- No data<br/> &lt;-- No data</td></tr>
    </table>
    if same aggregate expression has to work at partition level and combining 
level. It cannot distinguish between f1 and f2 cases. As there is no way to 
know "No data" case at combining expression.
    So expressions are separated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to