[ 
https://issues.apache.org/jira/browse/HIVE-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279769#comment-14279769
 ] 

Nick Martin commented on HIVE-9385:
-----------------------------------

[~damien.carol] I have ~150m rows of sales data in an ORC table, with the sales 
amount stored in a double column. When I sum that column I get the value I 
reported in the issue description (4.7...). The true sum of that field is 
~$2.5b.

When I do the exact same thing (create the same table, store the sales column 
as a double, sum on that column) but store the table as textfile I get the 
correct amount. 

So I think there's something going on with sum() on doubles in ORC tables, and 
I'm hoping someone could try it in their environment and let me know whether it 
appears to be a bug.
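
For anyone reading the numbers: the snippet below is a rough sanity check, not a 
reproduction of the Hive behaviour. It prints the reported value in plain 
decimal, compares it to the approximate expected total, and then estimates how 
much error ordinary double accumulation introduces on synthetic data. The 
synthetic amounts, the row count of one million, and the 2.5e9 figure used for 
"expected" are assumptions for illustration only; none of it is the real sales 
data.

```python
import math
import random

# Value quoted in the issue description, and the approximate true total
# mentioned in the comment (an estimate, not an exact figure).
reported = float("4.79165141174808E9")
expected = 2.5e9

ratio = reported / expected
print(f"reported sum : {reported:,.2f}")
print(f"ratio to ~2.5b: {ratio:.2f}")

# Synthetic stand-in for a sales column (hypothetical values).
random.seed(0)
amounts = [round(random.uniform(1.0, 100.0), 2) for _ in range(1_000_000)]

naive_total = sum(amounts)        # plain left-to-right double accumulation
exact_total = math.fsum(amounts)  # correctly rounded sum of the same doubles

rel_err = abs(naive_total - exact_total) / exact_total
print(f"relative error of naive double sum: {rel_err:.3e}")
```

Under these assumptions the reported value is nearly twice the expected total, 
while the accumulation error of a plain double sum is many orders of magnitude 
smaller than that, so rounding alone seems unlikely to account for the gap.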

> Sum a Double using an ORC table
> -------------------------------
>
>                 Key: HIVE-9385
>                 URL: https://issues.apache.org/jira/browse/HIVE-9385
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.1
>         Environment: HDP 2.x, Hive
>            Reporter: Nick Martin
>            Priority: Minor
>
> I’m storing a sales amount column as a double in an ORC table and when I do:
> {code:sql}
> select sum(x) from sometable
> {code}
> I get a value like {{4.79165141174808E9}}
> A visual inspection of the column values reveals no glaring anomalies…all 
> looks pretty normal. 
> If I do the same thing in a textfile table I get a perfectly fine aggregation 
> of the double field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
