[ https://issues.apache.org/jira/browse/DRILL-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mehant Baid updated DRILL-3318:
-------------------------------
    Priority: Critical  (was: Minor)

> SUM(CAST(col as INT)) shows different results when used in window functions
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-3318
>                 URL: https://issues.apache.org/jira/browse/DRILL-3318
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>   Affects Versions: 1.1.0
>            Reporter: Deneche A. Hakim
>            Assignee: Mehant Baid
>            Priority: Critical
>              Labels: window_function
>             Fix For: 1.1.0
>
>         Attachments: 3278.parquet
>
>
> I have a parquet file that contains an INT field with values large enough
> that their sum overflows the INT limits, and a VARCHAR column with a single
> value (to force all rows into the same partition).
> Computing the sum without casting gives the same result with and without
> the window function:
> {noformat}
> SELECT SUM(col_int) OVER(PARTITION BY col_char) FROM `3278.parquet` LIMIT 1;
> +--------------+
> |    EXPR$0    |
> +--------------+
> | -3216087191  |
> +--------------+
> {noformat}
> {noformat}
> SELECT SUM(col_int) FROM `3278.parquet`;
> +--------------+
> |    EXPR$0    |
> +--------------+
> | -3216087191  |
> +--------------+
> {noformat}
> But if we cast the column before computing the sum, the two results differ:
> {noformat}
> SELECT SUM(CAST(col_int AS INT)) OVER(PARTITION BY col_char) FROM `3278.parquet` LIMIT 1;
> +-------------+
> |   EXPR$0    |
> +-------------+
> | 1078880105  |
> +-------------+
> {noformat}
> {noformat}
> SELECT SUM(CAST(col_int AS INT)) FROM `3278.parquet`;
> +--------------+
> |    EXPR$0    |
> +--------------+
> | -3216087191  |
> +--------------+
> {noformat}
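
A note on the numbers reported above: the two differing results, -3216087191 and 1078880105, are congruent modulo 2^32. That would be consistent with the windowed sum being truncated to a 32-bit INT while the plain aggregate is accumulated in a wider type. This is inferred from the values alone and is an assumption, not a confirmed diagnosis of the planner. A minimal Python sketch of the arithmetic (the helper name `wrap_int32` is hypothetical and not part of Drill):

```python
def wrap_int32(value: int) -> int:
    """Reduce an arbitrary integer to the signed 32-bit range,
    mimicking two's-complement truncation (hypothetical helper)."""
    return (value + 2**31) % 2**32 - 2**31


# Sum reported by the queries that do not overflow (needs more than 32 bits).
wide_sum = -3216087191

# Truncating that sum to 32 bits reproduces the value reported by
# SUM(CAST(col_int AS INT)) OVER(PARTITION BY col_char).
print(wrap_int32(wide_sum))  # 1078880105
```

If this reading is right, the cast itself is harmless and the difference comes from the output type chosen for the windowed SUM.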