[ https://issues.apache.org/jira/browse/HIVE-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922304#comment-13922304 ]
Remus Rusanu commented on HIVE-6508:
------------------------------------

The value 0 arrives in the input vector unscaled (scale 0). As the aggregates (SUM, STDxx) are updated, they use the scale of the input value, not the scale of the input column. So any 0 in the input rounds away the fractional part of the intermediate result, and the final result is off. AVG uses a special scale, so it is not affected. MIN/MAX use the input value scale, but that has no side effects.

The fix is to pass in the column scale explicitly, rather than assume the input value carries the column scale. Ultimately, passing in unscaled 0s is the wrong behavior, but it comes from the row-mode join modus operandi and I don't want to change that. Hardening the aggregates against this case is more robust.

> Mismatched results between vector and non-vector mode with decimal field
> ------------------------------------------------------------------------
>
>                 Key: HIVE-6508
>                 URL: https://issues.apache.org/jira/browse/HIVE-6508
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.13.0
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>
> The following query shows a small mismatch in its result compared to
> non-vector mode.
> {code}
> select d_year, i_brand_id, i_brand,
>        sum(ss_ext_sales_price) as sum_agg
> from date_dim
> join store_sales on date_dim.d_date_sk = store_sales.ss_sold_date_sk
> join item on store_sales.ss_item_sk = item.i_item_sk
> where i_manufact_id = 128
>   and d_moy = 11
> group by d_year, i_brand, i_brand_id
> order by d_year, sum_agg desc, i_brand_id
> limit 100;
> {code}
> This query is on tpcds data.
> The field ss_ext_sales_price is of type decimal(7,2) and everything else is
> an integer.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
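
To illustrate the scale problem described in the comment above, here is a minimal sketch using java.math.BigDecimal. It is not Hive's vectorized aggregate code. The class ScaleSketch, both method names, and the HALF_UP rounding mode are assumptions made for the illustration. The first method re-scales the running sum to the scale of each input value, so an unscaled 0 drops the fraction accumulated so far. The second takes the column scale as an explicit argument, as the fix proposes.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; names and rounding mode are assumptions.
class ScaleSketch {

    // Problem pattern: the running sum adopts the scale of each input value.
    static BigDecimal sumWithValueScale(List<BigDecimal> values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            // An input with scale 0 rounds the intermediate sum to an integer.
            sum = sum.add(v).setScale(v.scale(), RoundingMode.HALF_UP);
        }
        return sum;
    }

    // Hardened pattern: the column scale is passed in explicitly.
    static BigDecimal sumWithColumnScale(List<BigDecimal> values, int columnScale) {
        BigDecimal sum = BigDecimal.ZERO.setScale(columnScale);
        for (BigDecimal v : values) {
            // The intermediate sum always keeps the column's scale.
            sum = sum.add(v).setScale(columnScale, RoundingMode.HALF_UP);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Values as they might appear for a decimal(7,2) column, with one
        // unscaled zero (scale 0) mixed in.
        List<BigDecimal> values = Arrays.asList(
                new BigDecimal("1.25"),
                new BigDecimal("0"),
                new BigDecimal("2.50"));

        System.out.println(sumWithValueScale(values));     // prints 3.50
        System.out.println(sumWithColumnScale(values, 2)); // prints 3.75
    }
}
```

In this sketch the unscaled 0 turns the intermediate 1.25 into 1, so the first sum ends at 3.50 instead of the correct 3.75.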