[ 
https://issues.apache.org/jira/browse/IMPALA-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670635#comment-16670635
 ] 

Tim Armstrong commented on IMPALA-5946:
---------------------------------------

There's some precision lost when dividing. Repro:
{noformat}
with ss_items as
 (select i_item_id item_id
        ,sum(ss_ext_sales_price) ss_item_rev
 from store_sales
     ,item
     ,date_dim
 where ss_item_sk = i_item_sk
   and d_date in (select d_date
                  from date_dim
                  where d_week_seq in (select d_week_seq
                                      from date_dim
                                      where d_date = '2000-01-03'))
   and ss_sold_date_sk   = d_date_sk
 group by i_item_id),
 cs_items as
 (select i_item_id item_id
        ,sum(cs_ext_sales_price) cs_item_rev
  from catalog_sales
      ,item
      ,date_dim
 where cs_item_sk = i_item_sk
  and  d_date in (select d_date
                  from date_dim
                  where d_week_seq in (select d_week_seq
                                      from date_dim
                                      where d_date = '2000-01-03'))
  and  cs_sold_date_sk = d_date_sk
 group by i_item_id),
 ws_items as
 (select i_item_id item_id
        ,sum(ws_ext_sales_price) ws_item_rev
  from web_sales
      ,item
      ,date_dim
 where ws_item_sk = i_item_sk
  and  d_date in (select d_date
                  from date_dim
                  where d_week_seq in (select d_week_seq
                                     from date_dim
                                     where d_date = '2000-01-03'))
  and ws_sold_date_sk   = d_date_sk
 group by i_item_id)
  select  ss_items.item_id
       ,ss_item_rev
       ,ws_item_rev
       ,ws_item_rev / ss_item_rev calc
 from ss_items,cs_items,ws_items
 where ss_items.item_id=cs_items.item_id
   and ss_items.item_id=ws_items.item_id
   and ss_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
   and ss_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
   and cs_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
   and cs_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
   and ws_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
   and ws_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
 order by item_id
         ,ss_item_rev
 limit 100
Query submitted at: 2018-10-31 12:36:23 (Coordinator: 
http://tarmstrong-box:25000)
Query progress can be monitored at: 
http://tarmstrong-box:25000/query_plan?query_id=d342334a56ac6c57:e53813cb00000000
+------------------+-------------+-------------+----------+
| item_id          | ss_item_rev | ws_item_rev | calc     |
+------------------+-------------+-------------+----------+
| AAAAAAAACNGBAAAA | 1900.15     | 1829.52     | 0.962829 |
| AAAAAAAAIDOAAAAA | 6605.22     | 6338.25     | 0.959582 |
| AAAAAAAAJMFCAAAA | 3608.52     | 3305.82     | 0.916115 |
+------------------+-------------+-------------+----------+
{noformat}

 0.962829 is a rounded value that drops some precision. If I divide the two 
inputs in python I get 0.9628292503223429. In the context of the more complex 
expression it gets a bit weird because we add some additional precision with 
the subsequent add and multiply operations

> Calculations on DECIMAL columns in WITH tables lose precision
> -------------------------------------------------------------
>
>                 Key: IMPALA-5946
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5946
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.10.0
>            Reporter: Tim Wood
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: correctness, decimal
>         Attachments: ttq-212.out, ttq-214.out
>
>
> See attached query (tpcds-q58.test) + outputs.  ttq-212.out uses float 
> constants in expressions, ttq-214.out uses integer constants.  Both lose 
> precision.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to