[
https://issues.apache.org/jira/browse/HIVE-15640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941922#comment-15941922
]
Pengcheng Xiong commented on HIVE-15640:
----------------------------------------
I am deferring this to Hive 3.0 as we are going to cut the first RC and it is
not marked as blocker. Please feel free to commit to the branch if this can be
resolved before the release.
> Hive/Druid integration: null handling for metrics
> -------------------------------------------------
>
> Key: HIVE-15640
> URL: https://issues.apache.org/jira/browse/HIVE-15640
> Project: Hive
> Issue Type: Bug
> Components: Druid integration
> Affects Versions: 2.2.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
>
> Null values for metrics in Druid and Hive are not handled the same way (_0.0_
> vs _NULL_).
> In Druid:
> {code:sql}
> SELECT i_brand_id, floor_day(`__time`), max(ss_quantity),
> sum(ss_wholesale_cost) as s
> FROM store_sales_sold_time_subset
> WHERE floor_day(`__time`) BETWEEN '1999-11-01 00:00:00' AND '1999-11-10
> 00:00:00'
> GROUP BY i_brand_id, floor_day(`__time`)
> ORDER BY s
> LIMIT 10;
> OK
> 6015006 1999-11-03 00:00:00 0.0 0.0
> 9011009 1999-11-05 00:00:00 0.0 0.0
> 8003009 1999-11-03 00:00:00 11.0 1.0299999713897705
> 10005014 1999-11-05 00:00:00 86.0 1.100000023841858
> 6008007 1999-11-09 00:00:00 81.0 1.3700000047683716
> 6003003 1999-11-08 00:00:00 45.0 1.600000023841858
> 8008009 1999-11-08 00:00:00 98.0 1.7100000381469727
> 8015003 1999-11-02 00:00:00 10.0 1.7400000095367432
> 8004008 1999-11-10 00:00:00 45.0 1.7599999904632568
> 8009009 1999-11-07 00:00:00 81.0 1.7699999809265137
> {code}
> In Hive:
> {code:sql}
> SELECT i_brand_id, floor_day(`__time`), max(ss_quantity),
> sum(ss_wholesale_cost) as s
> FROM store_sales_sold_time_subset_hive
> WHERE floor_day(`__time`) BETWEEN '1999-11-01 00:00:00' AND '1999-11-10
> 00:00:00'
> GROUP BY i_brand_id, floor_day(`__time`)
> ORDER BY s
> LIMIT 10;
> OK
> 6015006 1999-11-03 00:00:00 NULL NULL
> 9011009 1999-11-05 00:00:00 NULL NULL
> 8003009 1999-11-03 00:00:00 11 1.03
> 10005014 1999-11-05 00:00:00 86 1.1
> 6008007 1999-11-09 00:00:00 81 1.37
> 6003003 1999-11-08 00:00:00 45 1.6
> 8008009 1999-11-08 00:00:00 98 1.71
> 8015003 1999-11-02 00:00:00 10 1.74
> 8004008 1999-11-10 00:00:00 45 1.76
> 8009009 1999-11-07 00:00:00 81 1.77
> {code}
> However, for Druid dimensions, NULL values seem to be handled properly.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)