[
https://issues.apache.org/jira/browse/HIVE-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470940#comment-16470940
]
Jesus Camacho Rodriguez commented on HIVE-19474:
------------------------------------------------
[~bslim], could we change the message in {{DruidStorageHandlerUtils}} to "Cast
to any numeric type supported by Druid: x, y, z, t"? Thanks
> Decimal type should be cast as part of the CTAS or INSERT clause.
> -------------------------------------------------------------------
>
> Key: HIVE-19474
> URL: https://issues.apache.org/jira/browse/HIVE-19474
> Project: Hive
> Issue Type: Bug
> Components: Druid integration
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Priority: Major
> Labels: druid
> Fix For: 3.0.0
>
> Attachments: HIVE-19474.patch
>
>
> HIVE-18569 introduced a runtime config variable that allows Decimal columns to
> be indexed as Double. This leads to a messy state: the Hive metadata still
> considers the column a decimal while it is actually stored as a double. Because
> the Hive metadata says the column is Decimal, the logical optimizer will not
> push down aggregates. I tried to fix this by adding some logic to the
> application, but it makes the code very clumsy, with a lot of branches.
> Instead, I propose to revert HIVE-18569 and let the user introduce an explicit
> cast. This is better because the metadata then reflects the actual storage
> type, aggregate push-down kicks in, and no config variable or extra code is
> needed.
> cc [~ashutoshc] and [~nishantbangarwa]
> You can see the difference with the following DDL:
> {code:java}
> create table test_base_table(`timecolumn` timestamp, `interval_marker`
> string, `num_l` DECIMAL(10,2));
> insert into test_base_table values ('2015-03-08 00:00:00', 'i1-start', 4.5);
> set hive.druid.approx.result=true;
> CREATE TABLE druid_test_table
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.segment.granularity" = "DAY")
> AS
> select cast(`timecolumn` as timestamp with local time zone) as `__time`,
> `interval_marker`, cast(`num_l` as double)
> FROM test_base_table;
> describe druid_test_table;
> explain select sum(num_l), min(num_l) FROM druid_test_table;
> CREATE TABLE druid_test_table_2
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.segment.granularity" = "DAY")
> AS
> select cast(`timecolumn` as timestamp with local time zone) as `__time`,
> `interval_marker`, `num_l`
> FROM test_base_table;
> describe druid_test_table_2;
> explain select sum(num_l), min(num_l) FROM druid_test_table_2;
> {code}
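> For completeness, an INSERT into the Druid-backed table would need the same
> explicit cast as the CTAS; a minimal sketch under that assumption (this
> statement is illustrative, not part of the attached patch):
> {code:java}
> -- Hypothetical follow-up INSERT: the decimal column is cast explicitly so the
> -- stored Druid column stays a double and aggregate push-down still applies.
> INSERT INTO TABLE druid_test_table
> SELECT cast(`timecolumn` as timestamp with local time zone),
>        `interval_marker`,
>        cast(`num_l` as double)
> FROM test_base_table;
> {code}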
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)