[ 
https://issues.apache.org/jira/browse/HIVE-20085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532690#comment-16532690
 ] 

Nishant Bangarwa commented on HIVE-20085:
-----------------------------------------

Discussion with [~ashutoshc]
To play well in this new world of managed and external tables, we need to make 
all Druid tables external. For that to work well, we will need the following 
changes:
1) Users will be required to specify the EXTERNAL qualifier when creating Druid 
tables.
2) If a user creates a table from Hive without specifying a Druid datasource, we 
will store the schema in HMS and use it.
3) If a user creates a table from Hive and specifies a Druid datasource, then we 
don't store the schema in HMS.
4) Inserts should be allowed into external tables (to be consistent with 
external table semantics).
5) Drop will not delete any data from Druid. Users may set 
external.table.purge=true in tblproperties to override this.
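
A rough sketch of how the points above might look from the user's side, assuming 
the changes are implemented as proposed. The table names, column names and 
types, and the datasource name are hypothetical; only the storage handler class, 
the granularity properties, druid.datasource and external.table.purge come from 
this issue.
{code}
-- Sketch only: table names, columns and types are hypothetical.
-- Point 1: the EXTERNAL qualifier is given explicitly.
-- Point 2: no druid.datasource is set, so the schema would be kept in HMS.
-- Point 5: external.table.purge opts in to deleting Druid data on DROP.
CREATE EXTERNAL TABLE calcs_druid (
  `__time` timestamp with local time zone,
  key string,
  num0 double)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
  "druid.segment.granularity" = "MONTH",
  "druid.query.granularity" = "DAY",
  "external.table.purge" = "true");

-- Point 4: inserts into the external table would be allowed.
INSERT INTO TABLE calcs_druid
SELECT cast(datetime0 as timestamp with local time zone), key, num0
FROM tableau_orc.calcs;

-- Point 3: an existing Druid datasource is referenced, so the schema
-- would come from Druid and not be stored in HMS.
CREATE EXTERNAL TABLE calcs_existing
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "some_existing_datasource");

-- Point 5: with the purge property set, this would also remove the Druid
-- data for calcs_druid; dropping calcs_existing would leave Druid untouched.
DROP TABLE calcs_druid;
DROP TABLE calcs_existing;
{code}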


> Druid-Hive (managed) table creation fails with strict managed table checks: 
> Table is marked as a managed table but is not transactional
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20085
>                 URL: https://issues.apache.org/jira/browse/HIVE-20085
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, StorageHandler
>    Affects Versions: 3.0.0
>            Reporter: Dileep Kumar Chiguruvada
>            Assignee: Nishant Bangarwa
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Druid-Hive (managed) table creation fails with strict managed table checks: 
> Table is marked as a managed table but is not transactional
> {code}
> drop table if exists calcs;
> create table calcs
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES (
> "druid.segment.granularity" = "MONTH",
> "druid.query.granularity" = "DAY")
> AS SELECT
> cast(datetime0 as timestamp with local time zone) `__time`,
> key,
> str0, str1, str2, str3,
> date0, date1, date2, date3,
> time0, time1,
> datetime0, datetime1,
> zzz,
> cast(bool0 as string) bool0,
> cast(bool1 as string) bool1,
> cast(bool2 as string) bool2,
> cast(bool3 as string) bool3,
> int0, int1, int2, int3,
> num0, num1, num2, num3, num4
> from tableau_orc.calcs;
> 2018-07-03 04:57:31,911|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Status: Running 
> (Executing on YARN cluster with App id application_1530592209763_0009)
> ...
> ...
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_BYTES_TO_MEM: 0
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_PHASE_TIME: 
> 330
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPILLED_RECORDS: 17
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : 
> TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : OUTPUT_RECORDS: 0
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : 
> org.apache.hadoop.hive.llap.counters.LlapWmCounters:
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_QUEUED_NS: 0
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : 
> GUARANTEED_RUNNING_NS: 0
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : 
> SPECULATIVE_QUEUED_NS: 2162643606
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : 
> SPECULATIVE_RUNNING_NS: 12151664909
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task 
> [Stage-2:DEPENDENCY_COLLECTION] in serial mode
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task 
> [Stage-0:MOVE] in serial mode
> 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Moving data to 
> directory 
> hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/calcs 
> from 
> hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/.hive-staging_hive_2018-07-03_04-57-27_351_7124633902209008283-3/-ext-10002
> 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task 
> [Stage-4:DDL] in serial mode
> 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|ERROR : FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:Table druid_tableau.calcs failed strict managed table 
> checks due to the following reason: Table is marked as a managed table but is 
> not transactional.)
> 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Completed executing 
> command(queryId=hive_20180703045727_c39c40d2-7d4a-46c7-a36d-7925e7c4a788); 
> Time taken: 6.794 seconds
> 2018-07-03 04:57:36,337|INFO|Thread-721|machine.py:111 - 
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|Error: Error while 
> processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table 
> druid_tableau.calcs failed strict managed table checks due to the following 
> reason: Table is marked as a managed table but is not transactional.) 
> (state=08S01,code=1)
> {code}
> This will not allow Druid tables to be managed,
> so it is not straightforward to create Druid tables.
> While trying to change the tables to external, we see the issues below:
> 1) INSERT / INSERT OVERWRITE / DROP are supported by Hive managed tables (not 
> external), and we have a few tests which cover this. What would be the course 
> of action?
> 2) Druid Hive Kafka ingestion (where we will not have any datasource, we just 
> read from a Kafka topic): here, when tables are declared as external, we get the error
> {code}
> message:Datasource name should be specified using [druid.datasource] for 
> external tables using Druid
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
