[
https://issues.apache.org/jira/browse/HIVE-20085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532749#comment-16532749
]
Nishant Bangarwa commented on HIVE-20085:
-----------------------------------------
[~ashutoshc] I have attached a patch, please review.
Changes include:
# During table creation, if the user has specified a druid datasource in the
CREATE TABLE statement, the schema will be discovered from druid; otherwise the
user needs to provide the schema.
# Added a new config to enable CTAS - hive.ctas.external.tables, default
false.
# CTAS on an existing druid datasource will append data to any existing data
present in druid.
# The druid schema is now always stored. This allows ALTER TABLE statements on
druid tables and gives the same semantics for table modifications whether the
table was initially discovered from druid or created by Hive.
# INSERT/INSERT OVERWRITE will be supported on all druid tables when the hive
config hive.insert.into.external.tables is set to true. By default it is true.
# DROP will drop data when external.table.purge is true on the table; the
default for new druid tables is set via the property
hive.external.table.purge.default (false by default).
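A rough sketch of how these settings might be used together. The table names druid_calcs and druid_existing and the datasource name "existing_datasource" are hypothetical, and the exact DDL accepted may differ from what the patch finally allows:
{code}
-- Hypothetical names: druid_calcs, druid_existing, "existing_datasource".
-- Allow CTAS into external tables (config named in this comment, default false).
SET hive.ctas.external.tables=true;

-- CTAS: the schema comes from the SELECT.
CREATE EXTERNAL TABLE druid_calcs
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
  "druid.segment.granularity" = "MONTH",
  "druid.query.granularity" = "DAY",
  -- With purge enabled, DROP TABLE is expected to drop the druid data as well.
  "external.table.purge" = "true")
AS SELECT
  cast(datetime0 as timestamp with local time zone) `__time`,
  key,
  str0
FROM tableau_orc.calcs;

-- Existing datasource: no column list is given, so the schema is
-- expected to be discovered from druid.
CREATE EXTERNAL TABLE druid_existing
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "existing_datasource");
{code}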
> Druid-Hive (managed) table creation fails with strict managed table checks:
> Table is marked as a managed table but is not transactional
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-20085
> URL: https://issues.apache.org/jira/browse/HIVE-20085
> Project: Hive
> Issue Type: Bug
> Components: Hive, StorageHandler
> Affects Versions: 3.0.0
> Reporter: Dileep Kumar Chiguruvada
> Assignee: Nishant Bangarwa
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-20085.patch
>
>
> Druid-Hive (managed) table creation fails with strict managed table checks:
> Table is marked as a managed table but is not transactional
> {code}
> drop table if exists calcs;
> create table calcs
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES (
> "druid.segment.granularity" = "MONTH",
> "druid.query.granularity" = "DAY")
> AS SELECT
> cast(datetime0 as timestamp with local time zone) `__time`,
> key,
> str0, str1, str2, str3,
> date0, date1, date2, date3,
> time0, time1,
> datetime0, datetime1,
> zzz,
> cast(bool0 as string) bool0,
> cast(bool1 as string) bool1,
> cast(bool2 as string) bool2,
> cast(bool3 as string) bool3,
> int0, int1, int2, int3,
> num0, num1, num2, num3, num4
> from tableau_orc.calcs;
> 2018-07-03 04:57:31,911|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Status: Running
> (Executing on YARN cluster with App id application_1530592209763_0009)
> ...
> ...
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_BYTES_TO_MEM: 0
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_PHASE_TIME:
> 330
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPILLED_RECORDS: 17
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO :
> TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : OUTPUT_RECORDS: 0
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO :
> org.apache.hadoop.hive.llap.counters.LlapWmCounters:
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_QUEUED_NS: 0
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO :
> GUARANTEED_RUNNING_NS: 0
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO :
> SPECULATIVE_QUEUED_NS: 2162643606
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO :
> SPECULATIVE_RUNNING_NS: 12151664909
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task
> [Stage-2:DEPENDENCY_COLLECTION] in serial mode
> 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task
> [Stage-0:MOVE] in serial mode
> 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Moving data to
> directory
> hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/calcs
> from
> hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/.hive-staging_hive_2018-07-03_04-57-27_351_7124633902209008283-3/-ext-10002
> 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task
> [Stage-4:DDL] in serial mode
> 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|ERROR : FAILED: Execution
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
> MetaException(message:Table druid_tableau.calcs failed strict managed table
> checks due to the following reason: Table is marked as a managed table but is
> not transactional.)
> 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Completed executing
> command(queryId=hive_20180703045727_c39c40d2-7d4a-46c7-a36d-7925e7c4a788);
> Time taken: 6.794 seconds
> 2018-07-03 04:57:36,337|INFO|Thread-721|machine.py:111 -
> tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|Error: Error while
> processing statement: FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table
> druid_tableau.calcs failed strict managed table checks due to the following
> reason: Table is marked as a managed table but is not transactional.)
> (state=08S01,code=1)
> {code}
> This does not allow druid tables to be managed, so creating Druid tables is
> not straightforward.
> While trying to change these tables to external tables, we see the issues below:
> 1) INSERT / INSERT OVERWRITE / DROP are supported by Hive managed tables (not
> external). We have a few tests that cover this. What would be the course of
> action?
> 2) Druid Hive kafka ingestion (where we will not have any datasource, we just
> read from a Kafka topic): here, when the table is declared external, we get the error
> {code}
> message:Datasource name should be specified using [druid.datasource] for
> external tables using Druid
> {code}