[
https://issues.apache.org/jira/browse/HIVE-20343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579135#comment-16579135
]
Sergey Shelukhin commented on HIVE-20343:
-----------------------------------------
Confirmed after some testing. If neither of the default flags is set, an
incorrect table is created with txn props but without txn.
If MM is on by default, everything is fine.
If ACID is on by default, the conversion code treats the table as non-txn
(as in the first case) and then replaces txn=false, props=insert_only with
full ACID.
I think the solution is to outlaw this syntax: if the properties are specified
explicitly, the flag should also be specified. It is not good for us to guess
what the user wanted from a partial specification, given that there are 4
combinations of the default options.
Location is a separate issue.
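As a sketch of what the unambiguous form would look like (reusing the table and
column names from the reproducer below; the exact enforcement is an assumption,
not a committed design), both the flag and the properties would be spelled out,
so nothing depends on the four default combinations:
{code}
-- Explicit insert-only (MM) table: "transactional" flag and
-- "transactional_properties" both specified, so no defaults are guessed.
create table ctasexampleinsertonly stored as orc
TBLPROPERTIES ("transactional"="true",
               "transactional_properties"="insert_only")
as select * from testtable limit 1;
{code}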
> Hive 3: CTAS does not respect transactional_properties
> ------------------------------------------------------
>
> Key: HIVE-20343
> URL: https://issues.apache.org/jira/browse/HIVE-20343
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 3.1.0
> Environment: hive-3
> Reporter: Rajkumar Singh
> Assignee: Sergey Shelukhin
> Priority: Major
>
> Steps to reproduce:
> {code}
> create table ctasexampleinsertonly stored as orc TBLPROPERTIES
> ("transactional_properties"="insert_only") as select * from testtable limit 1;
> {code}
> Look for transactional_properties, which is 'default' rather than the expected
> "insert_only":
> {code}
> describe formatted ctasexampleinsertonly
>
> +-------------------------------+----------------------------------------------------+-----------------------+
> | col_name                      | data_type                                          | comment               |
> +-------------------------------+----------------------------------------------------+-----------------------+
> | # col_name                    | data_type                                          | comment               |
> | name                          | varchar(8)                                         |                       |
> | time                          | double                                             |                       |
> |                               | NULL                                               | NULL                  |
> | # Detailed Table Information  | NULL                                               | NULL                  |
> | Database:                     | default                                            | NULL                  |
> | OwnerType:                    | USER                                               | NULL                  |
> | Owner:                        | hive                                               | NULL                  |
> | CreateTime:                   | Wed Aug 08 21:35:15 UTC 2018                       | NULL                  |
> | LastAccessTime:               | UNKNOWN                                            | NULL                  |
> | Retention:                    | 0                                                  | NULL                  |
> | Location:                     | hdfs://xxxxxxxxxx:8020/warehouse/tablespace/managed/hive/ctasexampleinsertonly | NULL |
> | Table Type:                   | MANAGED_TABLE                                      | NULL                  |
> | Table Parameters:             | NULL                                               | NULL                  |
> |                               | COLUMN_STATS_ACCURATE                              | {}                    |
> |                               | bucketing_version                                  | 2                     |
> |                               | numFiles                                           | 1                     |
> |                               | numRows                                            | 1                     |
> |                               | rawDataSize                                        | 0                     |
> |                               | totalSize                                          | 754                   |
> |                               | transactional                                      | true                  |
> |                               | transactional_properties                           | default               |
> |                               | transient_lastDdlTime                              | 1533764115            |
> |                               | NULL                                               | NULL                  |
> | # Storage Information         | NULL                                               | NULL                  |
> | SerDe Library:                | org.apache.hadoop.hive.ql.io.orc.OrcSerde          | NULL                  |
> | InputFormat:                  | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat    | NULL                  |
> | OutputFormat:                 | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat   | NULL                  |
> | Compressed:                   | No                                                 | NULL                  |
> | Num Buckets:                  | -1                                                 | NULL                  |
> | Bucket Columns:               | []                                                 | NULL                  |
> | Sort Columns:                 | []                                                 | NULL                  |
> | Storage Desc Params:          | NULL                                               | NULL                  |
> |                               | serialization.format                               | 1                     |
> +-------------------------------+----------------------------------------------------+-----------------------+
> {code}
> Not sure whether it's a cosmetic issue, but it does create a problem with
> insert:
> {code}
> CREATE TABLE TABLE42
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
> STORED AS RCFILE LOCATION '/tmp/test10'
> as select * from testtable limit 1;
> {code}
> It takes the default transactional_properties as insert_only instead of
> 'default' and fails with the following exception:
> {code}
> ERROR : Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(The following files were committed but not found: [/tmp/test10/delta_0000004_0000004_0000/000000_0])'
> org.apache.hadoop.hive.ql.metadata.HiveException: The following files were committed but not found: [/tmp/test10/delta_0000004_0000004_0000/000000_0]
> at org.apache.hadoop.hive.ql.exec.Utilities.handleMmTableFinalPath(Utilities.java:4329)
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1393)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:579)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:316)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2668)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2339)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2015)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1713)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1707)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
> at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
> at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
> at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:329)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)