[
https://issues.apache.org/jira/browse/HIVE-17691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eugene Koifman updated HIVE-17691:
----------------------------------
Description:
# DDLSemanticAnalyzer.alterTableOutput is unused
# DDLTask.generateAddMmTasks(Table) - stmtId should probably come from
TransactionManager
# DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId =
crtTbl.getInitialMmWriteId();_ - the logic is unclear. This ID is only set in
one place.
# FileSinkOperator has multiple places that look like _conf.getWriteType() ==
AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for
MM tables? Seems that Wei opted for "work.getLoadTableWork().getWriteType() !=
AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g.
MoveTask.handleStaticParts() call to Hive.loadPartition()
# HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is
obsolete
# Compactor Initiator likely doesn't work for MM tables. It's triggered by
info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS. MM tables don't write to
either because DbTxnManager.acquireLocks() does
_compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM as
non-acid tables
# In general integration with full Acid seems confused wrt MM and seems to
treat MM as special table type rather than subtype of Acid table. (mostly, but
not always).
## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input,
QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_
## _SemanticAnalyzer.validate()_ has _if (tbl != null &&
(AcidUtils.isFullAcidTable(tbl) ||
MetaStoreUtils.isInsertOnlyTable(tbl.getParameters()))) {_
# LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather
than from TM
# ImportCommitTask - doesn't currently do anything. It used to commit mmID.
Need to verify we properly commit the txn in the Driver
# As far as I can tell all the mm_*.q tests run on TestCliDriver which means
MR. This doesn't exercise some code specifically for dealing with writes from
Union All queries (CTAS, Insert into). On MR this requires
"hive.optimize.union.remove=true" (false by default)
# Remove MoveWork().setNoop(boolean) and usages per todo in
_GenMapRedUtils.createMRWorkForMergingFiles (FileSinkOperator fsInput, Path
finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>>
mvTasks, HiveConf conf, Task<? extends Serializable> currTask)_
# PartialScanWork.tblDesc - unused
# _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes assumptions
that won't work with MM tables, unions, etc.". File Jira?
# _PartitionDesc.LOG_ is unused
# Insert Overwrite for MM is incomplete - see comments in HIVE-15212 regarding
IOW and multi IOW
was:
# DDLSemanticAnalyzer.alterTableOutput is unused
# DDLTask.generateAddMmTasks(Table) - stmtId should probably come from
TransactionManager
# DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId =
crtTbl.getInitialMmWriteId();_ - the logic is unclear. This ID is only set in
one place.
# FileSinkOperator has multiple places that look like _conf.getWriteType() ==
AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for
MM tables? Seems that Wei opted for "work.getLoadTableWork().getWriteType() !=
AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g.
MoveTask.handleStaticParts() call to Hive.loadPartition()
# HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is
obsolete
# Compactor Initiator likely doesn't work for MM tables. It's triggered by
info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS. MM tables don't write to
either because DbTxnManager.acquireLocks() does
_compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM as
non-acid tables
# In general integration with full Acid seems confused wrt MM and seems to
treat MM as special table type rather than subtype of Acid table. (mostly, but
not always).
## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input,
QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_
# LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather
than from TM
# ImportCommitTask - doesn't currently do anything. It used to commit mmID.
Need to verify we properly commit the txn in the Driver
# As far as I can tell all the mm_*.q tests run on TestCliDriver which means
MR. This doesn't exercise some code specifically for dealing with writes from
Union All queries (CTAS, Insert into). On MR this requires
"hive.optimize.union.remove=true" (false by default)
# Remove MoveWork().setNoop(boolean) and usages per todo in
_GenMapRedUtils.createMRWorkForMergingFiles (FileSinkOperator fsInput, Path
finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>>
mvTasks, HiveConf conf, Task<? extends Serializable> currTask)_
# PartialScanWork.tblDesc - unused
# _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes assumptions
that won't work with MM tables, unions, etc.". File Jira?
# _PartitionDesc.LOG_ is unused
# Insert Overwrite for MM is incomplete - see comments in HIVE-15212 regarding
IOW and multi IOW
> Miscellaneous List
> ------------------
>
> Key: HIVE-17691
> URL: https://issues.apache.org/jira/browse/HIVE-17691
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Reporter: Eugene Koifman
>
> # DDLSemanticAnalyzer.alterTableOutput is unused
> # DDLTask.generateAddMmTasks(Table) - stmtId should probably come from
> TransactionManager
> # DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId =
> crtTbl.getInitialMmWriteId();_ - the logic is unclear. This ID is only set in
> one place.
> # FileSinkOperator has multiple places that look like _conf.getWriteType() ==
> AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for
> MM tables? Seems that Wei opted for "work.getLoadTableWork().getWriteType()
> != AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g.
> MoveTask.handleStaticParts() call to Hive.loadPartition()
> # HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is
> obsolete
> # Compactor Initiator likely doesn't work for MM tables. It's triggered by
> info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS. MM tables don't write to
> either because DbTxnManager.acquireLocks() does
> _compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM as
> non-acid tables
> # In general integration with full Acid seems confused wrt MM and seems to
> treat MM as special table type rather than subtype of Acid table. (mostly,
> but not always).
> ## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator
> input, QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_
> ## _SemanticAnalyzer.validate()_ has _if (tbl != null &&
> (AcidUtils.isFullAcidTable(tbl) ||
> MetaStoreUtils.isInsertOnlyTable(tbl.getParameters()))) {_
> # LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather
> than from TM
> # ImportCommitTask - doesn't currently do anything. It used to commit mmID.
> Need to verify we properly commit the txn in the Driver
> # As far as I can tell all the mm_*.q tests run on TestCliDriver which means
> MR. This doesn't exercise some code specifically for dealing with writes
> from Union All queries (CTAS, Insert into). On MR this requires
> "hive.optimize.union.remove=true" (false by default)
> # Remove MoveWork().setNoop(boolean) and usages per todo in
> _GenMapRedUtils.createMRWorkForMergingFiles (FileSinkOperator fsInput, Path
> finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>>
> mvTasks, HiveConf conf, Task<? extends Serializable> currTask)_
> # PartialScanWork.tblDesc - unused
> # _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes
> assumptions that won't work with MM tables, unions, etc.". File Jira?
> # _PartitionDesc.LOG_ is unused
> # Insert Overwrite for MM is incomplete - see comments in HIVE-15212
> regarding IOW and multi IOW
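
To make the "MM as a subtype of Acid table" point concrete, here is a minimal
sketch of a single classification helper that callers could branch on instead
of combining isFullAcidTable()/isMmTable()/isInsertOnlyTable() checks ad hoc.
This is not Hive code: the class AcidKindSketch, the enum Kind and all method
names are hypothetical. It only assumes the table parameters
"transactional" and "transactional_properties" (with value "insert_only" for
MM tables), and treats any other transactional table as full Acid.

```java
import java.util.Map;

/** Hypothetical helper, not part of Hive: classifies a table by its parameters. */
final class AcidKindSketch {

  /** INSERT_ONLY (MM) and FULL_ACID are both transactional; NON_ACID is not. */
  enum Kind { NON_ACID, INSERT_ONLY, FULL_ACID }

  private AcidKindSketch() {
  }

  /** Assumption: a transactional table without "insert_only" is full Acid. */
  static Kind classify(Map<String, String> params) {
    if (params == null || !"true".equalsIgnoreCase(params.get("transactional"))) {
      return Kind.NON_ACID;
    }
    String props = params.get("transactional_properties");
    return "insert_only".equalsIgnoreCase(props) ? Kind.INSERT_ONLY : Kind.FULL_ACID;
  }

  /** True for full Acid and MM alike, i.e. MM is a subtype of Acid. */
  static boolean isTransactional(Map<String, String> params) {
    return classify(params) != Kind.NON_ACID;
  }

  /** True only for full Acid tables; MM tables are excluded. */
  static boolean isFullAcid(Map<String, String> params) {
    return classify(params) == Kind.FULL_ACID;
  }

  /** True only for MM (insert-only) tables. */
  static boolean isInsertOnly(Map<String, String> params) {
    return classify(params) == Kind.INSERT_ONLY;
  }
}
```

With a helper of this shape, code that must record txn components (such as
the setIsAcid(...) call mentioned above) would ask isTransactional(), while
code that depends on the full Acid file layout would ask isFullAcid(). Whether
that is the right split for each call site still needs to be checked case by
case.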