[jira] [Updated] (HIVE-17691) Miscellaneous List

Eugene Koifman (JIRA) Wed, 11 Oct 2017 13:28:15 -0700

     [ https://issues.apache.org/jira/browse/HIVE-17691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Eugene Koifman updated HIVE-17691:
----------------------------------
    Description: 
# DDLSemanticAnalyzer.alterTableOutput is unused
# DDLTask.generateAddMmTasks(Table) - stmtId should probably come from 
TransactionManager
# DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId = 
crtTbl.getInitialMmWriteId();_ - the logic is unclear; this ID is only set in 
one place
# FileSinkOperator has multiple places that look like _conf.getWriteType() == 
AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for 
MM tables?  Seems that Wei opted for "work.getLoadTableWork().getWriteType() != 
AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g. 
MoveTask.handleStaticParts() call to Hive.loadPartition()
# HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is 
obsolete
# Compactor Initiator likely doesn't work for MM tables.  It's triggered by 
info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS.  MM tables don't write to 
either because DbTxnManager.acquireLocks() does 
_compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM as 
non-acid tables
# In general, integration with full Acid seems confused wrt MM: it mostly (but 
not always) treats MM as a special table type rather than a subtype of Acid 
table.
## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input, 
QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_ 
##  _SemanticAnalyzer.validate()_ has _if (tbl != null && 
(AcidUtils.isFullAcidTable(tbl) || 
MetaStoreUtils.isInsertOnlyTable(tbl.getParameters()))) {_
# LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather 
than getting it from TM
# ImportCommitTask - doesn't currently do anything.  It used to commit mmID.  
Need to verify we properly commit the txn in the Driver
# As far as I can tell all the mm_*.q tests run on TestCliDriver which means 
MR.  This doesn't exercise some code specifically for dealing with writes from 
Union All queries (CTAS, Insert into).  On MR this requires 
"hive.optimize.union.remove=true" (false by default)
# Remove MoveWork().setNoop(boolean) and usages per todo in 
_GenMapRedUtils.createMRWorkForMergingFiles (FileSinkOperator fsInput, Path 
finalName, DependencyCollectionTask dependencyTask,   List<Task<MoveWork>> 
mvTasks, HiveConf conf,   Task<? extends Serializable> currTask)_
# PartialScanWork.tblDesc - unused
# _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes assumptions 
that won't work with MM tables, unions, etc.".  File Jira?
# _PartitionDesc.LOG_ is unused
# Insert Overwrite for MM is incomplete - see comments in HIVE-15212 regarding 
IOW and multi IOW
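
To make the FileSinkOperator/MoveTask question above concrete, here is a 
minimal standalone sketch (not Hive code).  WriteType is a stand-in for 
AcidUtils.Operation, the boolean stands in for isMmTable(), and the class and 
method names are made up for illustration.  It only enumerates the two quoted 
predicates.  By De Morgan the second is the exact negation of the first, so as 
written it cannot be true when isMmTable is true, whatever writeType MM tables 
turn out to carry.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: stand-ins for the Hive types named in the list above.
final class WritePredicates {
    // Assumed to mirror the values of AcidUtils.Operation; not the real enum.
    enum WriteType { NOT_ACID, INSERT, UPDATE, DELETE }

    // Shape of the check quoted from FileSinkOperator.
    static boolean fileSinkCheck(WriteType writeType, boolean isMmTable) {
        return writeType == WriteType.NOT_ACID || isMmTable;
    }

    // Shape of the check quoted from the MoveTask code path.
    static boolean moveTaskCheck(WriteType writeType, boolean isMmTable) {
        return writeType != WriteType.NOT_ACID && !isMmTable;
    }

    // Collects every input combination for which moveTaskCheck is true.
    static List<String> inputsWhereMoveTaskCheckHolds() {
        List<String> rows = new ArrayList<>();
        for (WriteType t : WriteType.values()) {
            for (boolean mm : new boolean[] {false, true}) {
                if (moveTaskCheck(t, mm)) {
                    rows.add(t + "/mm=" + mm);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Print the full truth table for both predicates.
        for (WriteType t : WriteType.values()) {
            for (boolean mm : new boolean[] {false, true}) {
                System.out.println(t + " mm=" + mm
                    + " fileSinkCheck=" + fileSinkCheck(t, mm)
                    + " moveTaskCheck=" + moveTaskCheck(t, mm));
            }
        }
    }
}
```

If MM writes are supposed to take the second branch, the condition would have 
to be something other than the plain negation; which form is right depends on 
the answer to the writeType question above.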

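For the Compactor Initiator and Acid-integration items above, a rough sketch of 
the "MM as a subtype of Acid" view.  It assumes MM (insert-only) tables are 
marked with transactional=true plus transactional_properties=insert_only.  The 
helpers below are simplified stand-ins written for this note, not the real 
AcidUtils/MetaStoreUtils methods.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: simplified stand-ins, not the Hive implementations.
final class TableKindSketch {
    // Builds a table-parameter map from alternating key/value strings.
    static Map<String, String> params(String... kv) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }

    // Any transactional table: full Acid or MM.
    static boolean isTransactional(Map<String, String> p) {
        return "true".equalsIgnoreCase(p.get("transactional"));
    }

    // MM: transactional and restricted to inserts.
    static boolean isInsertOnly(Map<String, String> p) {
        return isTransactional(p)
            && "insert_only".equalsIgnoreCase(p.get("transactional_properties"));
    }

    // Full Acid: transactional but not insert-only.
    static boolean isFullAcid(Map<String, String> p) {
        return isTransactional(p) && !isInsertOnly(p);
    }

    public static void main(String[] args) {
        Map<String, String> mm = params(
            "transactional", "true",
            "transactional_properties", "insert_only");
        // A flag derived from isFullAcid() comes out false for MM, while one
        // derived from isTransactional() would still treat MM as Acid.
        System.out.println("isFullAcid(mm)=" + isFullAcid(mm));
        System.out.println("isTransactional(mm)=" + isTransactional(mm));
    }
}
```

Under these assumptions, a flag set from the full-Acid check alone treats MM 
like a non-acid table, which matches the setIsAcid() observation above.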



  was:
# DDLSemanticAnalyzer.alterTableOutput is unused
# DDLTask.generateAddMmTasks(Table) - stmtId should probably come from 
TransactionManager
# DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId = 
crtTbl.getInitialMmWriteId();_ - the logic is unclear; this ID is only set in 
one place
# FileSinkOperator has multiple places that look like _conf.getWriteType() == 
AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for 
MM tables?  Seems that Wei opted for "work.getLoadTableWork().getWriteType() != 
AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g. 
MoveTask.handleStaticParts() call to Hive.loadPartition()
# HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is 
obsolete
# Compactor Initiator likely doesn't work for MM tables.  It's triggered by 
info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS.  MM tables don't write to 
either because DbTxnManager.acquireLocks() does 
_compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM as 
non-acid tables
# In general, integration with full Acid seems confused wrt MM: it mostly (but 
not always) treats MM as a special table type rather than a subtype of Acid 
table.
## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input, 
QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_ 
# LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather 
than getting it from TM
# ImportCommitTask - doesn't currently do anything.  It used to commit mmID.  
Need to verify we properly commit the txn in the Driver
# As far as I can tell all the mm_*.q tests run on TestCliDriver which means 
MR.  This doesn't exercise some code specifically for dealing with writes from 
Union All queries (CTAS, Insert into).  On MR this requires 
"hive.optimize.union.remove=true" (false by default)
# Remove MoveWork().setNoop(boolean) and usages per todo in 
_GenMapRedUtils.createMRWorkForMergingFiles (FileSinkOperator fsInput, Path 
finalName, DependencyCollectionTask dependencyTask,   List<Task<MoveWork>> 
mvTasks, HiveConf conf,   Task<? extends Serializable> currTask)_
# PartialScanWork.tblDesc - unused
# _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes assumptions 
that won't work with MM tables, unions, etc.".  File Jira?
# _PartitionDesc.LOG_ is unused
# Insert Overwrite for MM is incomplete - see comments in HIVE-15212 regarding 
IOW and multi IOW





> Miscellaneous List
> ------------------
>
>                 Key: HIVE-17691
>                 URL: https://issues.apache.org/jira/browse/HIVE-17691
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Eugene Koifman


