[
https://issues.apache.org/jira/browse/HIVE-17674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190690#comment-16190690
]
Sergey Shelukhin commented on HIVE-17674:
-----------------------------------------
Removed some TODOs, fixed a couple, and marked others that are MM gaps as such.
Remaining TODOs relevant to after the merge (although those with a "?" are very
low pri, or just need to be verified):
{noformat}
$ git diff master | grep TODO | grep "^+" | grep -E "(--|MM gap)"
+ // TODO [MM gap?]: by design; no-one seems to use LB tables. They will
work, but not convert.
+ // TODO [MM gap?]: this doesn't work, however this is MR only.
+ // TODO [MM gap]: CTAS may currently be broken. It used to work. See the old
code, and why isCtas isn't used?
+ // TODO [MM gap]: for IOW, we also need to count in base dir, if any
+ // TODO [MM gap?]: bad merge; tblDesc is no longer CreateTableDesc, but
ImportTableDesc.
+-- TODO: doesn't work truncate table part_mm partition(key_mm=455);
+-- TODO: need to include merge+union+DP, but it's broken for now
{noformat}
In the following TODOs, a couple are from moved lines (i.e. they exist in
master), and the rest are for existing code. In the process of working on MM
tables, I saw all of the legacy code in Hive and I was horrified, but it
doesn't mean I'm going to fix it all...
{noformat}
$ git diff master | grep TODO | grep "^+" | grep -v -E "(--|MM gap)"
+ // TODO: why is it stored in both table and dpCtx?
+ // TODO: In a follow-up to HIVE-1361, we should refactor
loadDynamicPartitions
+ // TODO: not clear why two if conditions are different. Preserve the
existing logic for now.
+ // TODO: not clear why two if conditions are different. Preserve the
existing logic for now.
+ // TODO: see HIVE-14886 - removeTempOrDuplicateFiles is broken for list
bucketing,
+ // TODO: not having aliases for path usually means some bug. Should it
give up?
+ String lbDirSuffix = lbDirName.replace(partDirName, ""); // TODO: should it
rather do a prefix?
+ // TODO: we should really probably throw. Keep the existing logic for
now.
+ // TODO: why is this in text formatter?!!
+ // TODO: if we are not dealing with concatenate DDL, we should not create
a merge+move path
+ // TODO: wtf?!! why is this in this method? This has nothing to do with
anything.
+ // TODO: this code is buggy - it relies on having one file per bucket; no
MM support (by design).
+ // TODO: should this use getPartitionDescFromPathRecursively? That's
what other code uses.
+ // TODO: we should probably block all ACID tables here.
+ // TODO: the below seem like they should just be combined into partitionDesc
+ // TODO: could we instead get FS from path here and add normal files for
every UGI?
+ // TODO: move these test parameters to more specific places... there's no
need to have them here
{noformat}
> grep TODO HIVE-15212.17.patch |wc - l = 49
> ------------------------------------------
>
> Key: HIVE-17674
> URL: https://issues.apache.org/jira/browse/HIVE-17674
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Sergey Shelukhin
>
> grep TODO HIVE-15212.17.patch
> + // TODO The following two utility methods can be moved to AcidUtils once
> no class in metastore is relying on them,
> + // TODO: CTAS for MM may currently be broken. It used to work. See the old
> code and why isCtas isn't used?
> + // TODO: Setting these separately is a very hairy issue in certain
> combinations, since we
> + // TODO: this doesn't handle list bucketing properly. Does the original
> exim do that?
> + // TODO: support this? we only bail because it's a PITA and hardly
> anyone seems to care.
> + // TODO: support this?
> + // TODO: support this?
> + // TODO: why is it stored in both?
> + // TODO: this doesn't work... the path for writer and reader mismatch
> - // TODO: In a follow-up to HIVE-1361, we should refactor
> loadDynamicPartitions
> + // TODO: In a follow-up to HIVE-1361, we should refactor
> loadDynamicPartitions
> + // TODO: this particular bit will not work for MM tables, as there can
> be multiple
> + // TODO: not clear why two if conditions are different. Preserve the
> existing logic for now.
> + // TODO: not clear why two if conditions are different. Preserve the
> existing logic for now.
> + // TODO: not clear why two if conditions are different. Preserve the
> existing logic for now.
> + // TODO: not clear why two if conditions are different. Preserve the
> existing logic for now.
> + // TODO: do we also need to do this when creating an empty partition
> from select?
> + // TODO: see HIVE-14886 - removeTempOrDuplicateFiles is broken for list
> bucketing,
> + // TODO: for IOW, we also need to count in base dir, if any
> + // TODO: HiveFileFormatUtils.getPartitionDescFromPathRecursively for
> MM tables?
> + // TODO: not having aliases for path usually means some bug. Should
> it give up?
> updateMrWork(jobConf); // TODO: refactor this in HIVE-6366
> + // TODO: this assumes both paths are qualified; which they are,
> currently.
> + // TODO: just like the move path, we only do one level of recursion.
> + // TODO: may be broken? no LB bugs for now but if any are found.
> + String lbDirSuffix = lbDirName.replace(partDirName, ""); // TODO: wtf?
> + // TODO: we should really probably throw. Keep the existing logic
> for now.
> + // TODO: this assumes both paths are qualified; which they are,
> currently.
> + // TODO: we assume lbLevels is 0 here. Same as old code for non-MM.
> + // TODO: we could just find directories with any MM directories
> inside?
> // TODO: add test case and clean it up
> + // Duplicates logic in TextMetaDataFormatter TODO: wtf?!!
> + // TODO: why is this in text formatter? grrr
> + // TODO# noop MoveWork to avoid q file changes in HIVE-14990. Remove
> (w/the flag) after merge.
> + // TODO: this should never happen for mm tables.
> + // TODO: if we are not dealing with concatenate DDL, we should not
> create a merge+move path
> + // TODO: wtf? wtf?!! why is this in this method?
> + // TODO: this relies a lot on having one file per bucket. No support for
> MM tables for now.
> + // TODO: should this use getPartitionDescFromPathRecursively?
> + // TODO: we should probably block all ACID tables here.
> + @SuppressWarnings("unused") // TODO: wtf?
> + // TODO: due to the master merge, tblDesc is no longer CreateTableDesc,
> but ImportTableDesc
> + // TODO: ReplCopyTask is completely screwed. Need to support when
> it's not as screwed.
> + // TODO: ReplCopyTask is completely screwed. Need to support when
> it's not as screwed.
> // TODO: the below seems like they should just be combined into
> partitionDesc
> + // TODO: could we instead get FS from path here and add normal files for
> every UGI?
> +-- TODO: doesn't work truncate table part_mm partition(key_mm=455);
> +-- TODO: need to include merge+union+DP, but it's broken for now
> + // TODO: move these test parameters to more specific places... there's no
> need to have them here
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)