[ https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720714#comment-16720714 ]
Eugene Koifman commented on HIVE-20941:
---------------------------------------
Notes for myself:
{{AcidUtils.getAcidState()}} has
{code:java}
else if (prev != null && next.maxWriteId == prev.maxWriteId
    && next.minWriteId == prev.minWriteId
    && next.statementId == prev.statementId) {
  // The 'next' parsedDelta may have everything equal to the 'prev' parsedDelta, except
  // the path. This may happen when we have split update and we have two types of delta
  // directories- 'delta_x_y' and 'delete_delta_x_y' for the SAME txn range.
  // Also note that any delete_deltas in between a given delta_x_y range would be made
  // obsolete. For example, a delta_30_50 would make delete_delta_40_40 obsolete.
  // This is valid because minor compaction always compacts the normal deltas and the delete
  // deltas for the same range. That is, if we had 3 directories, delta_30_30,
  // delete_delta_40_40 and delta_50_50, then running minor compaction would produce
  // delta_30_50 and delete_delta_30_50.
  deltas.add(next);
  prev = next;
}
{code}
{{AcidUtils.ParsedDelta.compareTo()}} sorts {{delta_x_y}} after
{{delete_delta_x_y}}.
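A minimal sketch of that ordering, to make the tie-break concrete. {{DeltaDirSketch}} is a hypothetical stand-in, not the Hive class: it drops statementId, uses unpadded directory names as in these notes, and assumes that for equal min write IDs the wider range sorts first. The only part taken from the note above is the last step, where equal ranges fall back to comparing names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for AcidUtils.ParsedDelta; NOT the Hive class.
final class DeltaDirSketch implements Comparable<DeltaDirSketch> {
  final long minWriteId;
  final long maxWriteId;
  final String dirName; // plays the role of the directory Path

  DeltaDirSketch(long minWriteId, long maxWriteId, boolean isDelete) {
    this.minWriteId = minWriteId;
    this.maxWriteId = maxWriteId;
    this.dirName = (isDelete ? "delete_delta_" : "delta_") + minWriteId + "_" + maxWriteId;
  }

  @Override
  public int compareTo(DeltaDirSketch other) {
    if (minWriteId != other.minWriteId) {
      return Long.compare(minWriteId, other.minWriteId);
    }
    if (maxWriteId != other.maxWriteId) {
      // Assumption: for the same min, the wider range sorts first.
      return Long.compare(other.maxWriteId, maxWriteId);
    }
    // Same range: compare names, so "delete_delta_..." sorts before "delta_...".
    return dirName.compareTo(other.dirName);
  }

  // Returns the directory names in sorted order without mutating the input.
  static List<String> sortedNames(List<DeltaDirSketch> dirs) {
    List<DeltaDirSketch> copy = new ArrayList<>(dirs);
    Collections.sort(copy);
    List<String> names = new ArrayList<>();
    for (DeltaDirSketch d : copy) {
      names.add(d.dirName);
    }
    return names;
  }

  public static void main(String[] args) {
    List<DeltaDirSketch> dirs = new ArrayList<>();
    dirs.add(new DeltaDirSketch(30, 30, false));
    dirs.add(new DeltaDirSketch(30, 30, true));
    System.out.println(sortedNames(dirs)); // [delete_delta_30_30, delta_30_30]
  }
}
```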
{{CompactorMR.run()}} calls {{getAcidState()}} and looks at all the deltas
(insert + delete) to find the min/max for the {{delta_min_max}} that it will
produce.
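As a rough illustration of the min/max step (not the CompactorMR code), the output range is just the envelope of every input range, with insert and delete deltas treated alike. {{CompactionRangeSketch}} and {{outputRange}} are made-up names; the input values are the ones from the code comment quoted above.

```java
// Hypothetical helper; NOT part of CompactorMR.
final class CompactionRangeSketch {
  // Each entry is {minWriteId, maxWriteId} for one delta or delete_delta directory.
  static long[] outputRange(long[][] deltaRanges) {
    if (deltaRanges.length == 0) {
      throw new IllegalArgumentException("no deltas to compact");
    }
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    for (long[] range : deltaRanges) {
      min = Math.min(min, range[0]);
      max = Math.max(max, range[1]);
    }
    return new long[] {min, max};
  }

  public static void main(String[] args) {
    // delta_30_30, delete_delta_40_40, delta_50_50
    long[] r = outputRange(new long[][] {{30, 30}, {40, 40}, {50, 50}});
    System.out.println("delta_" + r[0] + "_" + r[1]); // delta_30_50
  }
}
```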
{{CompactorMap.map()}} feeds all delta dir Paths to {{OrcRawRecordMerger}},
which does a multiway merge to output a single stream of events, each either
an Insert or a Delete. {{map()}} then splits that stream in two according to
event type.
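The merge-then-split shape can be sketched with a heap-based k-way merge. This is only an illustration under simplifying assumptions: {{MergeSplitSketch}}, {{Event}} and {{Cursor}} are hypothetical, and a single long {{key}} stands in for whatever record key the real merger orders by.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; NOT OrcRawRecordMerger or CompactorMap.
final class MergeSplitSketch {
  static final class Event {
    final long key;         // assumed sort key, simplified to one long
    final boolean isDelete; // true for a Delete event, false for an Insert
    Event(long key, boolean isDelete) {
      this.key = key;
      this.isDelete = isDelete;
    }
  }

  // Read position within one already-sorted input (one delta directory).
  static final class Cursor {
    final List<Event> events;
    int pos;
    Cursor(List<Event> events) { this.events = events; }
    Event head() { return events.get(pos); }
  }

  // K-way merge: repeatedly take the smallest head among all inputs.
  static List<Event> merge(List<List<Event>> inputs) {
    PriorityQueue<Cursor> heap =
        new PriorityQueue<>(Comparator.comparingLong((Cursor c) -> c.head().key));
    for (List<Event> input : inputs) {
      if (!input.isEmpty()) {
        heap.add(new Cursor(input));
      }
    }
    List<Event> merged = new ArrayList<>();
    while (!heap.isEmpty()) {
      Cursor c = heap.poll();
      merged.add(c.head());
      c.pos++;
      if (c.pos < c.events.size()) {
        heap.add(c); // re-insert with its new head
      }
    }
    return merged;
  }

  // Route each merged event to the insert or the delete output.
  static void split(List<Event> merged, List<Event> inserts, List<Event> deletes) {
    for (Event e : merged) {
      (e.isDelete ? deletes : inserts).add(e);
    }
  }
}
```

If none of the inputs contains a Delete event, {{deletes}} stays empty after {{split()}}, which is the case this issue is about.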
So the invariant remains the same: for any given x, y we can have
{{delta_x_y}}, or ({{delta_x_y}} and {{delete_delta_x_y}}), or
{{delete_delta_x_y}}, just like before this change.
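The three shapes can be enumerated with a small sketch. It assumes, hypothetically, that an output directory is written only when at least one event of that type was seen; {{OutputDirsSketch}} and {{outputDirs}} are illustrative names, not Hive code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the invariant; NOT Hive code.
final class OutputDirsSketch {
  // Assumption: a directory is produced only if an event of its type was seen.
  static List<String> outputDirs(long x, long y, boolean sawInsert, boolean sawDelete) {
    List<String> dirs = new ArrayList<>();
    if (sawInsert) {
      dirs.add("delta_" + x + "_" + y);
    }
    if (sawDelete) {
      dirs.add("delete_delta_" + x + "_" + y);
    }
    return dirs;
  }

  public static void main(String[] args) {
    System.out.println(outputDirs(30, 50, true, false));  // [delta_30_50]
    System.out.println(outputDirs(30, 50, true, true));   // [delta_30_50, delete_delta_30_50]
    System.out.println(outputDirs(30, 50, false, true));  // [delete_delta_30_50]
  }
}
```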
I tweaked the text of a comment, so attaching patch 6 for completeness.
> Compactor produces a delete_delta_x_y even if there are no input delete events
> ------------------------------------------------------------------------------
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 3.0.0
> Reporter: Eugene Koifman
> Assignee: Igor Kryvenko
> Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch,
> HIVE-20941.03.patch, HIVE-20941.04.patch, HIVE-20941.05.patch,
> HIVE-20941.06.patch
>
>
> see example in HIVE-20901
>
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete
> event writer