[ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720714#comment-16720714
 ] 

Eugene Koifman commented on HIVE-20941:
---------------------------------------

Notes for myself:

{{AcidUtils.getAcidState()}} has
{code:java}
else if (prev != null && next.maxWriteId == prev.maxWriteId
                  && next.minWriteId == prev.minWriteId
                  && next.statementId == prev.statementId) {
        // The 'next' parsedDelta may have everything equal to the 'prev' parsedDelta, except
        // the path. This may happen when we have split update and we have two types of delta
        // directories- 'delta_x_y' and 'delete_delta_x_y' for the SAME txn range.

        // Also note that any delete_deltas in between a given delta_x_y range would be made
        // obsolete. For example, a delta_30_50 would make delete_delta_40_40 obsolete.
        // This is valid because minor compaction always compacts the normal deltas and the delete
        // deltas for the same range. That is, if we had 3 directories, delta_30_30,
        // delete_delta_40_40 and delta_50_50, then running minor compaction would produce
        // delta_30_50 and delete_delta_30_50.

        deltas.add(next);
        prev = next;
      }
{code}
{{AcidUtils.ParsedDelta.compareTo()}} sorts delta_x_y after delete_delta_x_y
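
To make the ordering concrete for myself, here is a minimal sketch. It is not the real {{AcidUtils.ParsedDelta}}: the {{Delta}} class, its fields and the tie-breaking rules are assumptions for illustration. The only point it shows is that when min and max write IDs are equal and the comparison falls through to the directory name, "delete_delta_x_y" sorts before "delta_x_y" lexicographically.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for AcidUtils.ParsedDelta; not the Hive class.
final class DeltaOrderSketch {
    static final class Delta implements Comparable<Delta> {
        final long minWriteId;
        final long maxWriteId;
        final boolean isDeleteDelta;

        Delta(long minWriteId, long maxWriteId, boolean isDeleteDelta) {
            this.minWriteId = minWriteId;
            this.maxWriteId = maxWriteId;
            this.isDeleteDelta = isDeleteDelta;
        }

        String dirName() {
            return (isDeleteDelta ? "delete_delta_" : "delta_") + minWriteId + "_" + maxWriteId;
        }

        @Override
        public int compareTo(Delta other) {
            // Assumption: smaller minWriteId first.
            if (minWriteId != other.minWriteId) {
                return Long.compare(minWriteId, other.minWriteId);
            }
            // Assumption: for the same min, the wider range (larger max) first.
            if (maxWriteId != other.maxWriteId) {
                return Long.compare(other.maxWriteId, maxWriteId);
            }
            // Same range: fall back to the name, so "delete_delta_..." < "delta_...".
            return dirName().compareTo(other.dirName());
        }
    }

    // Returns the directory names in sorted order without mutating the input.
    static List<String> sortedNames(List<Delta> deltas) {
        List<Delta> copy = new ArrayList<>(deltas);
        Collections.sort(copy);
        List<String> names = new ArrayList<>();
        for (Delta d : copy) {
            names.add(d.dirName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<Delta> deltas = Arrays.asList(new Delta(5, 5, false), new Delta(5, 5, true));
        System.out.println(sortedNames(deltas)); // [delete_delta_5_5, delta_5_5]
    }
}
```

With this order, the {{else if}} branch quoted above sees the delete delta as 'prev' and the insert delta with the same range as 'next', and keeps both.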

{{CompactorMR.run()}} calls {{getAcidState()}} and looks at all the deltas
(insert + delete) to find the min/max for the {{delta_min_max}} that it will
produce.
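
A rough sketch of that min/max computation, using the example from the code comment above (delta_30_30, delete_delta_40_40, delta_50_50). The {{Delta}} class and the {{writeIdRange()}} helper are hypothetical names, not what {{CompactorMR}} actually contains; the sketch only shows that insert and delete deltas both contribute to the range.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the CompactorMR code.
final class CompactionRangeSketch {
    static final class Delta {
        final long minWriteId;
        final long maxWriteId;
        final boolean isDeleteDelta;

        Delta(long minWriteId, long maxWriteId, boolean isDeleteDelta) {
            this.minWriteId = minWriteId;
            this.maxWriteId = maxWriteId;
            this.isDeleteDelta = isDeleteDelta;
        }
    }

    // Returns {min, max} over ALL deltas, insert and delete alike.
    static long[] writeIdRange(List<Delta> deltas) {
        if (deltas.isEmpty()) {
            throw new IllegalArgumentException("no deltas to compact");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (Delta d : deltas) {
            min = Math.min(min, d.minWriteId);
            max = Math.max(max, d.maxWriteId);
        }
        return new long[] {min, max};
    }

    public static void main(String[] args) {
        List<Delta> deltas = Arrays.asList(
            new Delta(30, 30, false),  // delta_30_30
            new Delta(40, 40, true),   // delete_delta_40_40
            new Delta(50, 50, false)); // delta_50_50
        long[] range = writeIdRange(deltas);
        System.out.println("delta_" + range[0] + "_" + range[1]); // delta_30_50
    }
}
```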

{{CompactorMap.map()}} feeds all delta dir Paths to {{OrcRawRecordMerger}} 
which does a multiway merge to output a single stream of events that can be 
either Insert or Delete. {{map()}} then splits the stream into 2 according to 
this type.
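
The split can be pictured roughly as below. This is a sketch of the idea only, not the patch: {{Event}}, {{Kind}} and {{split()}} are made-up names, and the in-memory map stands in for the real ORC writers. The output for a type is created lazily, on the first event of that type, so an input with no delete events yields no {{delete_delta_x_y}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the map stands in for per-directory writers.
final class SplitEventsSketch {
    enum Kind { INSERT, DELETE }

    static final class Event {
        final Kind kind;
        final long rowId;

        Event(Kind kind, long rowId) {
            this.kind = kind;
            this.rowId = rowId;
        }
    }

    // Routes each merged event to delta_min_max or delete_delta_min_max.
    // An output directory appears only once an event of its type is seen.
    static Map<String, List<Event>> split(Iterable<Event> merged, long min, long max) {
        Map<String, List<Event>> outputs = new LinkedHashMap<>();
        for (Event e : merged) {
            String prefix = (e.kind == Kind.DELETE) ? "delete_delta_" : "delta_";
            String dir = prefix + min + "_" + max;
            outputs.computeIfAbsent(dir, k -> new ArrayList<>()).add(e);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<Event> insertsOnly = Arrays.asList(
            new Event(Kind.INSERT, 0), new Event(Kind.INSERT, 1));
        System.out.println(split(insertsOnly, 30, 50).keySet()); // [delta_30_50]
    }
}
```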

So the invariant remains the same: for any given x, y we can have
{{delta_x_y}}, or ({{delta_x_y}} and {{delete_delta_x_y}}), or
{{delete_delta_x_y}}, just like before this change.

I tweaked the text of a comment, so attaching patch 6 for completeness.

> Compactor produces a delete_delta_x_y even if there are no input delete events
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-20941
>                 URL: https://issues.apache.org/jira/browse/HIVE-20941
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Assignee: Igor Kryvenko
>            Priority: Major
>         Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch, 
> HIVE-20941.03.patch, HIVE-20941.04.patch, HIVE-20941.05.patch, 
> HIVE-20941.06.patch
>
>
> see example in HIVE-20901
>  
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete 
> event writer



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
