[jira] [Commented] (HIVE-20699) Query based compactor for full CRUD Acid tables

Vaibhav Gumashta (JIRA) Mon, 04 Feb 2019 15:08:48 -0800


    [ 
https://issues.apache.org/jira/browse/HIVE-20699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760289#comment-16760289
 ]


Vaibhav Gumashta commented on HIVE-20699:
-----------------------------------------

[~ekoifman] Thanks. I fixed the memory settings for the test case, so the 
memory estimation no longer returns negative values. I also removed the 
redundant imports.

> Query based compactor for full CRUD Acid tables
> -----------------------------------------------
>
>                 Key: HIVE-20699
>                 URL: https://issues.apache.org/jira/browse/HIVE-20699
>             Project: Hive
>          Issue Type: New Feature
>          Components: Transactions
>    Affects Versions: 3.1.0
>            Reporter: Eugene Koifman
>            Assignee: Vaibhav Gumashta
>            Priority: Major
>         Attachments: HIVE-20699.1.patch, HIVE-20699.1.patch, 
> HIVE-20699.10.patch, HIVE-20699.11.patch, HIVE-20699.2.patch, 
> HIVE-20699.3.patch, HIVE-20699.4.patch, HIVE-20699.5.patch, 
> HIVE-20699.6.patch, HIVE-20699.7.patch, HIVE-20699.8.patch, HIVE-20699.9.patch
>
>
> Currently the Acid compactor is implemented as a generated MR job 
> ({{CompactorMR.java}}).
> It could also be expressed as a Hive query that reads from a given partition 
> and writes the data back to the same partition.  This would merge the deltas 
> and 'apply' the delete events.  The simplest approach would be to just use 
> Insert Overwrite, but that would change all ROW__IDs, which we don't want.
> We need to implement this in a way that preserves ROW__IDs and creates a new 
> {{base_x}} directory to handle Major compaction.
> Minor compaction will be investigated separately.
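
To illustrate the merge semantics described in the issue, here is a minimal 
in-memory sketch in Python. It is not the Hive implementation and not the 
query the patch generates. The names RowId, Row and major_compact are 
hypothetical, and modeling a ROW__ID as a (write_id, bucket, row_id) triple 
is an assumption made for the example. It only shows the idea: rows from the 
base and the insert deltas are merged, rows targeted by delete events are 
dropped, and every surviving row keeps its original ROW__ID.

```python
from typing import Iterable, List, NamedTuple, Tuple


class RowId(NamedTuple):
    """Hypothetical stand-in for a ROW__ID (assumed triple, for illustration)."""
    write_id: int
    bucket: int
    row_id: int


class Row(NamedTuple):
    row_id: RowId
    data: Tuple


def major_compact(
    base: Iterable[Row],
    insert_deltas: Iterable[Iterable[Row]],
    delete_events: Iterable[RowId],
) -> List[Row]:
    """Merge base and insert deltas, apply deletes, keep ROW__IDs unchanged."""
    deleted = set(delete_events)
    # Start from the surviving rows of the existing base.
    merged = [row for row in base if row.row_id not in deleted]
    # Add the surviving rows of each insert delta.
    for delta in insert_deltas:
        merged.extend(row for row in delta if row.row_id not in deleted)
    # Order by ROW__ID; no row is renumbered, unlike a plain rewrite.
    merged.sort(key=lambda row: row.row_id)
    return merged


# Usage: one base, one insert delta, and one delete event.
base = [Row(RowId(1, 0, 0), ("a",)), Row(RowId(1, 0, 1), ("b",))]
delta = [Row(RowId(2, 0, 0), ("c",))]
deletes = [RowId(1, 0, 1)]
new_base = major_compact(base, [delta], deletes)
```

In this sketch new_base plays the role of the contents of the new base_x 
directory. A plain Insert Overwrite would instead assign fresh ROW__IDs to 
every row, which is the behavior the issue wants to avoid.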



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
