Sergey Shelukhin commented on HIVE-19124:

For the compactor specifically, the issue could probably be addressed by:
1) Modifying the txn list in recordValidWriteIds in Driver based on a flag. We 
create the driver, so the flag can be set directly; no shenanigans necessary. 
Compactor write IDs can be created and serialized so that the query reads only 
the data we want.
2) When renaming the directory, generating the final name in the commit method, 
AFTER the query, based on the write IDs that the driver actually used.
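
The two steps above could look roughly like the sketch below. It is plain Java 
with no Hive dependencies, and every name in it (CompactorCommitSketch, 
UsedWriteIds, recordWriteIds, finalDirName) is hypothetical, not the actual 
Driver or compactor API. The zero-padded "base_" directory name is likewise an 
assumption about the on-disk layout, used only for illustration.

```java
import java.util.Locale;

/** Hypothetical sketch of the proposed flow; none of these names come from Hive. */
public class CompactorCommitSketch {

    /** Stand-in for the write IDs that the driver actually used for the query. */
    public static final class UsedWriteIds {
        public final long highWatermark;

        public UsedWriteIds(long highWatermark) {
            this.highWatermark = highWatermark;
        }
    }

    /**
     * Step 1: when the (assumed) compactor flag is set, use the write IDs the
     * compactor prepared instead of the ones from the regular snapshot.
     */
    public static UsedWriteIds recordWriteIds(boolean isCompactor,
                                              long compactorHighWatermark,
                                              long snapshotHighWatermark) {
        return new UsedWriteIds(isCompactor ? compactorHighWatermark : snapshotHighWatermark);
    }

    /**
     * Step 2: build the final directory name AFTER the query has run, from the
     * write IDs that were actually used. The "base_%07d" pattern is an assumption.
     */
    public static String finalDirName(UsedWriteIds used) {
        return String.format(Locale.ROOT, "base_%07d", used.highWatermark);
    }

    public static void main(String[] args) {
        // The compactor sets the flag, so its own high watermark (42) wins over 57.
        UsedWriteIds used = recordWriteIds(true, 42L, 57L);
        System.out.println(finalDirName(used)); // prints "base_0000042"
    }
}
```

The point of the sketch is the ordering: the name is derived in the commit step 
from what the query really read, rather than being fixed before the query runs.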

That way we don't need any UDFs or INPUT_FILE_NAME stuff at all, and it will 
just work.
I'm not sure I'll have enough time to finish this today, and I'm out next week, 
but I'll attach a WIP patch.

> implement a basic major compactor for MM tables
> -----------------------------------------------
>                 Key: HIVE-19124
>                 URL: https://issues.apache.org/jira/browse/HIVE-19124
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Major
>              Labels: mm-gap-2
>         Attachments: HIVE-19124.01.patch, HIVE-19124.patch
>
> For now, it will run a query directly and only major compactions will be 
> supported.

This message was sent by Atlassian JIRA
