[ 
https://issues.apache.org/jira/browse/HIVE-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21109:
----------------------------------
        Labels:   (was: pull-request-available)
    Attachment: HIVE-21109.06.patch
        Status: Patch Available  (was: In Progress)

The patch has code to replicate statistics for a table migrated to a 
transactional table.

The test that failed in the last ptest run passed locally for me.

Here's a short description of all the changes made for replicating stats for 
ACID tables.

During bootstrap, we use a method similar to the one for non-ACID tables to 
transfer statistics of an ACID table from source to replica. However, 
installing statistics of an ACID table requires a valid writeId and a valid 
writeId list. We use the table/partition's latest writeId and a valid writeId 
list containing only that writeId to install the statistics in the metastore. 
For a table migrated to a transactional table, we use the default bootstrap 
writeId to install statistics, if any.
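
To make the bootstrap rule above concrete, here is a minimal sketch in Java. 
It is only an illustration under stated assumptions, not the code in the 
patch: SingleWriteIdList, BootstrapStatsSketch and 
ASSUMED_DEFAULT_BOOTSTRAP_WRITE_ID are hypothetical names, and the value 1 
used for the default bootstrap writeId is a placeholder, since this message 
does not state the real value.

```java
// Hypothetical, simplified model; these are not Hive's actual classes.
final class SingleWriteIdList {
    private final String fullTableName;
    private final long validWriteId;

    SingleWriteIdList(String fullTableName, long validWriteId) {
        this.fullTableName = fullTableName;
        this.validWriteId = validWriteId;
    }

    /** Only the single recorded writeId is treated as valid. */
    boolean isWriteIdValid(long writeId) {
        return writeId == validWriteId;
    }

    String getFullTableName() {
        return fullTableName;
    }

    long getValidWriteId() {
        return validWriteId;
    }
}

final class BootstrapStatsSketch {
    // Assumed placeholder; the real default bootstrap writeId is not stated
    // in this message.
    static final long ASSUMED_DEFAULT_BOOTSTRAP_WRITE_ID = 1L;

    /**
     * Picks the writeId used to install statistics during bootstrap:
     * the table/partition's latest writeId for a regular ACID table, or the
     * default bootstrap writeId for a table migrated to a transactional one.
     */
    static SingleWriteIdList writeIdListFor(String fullTableName,
                                            long latestWriteId,
                                            boolean migratedToAcid) {
        long writeId = migratedToAcid
                ? ASSUMED_DEFAULT_BOOTSTRAP_WRITE_ID
                : latestWriteId;
        return new SingleWriteIdList(fullTableName, writeId);
    }
}
```

With this model, writeIdListFor("db.tbl", 7, false) yields a list in which 
only writeId 7 is valid, while the migrated case ignores the latest writeId 
and uses the assumed default instead.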

During incremental replication, the writeId is obtained from the UpdateStats 
event, and a valid writeId list with that writeId marked as valid is used to 
install the column statistics. Table-level statistics are replicated by 
replaying the corresponding ALTER_TABLE/ALTER_PARTITION event. For a table 
migrated to a transactional table, we open a migration transaction and use the 
corresponding writeId for installing column statistics.

Further, this commit has the following related changes.

1. The table or the partition associated with the commit transaction event 
should already have been created when replaying the corresponding events that 
precede the commit transaction event. Thus there is no need to add tasks for 
creating the table or the partition.

2. Maintain a list of open replicated transactions and use that to create the 
valid transactions list when replaying a replicated event.
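
As a rough illustration of change 2 above, the following Java sketch tracks 
open replicated transactions and derives a valid transactions list from 
them. It is one plausible reading, not the patch's implementation: 
ReplTxnTracker and Snapshot are hypothetical names, and the rule that a 
transaction is valid when it is at or below the highest transaction seen and 
is not currently open is an assumption made for the example.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model; not Hive's actual transaction manager classes.
final class ReplTxnTracker {
    private final SortedSet<Long> openTxns = new TreeSet<>();
    private long highWatermark = 0L;

    /** Records a replicated transaction as open. */
    void open(long txnId) {
        openTxns.add(txnId);
        highWatermark = Math.max(highWatermark, txnId);
    }

    /** Removes a replicated transaction once it commits or aborts. */
    void close(long txnId) {
        openTxns.remove(txnId);
    }

    /** Builds an immutable valid transactions list from the current state. */
    Snapshot snapshot() {
        return new Snapshot(highWatermark, new TreeSet<>(openTxns));
    }

    static final class Snapshot {
        private final long highWatermark;
        private final SortedSet<Long> open;

        Snapshot(long highWatermark, SortedSet<Long> open) {
            this.highWatermark = highWatermark;
            this.open = Collections.unmodifiableSortedSet(open);
        }

        /** Assumed rule: valid if already seen and no longer open. */
        boolean isTxnValid(long txnId) {
            return txnId <= highWatermark && !open.contains(txnId);
        }

        SortedSet<Long> getOpenTxns() {
            return open;
        }
    }
}
```

The snapshot copies the open set, so a list created for one replayed event 
is not affected by transactions that are opened or closed afterwards.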

> Stats replication for ACID tables.
> ----------------------------------
>
>                 Key: HIVE-21109
>                 URL: https://issues.apache.org/jira/browse/HIVE-21109
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>         Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeId associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
