+1

Cheers,
Gopal


On 1/9/18, 10:55 PM, "Sankar Hariappan" <shariap...@hortonworks.com> wrote:

    Hi all,
    
    The “Hive Replication” feature is advancing to support ACID tables 
(HIVE-18320<https://issues.apache.org/jira/browse/HIVE-18320>).
    “Per Table Write ID” is an important requirement to support replication for 
ACID tables, especially for the use case of “Analytics workload off-loading for 
scalability”. Details are available in the design document attached to the JIRA.
    
    The per-table Write ID implementation involves several changes:
    
      1.  Add metadata tables to allocate and manage write IDs, and map each 
write ID to its global transaction.
      2.  Handle snapshot isolation for ACID/MM table reads by using 
ValidWriteIDList instead of ValidTxnList.
      3.  Modify ORC/Hive row readers to use ValidWriteIDList instead of 
ValidTxnList to read valid delta/base directories.
      4.  Update ValidCompactorTxnList to use table Write IDs.
      5.  Upgrade from existing Hive versions by migrating the ACID/MM tables 
to use Write ID instead of global transaction ID.
      6.  Correct the unit test scripts to use ValidWriteIDList instead of 
ValidTxnList for snapshot isolation tests.
      7.  Rename methods/variables in several classes to use WriteId 
instead of TxnId.
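
As a rough illustration of items 1 and 2 above, the following toy Python model shows one way to picture per-table write IDs that are mapped to global transactions, and a reader snapshot built from write IDs rather than transaction IDs. It is only a sketch: `WriteIdAllocator`, `allocate`, and `valid_write_ids` are hypothetical names, the in-memory dictionaries stand in for the metadata tables, and none of this reflects the actual Hive classes or schema.

```python
from collections import defaultdict


class WriteIdAllocator:
    """Toy in-memory stand-in for the write ID metadata tables (hypothetical)."""

    def __init__(self):
        # table name -> next write ID to hand out for that table
        self._next = defaultdict(lambda: 1)
        # (global txn ID, table name) -> write ID allocated to that txn
        self._txn_to_write = {}

    def allocate(self, txn_id, table):
        """Return the write ID of `txn_id` on `table`, allocating it if needed."""
        key = (txn_id, table)
        if key not in self._txn_to_write:
            self._txn_to_write[key] = self._next[table]
            self._next[table] += 1
        return self._txn_to_write[key]

    def valid_write_ids(self, table, committed_txns):
        """Write IDs on `table` whose owning global transaction has committed."""
        return sorted(
            write_id
            for (txn, tbl), write_id in self._txn_to_write.items()
            if tbl == table and txn in committed_txns
        )


alloc = WriteIdAllocator()
alloc.allocate(txn_id=100, table="db.t1")
alloc.allocate(txn_id=101, table="db.t2")
alloc.allocate(txn_id=102, table="db.t1")

# Txn 102 is still open, so its write ID on db.t1 is not in the snapshot.
snapshot = alloc.valid_write_ids("db.t1", committed_txns={100, 101})
```

In this model each table has its own write ID sequence, while the mapping back to the global transaction is what decides whether a given write ID is visible to a reader.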
    
    As part of HIVE-18192<https://issues.apache.org/jira/browse/HIVE-18192>, I 
have implemented the first 3 changes in the list, which make ACID reads/writes 
work with the Write ID change. However, this feature will be incomplete without 
the rest of the changes.
    
    Hence, I would like to create a branch (branch-per-table-writeid) from 
master to commit this feature in multiple patches. This branch is expected to 
be short-lived, lasting 2 to 3 weeks.
    
    I request feedback from the community.
    
    Best regards
    Sankar
    
    

