[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-02 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-18988.01-branch-3.patch, HIVE-18988.01.patch, 
> HIVE-18988.02.patch, HIVE-18988.03.patch, HIVE-18988.04.patch, 
> HIVE-18988.05.patch, HIVE-18988.06.patch, HIVE-18988.07.patch
>
>
> Bootstrapping ACID tables needs special handling to replicate a stable
> state of the data.
>  - If the ACID feature is enabled, perform the bootstrap dump for ACID tables
> within a read txn.
>  -> Dump table/partition metadata.
>  -> Get the list of valid data files for a table using the same logic as a
> read txn does.
>  -> Dump the latest ValidWriteIdList as per the current read txn.
>  - Set a valid last replication state such that it doesn't miss any open
> txn started after triggering the bootstrap dump.
>  - If any txns that were opened before triggering the bootstrap dump are
> still ongoing, it is not guaranteed that the open_txn event is captured for
> them. Also, if these txns were opened for streaming ingest, the dumped ACID
> table data may include data of open txns, which impacts snapshot isolation at
> the target. To avoid that, the bootstrap dump should wait for a timeout (new
> configuration: hive.repl.bootstrap.dump.open.txn.timeout). After the timeout,
> force-abort those txns and continue.
>  - If any force-aborted txn belongs to a streaming ingest case, the dumped
> ACID table data may include aborted data too. So it is necessary to replicate
> the aborted write ids to the target to mark that data invalid for any readers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-02 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-18988.01-branch-3.patch, HIVE-18988.01.patch, 
> HIVE-18988.02.patch, HIVE-18988.03.patch, HIVE-18988.04.patch, 
> HIVE-18988.05.patch, HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-02 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.01-branch-3.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-18988.01-branch-3.patch, HIVE-18988.01.patch, 
> HIVE-18988.02.patch, HIVE-18988.03.patch, HIVE-18988.04.patch, 
> HIVE-18988.05.patch, HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-02 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

Attached branch-3 patch for 3.0.0.

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-18988.01-branch-3.patch, HIVE-18988.01.patch, 
> HIVE-18988.02.patch, HIVE-18988.03.patch, HIVE-18988.04.patch, 
> HIVE-18988.05.patch, HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-02 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Fix Version/s: 3.0.0

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-01 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-01 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.07.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-01 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: (was: HIVE-18988.07.patch)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-01 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-30 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

Thanks for the review [~maheshk114] and [~thejas]!

Attached 07.patch after rebasing onto master.

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-30 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.07.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch, HIVE-18988.07.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-30 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.06.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

Attached 06.patch with fixes for review comments from [~maheshk114].

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch, 
> HIVE-18988.06.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-26 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

Added 05.patch, which
 * Changes the logic to trigger major compaction on tables/partitions with 
aborted data from the bootstrap load flow itself, instead of relying on the 
auto-trigger based on TXN_COMPONENTS.

 

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-26 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.05.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-26 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.04.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: (was: HIVE-18988.04.patch)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

Added 04.patch with
 * Logic to time out open txns that were opened before bootstrap was triggered.
 * Replication of the write ID state to the target, based on the 
ValidWriteIdList of each ACID/MM table being replicated.
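
To illustrate why the write ID state has to reach the target, here is a minimal Java sketch of a per-table write ID snapshot. It is a simplified, hypothetical model and not Hive's ValidWriteIdList class: the class name, fields, and isReadable method are assumptions made for illustration only.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Simplified, hypothetical model of a per-table write ID snapshot (not Hive's own class). */
final class WriteIdSnapshotSketch {
    private final long highWatermark;   // highest write ID visible to this snapshot
    private final Set<Long> abortedIds; // aborted write IDs at or below the watermark

    WriteIdSnapshotSketch(long highWatermark, Set<Long> abortedIds) {
        this.highWatermark = highWatermark;
        this.abortedIds = Collections.unmodifiableSet(new HashSet<>(abortedIds));
    }

    /** A reader on the target should only see data written under IDs that pass this check. */
    boolean isReadable(long writeId) {
        return writeId <= highWatermark && !abortedIds.contains(writeId);
    }

    /** The aborted IDs that would need to be replayed on the target. */
    Set<Long> abortedIds() {
        return abortedIds;
    }
}
```

If the aborted IDs were not carried over, data files written under them would look like committed data to readers on the target.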

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.04.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: (was: HIVE-18988.04.patch)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Description: 
Bootstrapping of ACID tables needs special handling to replicate a stable state 
of data.
 - If the ACID feature is enabled, perform the bootstrap dump for ACID tables 
within a read txn.
 -> Dump table/partition metadata.
 -> Get the list of valid data files for a table using the same logic as a read 
txn does.
 -> Dump the latest ValidWriteIdList as per the current read txn.
 - Set the valid last replication state such that it doesn't miss any open txn 
started after triggering the bootstrap dump.
 - If any txns opened before triggering the bootstrap dump are still on-going, 
then it is not guaranteed that the open_txn event was captured for these txns. 
Also, if these txns were opened for a streaming ingest case, then the dumped 
ACID table data may include data of open txns, which impacts snapshot isolation 
at the target. To avoid that, the bootstrap dump should wait for a timeout (new 
configuration: hive.repl.bootstrap.dump.open.txn.timeout). After the timeout, 
it force-aborts those txns and continues.
 - If any force-aborted txns belong to a streaming ingest case, then the dumped 
ACID table data may include aborted data too. So it is necessary to replicate 
the aborted write IDs to the target to mark that data invalid for any readers.

  was:
Bootstrapping of ACID tables needs special handling to replicate a stable state 
of data.
 - If the ACID feature is enabled, perform the bootstrap dump for ACID tables 
within a read txn.
 -> Dump table/partition metadata.
 -> Get the list of valid data files for a table using the same logic as a read 
txn does.
 -> Dump the latest ValidWriteIdList as per the current read txn.
 - Find the valid last replication state such that it points to the event ID of 
the open_txn event of the oldest on-going txn.
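
The wait-then-abort step in the updated description could be sketched roughly as follows. This is a minimal Java illustration, not the code in the patch: the TxnSource interface, its method names, and the polling approach are all assumptions, and timeoutMs merely stands in for whatever value hive.repl.bootstrap.dump.open.txn.timeout resolves to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical sketch of "wait for the timeout, then force abort"; not Hive's implementation. */
final class BootstrapOpenTxnWait {

    /** Hypothetical stand-in for the transaction manager; these method names are assumptions. */
    interface TxnSource {
        /** IDs of txns opened before the dump was triggered that are still open. */
        List<Long> openTxnsBeforeDump();

        void forceAbort(List<Long> txnIds);
    }

    /**
     * Polls until no pre-dump txns remain open or timeoutMs has elapsed, then
     * force-aborts whatever is left. Returns the aborted txn IDs (empty if none).
     */
    static List<Long> waitThenAbort(TxnSource txns, long timeoutMs, long pollMs, LongSupplier clockMs) {
        long deadline = clockMs.getAsLong() + timeoutMs;
        List<Long> open = txns.openTxnsBeforeDump();
        while (!open.isEmpty() && clockMs.getAsLong() < deadline) {
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
            open = txns.openTxnsBeforeDump();
        }
        if (!open.isEmpty()) {
            txns.forceAbort(open);
        }
        return new ArrayList<>(open);
    }
}
```

The returned IDs are the ones whose write IDs would then have to be replicated as aborted, per the last bullet of the description.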


> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.04.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch, HIVE-18988.04.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-18 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Target Version/s: 3.0.0

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-16 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

Added 03.patch after rebasing with master.

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch
>
>


[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.03.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch, 
> HIVE-18988.03.patch



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18988:

Fix Version/s: 3.1.0  (was: 3.0.0)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: Open)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.02.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Open  (was: Patch Available)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-18988.01.patch



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Status: Patch Available  (was: In Progress)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-18988.01.patch



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Attachment: HIVE-18988.01.patch

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-18988.01.patch



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-08 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-18988:

Labels: ACID DR pull-request-available replication  (was: ACID DR replication)

> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0



[jira] [Updated] (HIVE-18988) Support bootstrap replication of ACID tables

2018-04-05 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18988:

Description: 
Bootstrapping of ACID tables needs special handling to replicate a stable state
of the data.
 - If the ACID feature is enabled, perform the bootstrap dump for ACID tables
within a read txn.
 -> Dump table/partition metadata.
 -> Get the list of valid data files for a table using the same logic as a read
txn does.
 -> Dump the latest ValidWriteIdList as per the current read txn.
 - Find the valid last replication state such that it points to the event ID of
the open_txn event of the oldest ongoing txn.

  was:
Bootstrapping of ACID tables needs special handling to replicate a stable state
of the data.
 - If the ACID feature is enabled, perform the bootstrap dump for ACID tables
within a read txn.
-> Dump table/partition metadata.
-> Get the list of valid data files for a table using the same logic as a read
txn does.
-> Dump the latest valid table Write ID as per the current read txn.
 - Find the valid last replication state such that it points to the event ID of
the open_txn event of the oldest ongoing txn.


> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, replication
> Fix For: 3.0.0
