[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15995:
---------------------------
       Resolution: Fixed
    Fix Version/s: 1.4.0
           Status: Resolved  (was: Patch Available)

I ran test suite with 15995.branch-1.v7.patch which passed.

Thanks Vincent for the patch.

> Separate replication WAL reading from shipping
> ----------------------------------------------
>
>                 Key: HBASE-15995
>                 URL: https://issues.apache.org/jira/browse/HBASE-15995
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Replication
>    Affects Versions: 2.0.0
>            Reporter: Vincent Poon
>            Assignee: Vincent Poon
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-15995.branch-1.v7.patch,
> HBASE-15995.master.v1.patch, HBASE-15995.master.v2.patch,
> HBASE-15995.master.v3.patch, HBASE-15995.master.v4.patch,
> HBASE-15995.master.v6.patch, HBASE-15995.master.v7.patch,
> replicationV1_100ms_delay.png, replicationV2_100ms_delay.png
>
>
> Currently ReplicationSource reads edits from the WAL and ships them in the
> same thread.
> By breaking out the reading from the shipping, we can introduce greater
> parallelism and lay the foundation for further refactoring to a pipelined,
> streaming model.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15995:
---------------------------
    Status: Patch Available  (was: Reopened)
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.branch-1.v7.patch

Here's a branch-1 patch. Compatibility checker didn't show any problems.
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15995:
---------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Thanks for the patch, Vincent.

Thanks all for the reviews.
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v7.patch

rebase
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v6.patch

fixed the rebase. Was supposed to accept "theirs" not "ours" on TestReplicationWALReaderManager.java
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment:     (was: HBASE-15995.master.v6.patch)
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v6.patch

rebased
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment:     (was: HBASE-15995.master.v6.patch)
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v6.patch

Fixed flapping TestSerialReplication
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v4.patch
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v3.patch
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment:     (was: HBASE-17328-master.v2.patch)
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v2.patch
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-17328-master.v2.patch
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: replicationV2_100ms_delay.png
                replicationV1_100ms_delay.png

Example of network usage. X-axis is time and Y-axis is network usage.
The patch keeps the network utilization higher while replicating. V1 has wider gaps while reading/filtering the next batch.
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v1.patch

This patch does two things:

1) Puts replication reading into a separate thread. This is done by ReplicationWALEntryBatcher, which reads entries and puts them onto a queue. ReplicationSourceWorkerThread is reduced to simply reading batches off the queue and shipping them.

2) Puts the actual WAL entry reading logic in a WALEntryStream class. This implements iterator. Eventually when we have a way to stream over the network, we can get rid of the batcher above and simplify to something like
{code}
while(entryStream.hasNext()) {
  shipEntry(entryStream.next());
}
{code}

I tried to keep the rest of the logic the same as what currently exists. We could put ReplicationSource into another class ReplicationSourceV2 if so desired.

I believe all replication tests pass except TestGlobalThrottler. This is because one thread is currently reading a batch, and the other thread is shipping the last batch, so even if your queue holds only 1 batch, you're using double the memory. (If I modify the test to double the threshold, it passes)

I've done performance testing by setting up a single standalone region server shipping to a remote cluster, and then running PerformanceEvaluation to generate 3gb of data. The amount of time for replication to catch up:
ReplicationSourceV1 - 190s (source.size.capacity of 64mb)
ReplicationSourceV2 - 160s (source.size.capacity of 32mb, with queue size of 1 so that the max memory used should be 64mb)

There's better performance in situations where reading or filtering entries is more expensive (e.g. contention for disk/cpu). For example, I tried introducing a 100ms delay in a custom entry filter.
ReplicationSourceV1 - 366s
ReplicationSourceV2 - 236s
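
The reader/shipper split described in the v1 patch comment can be illustrated with a small standalone sketch. This is not the code from the patch: the class name `ReplicationPipelineSketch`, the `replicate` method, the `END` marker, and the use of plain strings as stand-ins for WAL entries are all assumptions made for illustration. It only shows the general shape: one thread batches entries from an iterator onto a bounded queue, and a second thread takes batches off the queue and "ships" them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ReplicationPipelineSketch {

    // Hypothetical end-of-stream marker; compared by identity, never shipped.
    private static final List<String> END = new ArrayList<>();

    /** Batches entries on a reader thread and ships them on a separate thread. */
    public static List<String> replicate(Iterator<String> entryStream,
                                         int batchSize, int queueCapacity) {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(queueCapacity);
        List<String> shipped = new ArrayList<>();

        // Reader: plays the role the comment assigns to the batcher.
        Thread reader = new Thread(() -> {
            try {
                List<String> batch = new ArrayList<>();
                while (entryStream.hasNext()) {
                    batch.add(entryStream.next());
                    if (batch.size() == batchSize) {
                        queue.put(batch);          // blocks while the queue is full
                        batch = new ArrayList<>();
                    }
                }
                if (!batch.isEmpty()) {
                    queue.put(batch);              // flush the final partial batch
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "wal-reader");

        // Shipper: only takes batches off the queue and sends them.
        Thread shipper = new Thread(() -> {
            try {
                for (List<String> b = queue.take(); b != END; b = queue.take()) {
                    shipped.addAll(b);             // stand-in for the network call
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "wal-shipper");

        reader.start();
        shipper.start();
        try {
            reader.join();
            shipper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while replicating", e);
        }
        return shipped;
    }

    public static void main(String[] args) {
        List<String> edits = Arrays.asList("e1", "e2", "e3", "e4", "e5");
        System.out.println(replicate(edits.iterator(), 2, 1));
    }
}
```

Because the reader keeps building the next batch while the shipper is still sending the previous one, more than one batch is held in memory at a time even with a queue capacity of 1, which is the effect the comment reports for TestGlobalThrottler.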