[ 
https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated HBASE-15995:
---------------------------------
    Attachment: HBASE-15995.master.v1.patch

This patch does two things:

1)  Moves replication WAL reading into a separate thread.  This is done by 
ReplicationWALEntryBatcher, which reads entries and puts batches of them onto 
a queue.  ReplicationSourceWorkerThread is reduced to taking batches off the 
queue and shipping them.

2)  Puts the actual WAL entry reading logic in a WALEntryStream class, which 
implements Iterator.  Eventually, once we have a way to stream over the 
network, we can get rid of the batcher above and simplify to something like
{code}
while(entryStream.hasNext()) {
  shipEntry(entryStream.next());
}
{code}
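
To make the split in (1) concrete, here is a minimal, self-contained sketch of 
the reader/shipper hand-off over a bounded queue.  It is not the code in the 
patch: the class name ReadShipPipelineSketch, the run() method, the END 
sentinel and the use of String entries are all hypothetical stand-ins, and 
"shipping" just collects entries into a list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; none of these names come from the patch.
public class ReadShipPipelineSketch {
    // Sentinel batch marking end of input (a real source would run until stopped).
    private static final List<String> END = new ArrayList<>();

    /** Reads on a separate thread, ships on the calling thread; returns shipped entries in order. */
    public static List<String> run(List<String> entries, int batchSize, int queueCapacity)
            throws InterruptedException {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(queueCapacity);

        Thread reader = new Thread(() -> {
            try {
                Iterator<String> stream = entries.iterator(); // stand-in for a WAL entry stream
                List<String> batch = new ArrayList<>();
                while (stream.hasNext()) {
                    batch.add(stream.next());
                    if (batch.size() == batchSize) {
                        queue.put(batch); // blocks while the queue is full
                        batch = new ArrayList<>();
                    }
                }
                if (!batch.isEmpty()) {
                    queue.put(batch); // flush the final partial batch
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reader");
        reader.start();

        // Shipper side: take batches off the queue and "ship" them.
        List<String> shipped = new ArrayList<>();
        while (true) {
            List<String> batch = queue.take();
            if (batch == END) {
                break;
            }
            shipped.addAll(batch);
        }
        reader.join();
        return shipped;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints [e1, e2, e3, e4, e5]
        System.out.println(run(List.of("e1", "e2", "e3", "e4", "e5"), 2, 1));
    }
}
```

Because the queue is bounded, the reader blocks in put() once it is full, so 
the number of batches held in memory is limited by the queue capacity plus 
whatever each thread is currently working on.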

I tried to keep the rest of the logic the same as what currently exists.  We 
could put the new implementation into a separate class, ReplicationSourceV2, 
if so desired.

I believe all replication tests pass except TestGlobalThrottler.  It fails 
because one thread is reading a batch while the other thread is shipping the 
previous batch, so even if the queue holds only 1 batch, twice the memory is 
in use.  (If I modify the test to double the threshold, it passes.)

I've done performance testing by setting up a single standalone region server 
shipping to a remote cluster, and then running PerformanceEvaluation to 
generate 3 GB of data.  The amount of time for replication to catch up:
ReplicationSourceV1    -    190s   (source.size.capacity of 64 MB)
ReplicationSourceV2    -    160s   (source.size.capacity of 32 MB, with a queue 
size of 1, so that the max memory used should be 64 MB)

The improvement is larger when reading or filtering entries is more expensive 
(e.g. contention for disk/CPU).  For example, I tried introducing a 100ms 
delay in a custom entry filter, and V2 caught up in roughly 36% less time 
(versus roughly 16% less in the run above):
ReplicationSourceV1  -  366s
ReplicationSourceV2  -  236s


> Separate replication WAL reading from shipping
> ----------------------------------------------
>
>                 Key: HBASE-15995
>                 URL: https://issues.apache.org/jira/browse/HBASE-15995
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Replication
>    Affects Versions: 2.0.0
>            Reporter: Vincent Poon
>            Assignee: Vincent Poon
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15995.master.v1.patch
>
>
> Currently ReplicationSource reads edits from the WAL and ships them in the 
> same thread.
> By breaking out the reading from the shipping, we can introduce greater 
> parallelism and lay the foundation for further refactoring to a pipelined, 
> streaming model.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
