Hi,

We are seeing a performance issue on one of our write-heavy clusters and are trying to find the root cause. While investigating, I came across the following class comment in ReplicationSink.java, which seems wrong to me:
/**
 * This class is responsible for replicating the edits coming
 * from another cluster.
 * <p/>
 * This replication process is currently waiting for the edits to be applied
 * before the method can return. This means that the replication of edits
 * is synchronized (after reading from HLogs in ReplicationSource) and that a
 * single region server cannot receive edits from two sources at the same time
 * <p/>
 * This class uses the native HBase client in order to replicate entries.
 * <p/>
 */

I think replicateLogEntries() is a public API provided by HRegionServer. If two sources pick the same sink and send their requests at the same time, each request should be dequeued by a free handler thread on that RegionServer and processed in parallel. How, then, can the behavior stated in the comment be achieved? Am I missing something here?

In summary, two questions:

1. How can it prevent two sources from invoking replicateLogEntries() at the same time?
2. If the author really did intend to prevent two sources from invoking replicateLogEntries() at the same time, what is the concern? I think that with a timestamp in each Put, the order should not matter.

Thanks,
Tian-Ying
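
P.S. To make the timestamp argument in question 2 concrete, here is a minimal sketch of what I mean. It is not HBase code: the Edit class and applyAndRead() are hypothetical names invented for this illustration, and a TreeMap stands in for the versions of a single cell. It also assumes only puts with distinct timestamps; it says nothing about deletes or about two edits carrying the same timestamp, where arrival order might still matter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

public class TimestampOrderSketch {

    // Hypothetical stand-in for one replicated put: a value with an explicit timestamp.
    public static final class Edit {
        final long timestamp;
        final String value;

        public Edit(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Applies the edits in arrival order to one versioned cell (timestamp -> value)
    // and returns the value with the highest timestamp, i.e. what a latest-version
    // read would return in this simplified model.
    public static String applyAndRead(List<Edit> arrivalOrder) {
        TreeMap<Long, String> versions = new TreeMap<>();
        for (Edit e : arrivalOrder) {
            versions.put(e.timestamp, e.value);
        }
        return versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        Edit older = new Edit(100L, "from-source-A");
        Edit newer = new Edit(200L, "from-source-B");

        // The same two edits, arriving at the sink in opposite orders.
        System.out.println(applyAndRead(Arrays.asList(older, newer)));
        System.out.println(applyAndRead(Arrays.asList(newer, older)));
    }
}
```

Both calls return the value with the higher timestamp, regardless of which edit arrives first, which is why I would not expect the sink to need to serialize requests from different sources for ordering reasons alone.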
