Rushabh Shah created HBASE-28184:
------------------------------------

             Summary: Tailing the WAL is very slow if there are multiple peers.
                 Key: HBASE-28184
                 URL: https://issues.apache.org/jira/browse/HBASE-28184
             Project: HBase
          Issue Type: Bug
          Components: Replication
    Affects Versions: 2.0.0
            Reporter: Rushabh Shah
            Assignee: Rushabh Shah


Noticed this in one of our production clusters, which has 4 peers.

Due to a sudden ingestion of data, the size of the log queue increased to a peak of 
506. We have configured the log roll size to 256 MB. Most of the edits in the WALs 
were from a table for which replication is disabled. 

So all the ReplicationSourceWALReader threads had to do was read through the WALs and 
NOT replicate the edits. Still, it took 12 hours to drain the queue.

Took a few jstacks and found that ReplicationSourceWALReader was waiting to 
acquire the rollWriterLock 
[here|https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java#L1231]
{noformat}
"regionserver/<rs>,1" #1036 daemon prio=5 os_prio=0 tid=0x00007f44b374e800 
nid=0xbd7f waiting on condition [0x00007f37b4d19000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00007f3897a3e150> (a 
java.util.concurrent.locks.ReentrantLock$FairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:872)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1202)
        at 
java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:228)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at 
org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.getLogFileSizeIfBeingWritten(AbstractFSWAL.java:1102)
        at 
org.apache.hadoop.hbase.wal.WALProvider.lambda$null$0(WALProvider.java:128)
        at 
org.apache.hadoop.hbase.wal.WALProvider$$Lambda$177/1119730685.apply(Unknown 
Source)
        at 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at 
java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
        at 
java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
        at 
java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
        at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
        at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at 
java.util.stream.ReferencePipeline.findAny(ReferencePipeline.java:536)
        at 
org.apache.hadoop.hbase.wal.WALProvider.lambda$getWALFileLengthProvider$2(WALProvider.java:129)
        at 
org.apache.hadoop.hbase.wal.WALProvider$$Lambda$140/1246380717.getLogFileSizeIfBeingWritten(Unknown
 Source)
        at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:260)
        at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:172)
        at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
        at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:222)
        at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:157)
{noformat}
All the peers contend for this lock during every batch read.
Look at the code snippet below. Guarding this section with rollWriterLock makes 
sense when we are replicating the active WAL file. But in our case we are NOT 
replicating the active WAL file, yet we still acquire the lock only to return 
OptionalLong.empty().
{noformat}
  /**
   * if the given {@code path} is being written currently, then return its 
length.
   * <p>
   * This is used by replication to prevent replicating unacked log entries. See
   * https://issues.apache.org/jira/browse/HBASE-14004 for more details.
   */
  @Override
  public OptionalLong getLogFileSizeIfBeingWritten(Path path) {
    rollWriterLock.lock();
    try {
       ...
       ...
    } finally {
      rollWriterLock.unlock();
    }
  }
{noformat}
We can check the size of the log queue, and if it is greater than 1 (meaning the 
WAL currently being read has already been rolled and cannot be the active one), 
we can return early without acquiring the lock.
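
A rough sketch of the idea (the class, field, and method names below are illustrative stand-ins, not the actual WALEntryStream / ReplicationSourceWALReader code): before asking the WAL provider for the live file length, check whether the log queue for this source holds more than one file. If it does, the file at the head of the queue has already been rolled, so we can return OptionalLong.empty() without ever contending on rollWriterLock.
{noformat}
import java.util.OptionalLong;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Function;

import org.apache.hadoop.fs.Path;

/**
 * Illustrative sketch only. Names (logQueue, currentPath, walFileLengthProvider)
 * are stand-ins for the corresponding pieces of WALEntryStream /
 * ReplicationSourceWALReader, not the actual HBase code.
 */
public class WalLengthLookupSketch {
  // WALs queued for this replication source; the active WAL, if any, is the last one.
  private final Queue<Path> logQueue = new ConcurrentLinkedQueue<>();
  // Delegates to AbstractFSWAL.getLogFileSizeIfBeingWritten, which takes rollWriterLock.
  private final Function<Path, OptionalLong> walFileLengthProvider;

  public WalLengthLookupSketch(Function<Path, OptionalLong> walFileLengthProvider) {
    this.walFileLengthProvider = walFileLengthProvider;
  }

  OptionalLong lengthIfBeingWritten(Path currentPath) {
    // More than one queued WAL means the file at the head of the queue has
    // already been rolled and cannot be the one currently being written, so we
    // can skip the provider call and never touch rollWriterLock.
    if (logQueue.size() > 1) {
      return OptionalLong.empty();
    }
    // Only the single (possibly active) WAL pays the cost of the lock.
    return walFileLengthProvider.apply(currentPath);
  }
}
{noformat}
With 4 peers, each reader currently makes this locked call on every batch read even for long-rolled WALs, so skipping it for everything but the single possibly-active file should remove most of the contention described above.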


