ramkrishna.s.vasudevan created HBASE-25065:
----------------------------------------------

             Summary: WAL archival can be batched/throttled and also done by a 
separate thread
                 Key: HBASE-25065
                 URL: https://issues.apache.org/jira/browse/HBASE-25065
             Project: HBase
          Issue Type: Improvement
          Components: wal
    Affects Versions: 3.0.0-alpha-1, 2.4.0
            Reporter: ramkrishna.s.vasudevan
            Assignee: ramkrishna.s.vasudevan


Currently we clean up WAL files once we are sure that the region data they 
cover has been flushed. We track sequence numbers per region: if the sequence 
numbers contained in a rolled WAL have already been flushed for the regions 
they belong to, that WAL can be archived.
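
The eligibility rule above can be illustrated with a minimal sketch. This is 
a simplification and not the actual HBase accounting code: the class name, 
method name and both maps (keyed by a region name string) are hypothetical, 
and a region with no recorded flush is conservatively treated as unflushed.

```java
import java.util.HashMap;
import java.util.Map;

public class WalArchiveCheck {
    /**
     * Hypothetical helper: returns true when, for every region with edits in
     * the rolled WAL, the highest sequence id in that WAL is at or below the
     * region's last flushed sequence id.
     */
    static boolean canArchive(Map<String, Long> highestSeqIdInWal,
                              Map<String, Long> lastFlushedSeqId) {
        for (Map.Entry<String, Long> e : highestSeqIdInWal.entrySet()) {
            Long flushed = lastFlushedSeqId.get(e.getKey());
            // Assumption: an unknown region counts as "not flushed yet".
            if (flushed == null || flushed < e.getValue()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Long> wal = new HashMap<>();
        wal.put("regionA", 120L);
        Map<String, Long> flushed = new HashMap<>();
        flushed.put("regionA", 100L);
        System.out.println(canArchive(wal, flushed)); // edits 101..120 not flushed
        flushed.put("regionA", 120L);
        System.out.println(canArchive(wal, flushed)); // everything flushed
    }
}
```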
When we have ~50 files to archive (per RS), we archive them one after the 
other. Since archiving is just a rename operation, it adds to the metadata 
operation load of cloud-based file systems.
Not only that: the entire archival is done while holding the rollWriterLock. 
Even though we have already closed the old writer and created a new one, and 
writes are ongoing, we do not release the lock until archiving is finished. 
As a result, when writes are continuous, the logs grow beyond the default 
configured size during that period, because the next roll cannot start while 
the lock is held.
So the proposal is to move log archival to a separate thread and to add some 
form of throttling or batching, so that we don't archive everything in one 
shot.
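
One possible shape for this proposal, as a sketch only: the roller merely 
enqueues files while it holds the lock, and a single background thread 
renames a bounded batch per interval. The class BatchedWalArchiver, its 
methods, the batch size and the interval are all hypothetical, and 
java.nio.file stands in here for the Hadoop FileSystem rename that HBase 
would actually use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BatchedWalArchiver implements AutoCloseable {
    private final BlockingQueue<Path> pending = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "wal-archiver");
            t.setDaemon(true);
            return t;
        });
    private final Path archiveDir;
    private final int batchSize;

    public BatchedWalArchiver(Path archiveDir, int batchSize) {
        this.archiveDir = archiveDir;
        this.batchSize = batchSize;
    }

    /** Starts the background thread; at most one batch is archived per interval. */
    public void start(long intervalMillis) {
        executor.scheduleWithFixedDelay(this::archiveBatch,
            intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Cheap enough to call under the roll lock: it only enqueues, no rename. */
    public void submit(Path walFile) {
        pending.add(walFile);
    }

    /** Renames at most batchSize queued files and returns how many were moved. */
    public int archiveBatch() {
        List<Path> batch = new ArrayList<>(batchSize);
        pending.drainTo(batch, batchSize);
        int moved = 0;
        for (Path wal : batch) {
            try {
                Files.move(wal, archiveDir.resolve(wal.getFileName()));
                moved++;
            } catch (IOException e) {
                // Sketch only: re-queue for a later batch. Real code would
                // need a retry limit and logging.
                pending.add(wal);
            }
        }
        return moved;
    }

    public int pendingCount() {
        return pending.size();
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

With a batch size of 10 and an interval of one second, for example, 50 
queued files would be renamed over roughly five seconds instead of in a 
single burst, and none of those renames would happen under the roll lock.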



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
