[ https://issues.apache.org/jira/browse/HBASE-25065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199162#comment-17199162 ]
Anoop Sam John commented on HBASE-25065:
----------------------------------------

+1 to doing the rename in another thread. For a cloud FS, rename is not just a meta op.

> WAL archival can be batched/throttled and also done by a separate thread
> ------------------------------------------------------------------------
>
>                 Key: HBASE-25065
>                 URL: https://issues.apache.org/jira/browse/HBASE-25065
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>    Affects Versions: 3.0.0-alpha-1, 2.4.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Major
>
> Currently we clean up logs once we have ensured that the region data has been
> flushed. We track the sequence number, and if that seq number has been
> flushed for a given region and the WAL that was rolled contains that
> seq number, then that WAL can be archived.
> When we have around 50 files to archive (per RS), we do the archiving one
> after the other. Since archiving is nothing but a rename operation, it adds to
> the meta operation load of a cloud-based FS.
> Not only that - the entire archival is done inside the rollWriterLock. Though
> we have closed the writer and created a new writer and the writes are ongoing,
> we never release the lock until we are done with the archiving.
> As a result, during that period our logs grow beyond the default configured
> size (when we have consistent writes happening).
> So the proposal is to move the log archival to a separate thread and ensure
> we can do some kind of throttling or batching so that we don't do the archival
> in one shot.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
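
The proposal in the issue description (move archival off the roll path onto a separate thread, and batch or throttle the renames) could be sketched roughly as follows. This is only an illustration under stated assumptions, not HBase code: `BatchedWalArchiver`, `submit`, `batchSize` and `pauseMillis` are hypothetical names, and `java.nio.file.Files.move` stands in for the filesystem rename so that the example is self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: archives rolled WAL files on a dedicated thread, in throttled batches. */
class BatchedWalArchiver implements AutoCloseable {
  private final BlockingQueue<Path> pending = new LinkedBlockingQueue<>();
  private final Path archiveDir;
  private final int batchSize;     // max renames issued back to back (must be >= 1)
  private final long pauseMillis;  // pause between batches, to throttle rename load
  private final Thread worker;
  private volatile boolean running = true;

  BatchedWalArchiver(Path archiveDir, int batchSize, long pauseMillis) {
    this.archiveDir = archiveDir;
    this.batchSize = batchSize;
    this.pauseMillis = pauseMillis;
    this.worker = new Thread(this::run, "wal-archiver");
    this.worker.setDaemon(true);
    this.worker.start();
  }

  /** Meant to be called from the roll path: only enqueues, never renames. */
  void submit(Path walFile) {
    pending.add(walFile);
  }

  private void run() {
    List<Path> batch = new ArrayList<>();
    try {
      // Keep going until close() was called and the queue has been drained.
      while (running || !pending.isEmpty()) {
        Path first = pending.poll(100, TimeUnit.MILLISECONDS);
        if (first == null) {
          continue;
        }
        batch.add(first);
        pending.drainTo(batch, batchSize - 1); // take at most batchSize files in total
        for (Path wal : batch) {
          archive(wal);
        }
        batch.clear();
        Thread.sleep(pauseMillis); // throttle before starting the next batch
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  private void archive(Path wal) {
    try {
      // Stand-in for the filesystem rename into the archive directory.
      Files.move(wal, archiveDir.resolve(wal.getFileName()), StandardCopyOption.REPLACE_EXISTING);
    } catch (IOException e) {
      System.err.println("Failed to archive " + wal + ": " + e);
    }
  }

  /** Stops accepting the idle wait and blocks until all queued files are archived. */
  @Override
  public void close() throws InterruptedException {
    running = false;
    worker.join();
  }
}
```

In this sketch the code holding the roll lock would only call `submit(...)`, so the lock is held for the duration of a queue insert rather than for ~50 sequential renames. The batch size and pause are placeholders; suitable values would depend on the filesystem.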