kiran-maturi commented on PR #6179:
URL: https://github.com/apache/hbase/pull/6179#issuecomment-2316629102
@Apache9 I have observed in production that WAL files were not being cleaned up for days after being rolled. This was because the WALs were never marked closed when they had unflushed entries at roll time:
```
if (!isUnflushedEntries()) {
  markClosedAndClean(oldPath);
}
```
In the cleanup stage, in
[cleanOldLogs](https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java#L743), such files are skipped:
```
if (!e.getValue().closed) {
  LOG.debug("{} is not closed yet, will try archiving it next time", e.getKey());
  continue;
}
```
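To illustrate the effect, here is a minimal, self-contained model of that skip logic (class and field names such as `WalProps` and `walFile2Props` are simplified stand-ins, not the actual `AbstractFSWAL` internals): a file that is never marked closed is skipped on every cleanup pass, forever.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the archiving loop; illustrative only.
public class WalCleanupSketch {
  static class WalProps {
    boolean closed; // set by markClosedAndClean after a successful roll
    WalProps(boolean closed) { this.closed = closed; }
  }

  // Returns the subset of WAL paths eligible for archiving.
  // A file that was never marked closed is skipped on every pass.
  static List<String> archivable(Map<String, WalProps> walFile2Props) {
    List<String> logsToArchive = new ArrayList<>();
    for (Map.Entry<String, WalProps> e : walFile2Props.entrySet()) {
      if (!e.getValue().closed) {
        // mirrors: "{} is not closed yet, will try archiving it next time"
        continue;
      }
      logsToArchive.add(e.getKey());
    }
    return logsToArchive;
  }

  public static void main(String[] args) {
    Map<String, WalProps> wals = new LinkedHashMap<>();
    wals.put("wal.1", new WalProps(true));
    wals.put("wal.2", new WalProps(false)); // rolled with unflushed entries
    System.out.println(archivable(wals));   // wal.2 is never archived
  }
}
```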
The above log line was emitted for WAL files that had been rolled days earlier. Once a file is not marked closed, it can never be cleaned up, which leads to a large number of files having to be processed during SCP (ServerCrashProcedure).
The current change marks the file closed. The WAL file still won't be cleaned up prematurely, because there is a check ensuring that all entries related to the WAL have been flushed:
```
Map<byte[], Long> sequenceNums = e.getValue().encodedName2HighestSequenceId;
if (this.sequenceIdAccounting.areAllLower(sequenceNums)) {
  if (logsToArchive == null) {
    logsToArchive = new ArrayList<>();
  }
  logsToArchive.add(Pair.newPair(log, e.getValue().logSize));
  if (LOG.isTraceEnabled()) {
    LOG.trace("WAL file ready for archiving " + log);
  }
}
```
```
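For context, here is a hedged sketch of what that safety check means semantically (the map-based accounting below is a simplification, not the real `SequenceIdAccounting` implementation): a WAL is archivable only when, for every region, its highest sequence id in that WAL is below the region's lowest unflushed sequence id.

```java
import java.util.Map;

// Simplified semantics of the areAllLower-style check; illustrative only.
public class SequenceIdCheckSketch {
  // lowestUnflushed: per-region lowest sequence id not yet flushed.
  // walHighest: per-region highest sequence id written to this WAL file.
  // The WAL is archivable only when every region's edits in it are flushed,
  // i.e. the WAL's highest id is strictly below the lowest unflushed id.
  static boolean areAllLower(Map<String, Long> lowestUnflushed,
                             Map<String, Long> walHighest) {
    for (Map.Entry<String, Long> e : walHighest.entrySet()) {
      Long lowest = lowestUnflushed.get(e.getKey());
      if (lowest != null && e.getValue() >= lowest) {
        return false; // region still has unflushed edits in this WAL
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, Long> unflushed = Map.of("region-a", 100L);
    // Flushed past everything in the WAL: safe to archive.
    System.out.println(areAllLower(unflushed, Map.of("region-a", 50L)));
    // Unflushed edits remain in the WAL: must not archive.
    System.out.println(areAllLower(unflushed, Map.of("region-a", 150L)));
  }
}
```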
My understanding is that it is safe to mark them closed even with unflushed entries, since the check in the cleanup path will not let such a WAL file be cleaned up.
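As a before/after sketch of the behavior change (stubbed methods, illustrative only, not the actual patch):

```java
// Stubbed sketch contrasting the old and new roll behavior; illustrative only.
public class RollSketch {
  boolean unflushedEntries;
  boolean closed;

  boolean isUnflushedEntries() { return unflushedEntries; }
  void markClosedAndClean(String oldPath) { closed = true; }

  // Before this change: the rolled file was only marked closed when
  // fully flushed, so a WAL with unflushed entries stayed open forever.
  void onRollBefore(String oldPath) {
    if (!isUnflushedEntries()) {
      markClosedAndClean(oldPath);
    }
  }

  // After this change: always mark the rolled file closed; the
  // sequence-id check in cleanOldLogs still prevents premature archiving.
  void onRollAfter(String oldPath) {
    markClosedAndClean(oldPath);
  }

  public static void main(String[] args) {
    RollSketch r = new RollSketch();
    r.unflushedEntries = true;
    r.onRollBefore("oldPath");
    System.out.println("closed after old behavior: " + r.closed); // false
    r.onRollAfter("oldPath");
    System.out.println("closed after new behavior: " + r.closed); // true
  }
}
```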