[ https://issues.apache.org/jira/browse/HBASE-25924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Somogyi resolved HBASE-25924.
-----------------------------------
Resolution: Fixed
Pushed the revert commit to branch-2.3. Resolving.
> Seeing a spike in uncleanlyClosedWALs metric.
> ---------------------------------------------
>
> Key: HBASE-25924
> URL: https://issues.apache.org/jira/browse/HBASE-25924
> Project: HBase
> Issue Type: Bug
> Components: Replication, wal
> Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.4
> Reporter: Rushabh Shah
> Assignee: Rushabh Shah
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 1.7.1, 2.4.4
>
>
> We are getting the following log line in all of our production clusters when
> WALEntryStream is dequeuing a WAL file.
> {noformat}
> 2021-05-02 04:01:30,437 DEBUG [04901996] regionserver.WALEntryStream -
> Reached the end of WAL file hdfs://<wal-file-name>. It was not closed
> cleanly, so we did not parse 8 bytes of data. This is normally ok.
> {noformat}
> The 8 bytes are usually the trailer's serialized size field (SIZE_OF_INT, 4 bytes) +
> "LAWP" (4 bytes) = 8 bytes.
> While dequeuing the WAL file from WALEntryStream, we reset the reader here:
> [WALEntryStream|https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java#L199-L221]
> {code:java}
> private void tryAdvanceEntry() throws IOException {
>   if (checkReader()) {
>     readNextEntryAndSetPosition();
>     if (currentEntry == null) { // no more entries in this log file - see if log was rolled
>       if (logQueue.getQueue(walGroupId).size() > 1) { // log was rolled
>         // Before dequeueing, we should always get one more attempt at reading.
>         // This is in case more entries came in after we opened the reader,
>         // and a new log was enqueued while we were reading. See HBASE-6758
>         resetReader(); // <--- HERE
>         readNextEntryAndSetPosition();
>         if (currentEntry == null) {
>           if (checkAllBytesParsed()) { // now we're certain we're done with this log file
>             dequeueCurrentLog();
>             if (openNextLog()) {
>               readNextEntryAndSetPosition();
>             }
>           }
>         }
>       } // no other logs, we've simply hit the end of the current open log. Do nothing
>     }
>   }
>   // do nothing if we don't have a WAL Reader (e.g. if there's no logs in queue)
> }
> {code}
> resetReader goes through the following call chain: WALEntryStream#resetReader
> ---> ProtobufLogReader#reset ---> ProtobufLogReader#initInternal.
> In ProtobufLogReader#initInternal, we re-create the whole reader object
> from scratch to see whether any new data has been written.
> We reset all the fields of ProtobufLogReader except ReaderBase#fileLength.
> Whether the trailer is present is calculated from fileLength.
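
The following is a minimal sketch of why a length cached at open time could leave exactly 8 bytes unparsed. It is a simplified model, not HBase code: the class StaleLengthSketch, the nested CachedLengthReader, and the offsets are hypothetical, and the only facts taken from the report are the 4-byte size field, the 4-byte "LAWP" marker, and the fileLength field that is not refreshed on reset.

```java
import java.nio.charset.StandardCharsets;

final class StaleLengthSketch {
    // "LAWP" is the 4-byte marker mentioned in the report.
    static final byte[] TRAILER_MAGIC = "LAWP".getBytes(StandardCharsets.US_ASCII);

    // Size field (one int) plus the marker: the 8 bytes from the log line.
    static int minTrailerBytes() {
        return Integer.BYTES + TRAILER_MAGIC.length;
    }

    // Hypothetical stand-in for a reader that keeps the length it saw at open time.
    static final class CachedLengthReader {
        private final long cachedFileLength; // assumption: never refreshed on reset

        CachedLengthReader(long lengthAtOpen) {
            this.cachedFileLength = lengthAtOpen;
        }

        // The trailer can only be detected if the cached length covers it.
        boolean trailerVisible(long lastEntryEnd) {
            return cachedFileLength - lastEntryEnd >= minTrailerBytes();
        }
    }

    // Bytes left between the reader position and the real end of the file.
    static long unparsedBytes(long actualFileLength, long lastEntryEnd) {
        return actualFileLength - lastEntryEnd;
    }

    public static void main(String[] args) {
        long lastEntryEnd = 1000L;                          // illustrative offset
        long actualLength = lastEntryEnd + minTrailerBytes(); // file closed cleanly

        CachedLengthReader stale = new CachedLengthReader(lastEntryEnd);
        CachedLengthReader fresh = new CachedLengthReader(actualLength);

        System.out.println("stale reader sees trailer: " + stale.trailerVisible(lastEntryEnd));
        System.out.println("fresh reader sees trailer: " + fresh.trailerVisible(lastEntryEnd));
        System.out.println("unparsed bytes: " + unparsedBytes(actualLength, lastEntryEnd));
    }
}
```

In this model, the reader that keeps the old length cannot see the trailer even though the file was closed cleanly, so the remaining 8 bytes are reported as unparsed. Whether this matches the real code path in every case is an assumption of the sketch.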
--
This message was sent by Atlassian Jira
(v8.3.4#803005)