[jira] [Commented] (HBASE-25924) Seeing a spike in uncleanlyClosedWALs metric.

Hudson (Jira) Thu, 27 May 2021 12:31:11 -0700


    [ https://issues.apache.org/jira/browse/HBASE-25924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352715#comment-17352715 ]


Hudson commented on HBASE-25924:
--------------------------------

Results for branch branch-2.4
        [build #128 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/128/]:
 (x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/128/General_20Nightly_20Build_20Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/128/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/128/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/128/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Seeing a spike in uncleanlyClosedWALs metric.
> ---------------------------------------------
>
>                 Key: HBASE-25924
>                 URL: https://issues.apache.org/jira/browse/HBASE-25924
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication, wal
>    Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.4
>            Reporter: Rushabh Shah
>            Assignee: Rushabh Shah
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 2.4.4, 1.7.1
>
>
> We are seeing the following log line in all of our production clusters when 
> WALEntryStream dequeues a WAL file.
> {noformat}
>  2021-05-02 04:01:30,437 DEBUG [04901996] regionserver.WALEntryStream - 
> Reached the end of WAL file hdfs://<wal-file-name>. It was not closed 
> cleanly, so we did not parse 8 bytes of data. This is normally ok.
> {noformat}
> The 8 bytes are usually the trailer's serialized size field (SIZE_OF_INT, 
> 4 bytes) plus the "LAWP" magic (4 bytes) = 8 bytes.
> While dequeuing the WAL file from WALEntryStream, we reset the reader here:
> [WALEntryStream|https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java#L199-L221]
> {code:java}
>   private void tryAdvanceEntry() throws IOException {
>     if (checkReader()) {
>       readNextEntryAndSetPosition();
>       if (currentEntry == null) { // no more entries in this log file - see if log was rolled
>         if (logQueue.getQueue(walGroupId).size() > 1) { // log was rolled
>           // Before dequeueing, we should always get one more attempt at reading.
>           // This is in case more entries came in after we opened the reader,
>           // and a new log was enqueued while we were reading. See HBASE-6758
>           resetReader(); // <--- HERE
>           readNextEntryAndSetPosition();
>           if (currentEntry == null) {
>             if (checkAllBytesParsed()) { // now we're certain we're done with this log file
>               dequeueCurrentLog();
>               if (openNextLog()) {
>                 readNextEntryAndSetPosition();
>               }
>             }
>           }
>         } // no other logs, we've simply hit the end of the current open log. Do nothing
>       }
>     }
>     // do nothing if we don't have a WAL Reader (e.g. if there's no logs in queue)
>   }
> {code}
> resetReader goes through the following call chain: WALEntryStream#resetReader 
> -> ProtobufLogReader#reset -> ProtobufLogReader#initInternal.
> In ProtobufLogReader#initInternal, we re-create the whole reader state from 
> scratch to see whether any new data has been written.
> We reset all the fields of ProtobufLogReader except for ReaderBase#fileLength,
> and we decide whether a trailer is present based on fileLength.
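
The effect described in the last two lines of the description can be illustrated with a small, self-contained sketch. This is not HBase code: the class and method names (TrailerCheckSketch, hasTrailer, closeCleanly) are hypothetical, the WAL is modelled as a plain byte array, and the trailer itself is assumed to be empty, so that a cleanly closed file ends with a 4-byte size field followed by "LAWP", as the description states. Under those assumptions, a check that uses a length captured before the file was closed looks for the trailer at the wrong offset and leaves exactly 8 bytes unparsed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TrailerCheckSketch {
    // Hypothetical stand-ins for the constants named in the description.
    static final byte[] MAGIC = "LAWP".getBytes(StandardCharsets.US_ASCII);
    static final int SIZE_OF_INT = Integer.BYTES;
    static final int TRAILER_SUFFIX = SIZE_OF_INT + MAGIC.length; // 4 + 4 = 8

    /**
     * Returns true if the bytes ending at assumedLength look like
     * [int trailerSize]["LAWP"]. assumedLength plays the role of fileLength.
     */
    static boolean hasTrailer(byte[] file, long assumedLength) {
        long suffixStart = assumedLength - TRAILER_SUFFIX;
        if (suffixStart < 0 || assumedLength > file.length) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(file, (int) suffixStart, TRAILER_SUFFIX);
        int trailerSize = buf.getInt();
        byte[] magic = new byte[MAGIC.length];
        buf.get(magic);
        return trailerSize >= 0 && trailerSize <= suffixStart && Arrays.equals(magic, MAGIC);
    }

    /** Simulates a clean close: appends an empty trailer (size 0) and the magic. */
    static byte[] closeCleanly(byte[] entries) {
        ByteBuffer out = ByteBuffer.allocate(entries.length + TRAILER_SUFFIX);
        out.put(entries).putInt(0).put(MAGIC);
        return out.array();
    }

    public static void main(String[] args) {
        byte[] entries = new byte[100];       // stand-in for WAL entries
        long staleLength = entries.length;    // length seen while the file was still open
        byte[] closed = closeCleanly(entries);

        System.out.println("suffix bytes: " + TRAILER_SUFFIX);
        System.out.println("stale length sees trailer: " + hasTrailer(closed, staleLength));
        System.out.println("fresh length sees trailer: " + hasTrailer(closed, closed.length));
        System.out.println("unparsed bytes: " + (closed.length - staleLength));
    }
}
```

In this sketch the check fails with the stale length and succeeds once the length is refreshed after the close, and the difference between the two lengths is the 8 bytes reported in the log line.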



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
