[
https://issues.apache.org/jira/browse/HDFS-16659?focusedWorklogId=790699&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790699
]
ASF GitHub Bot logged work on HDFS-16659:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 14/Jul/22 04:03
Start Date: 14/Jul/22 04:03
Worklog Time Spent: 10m
Work Description: ZanderXu commented on PR #4560:
URL: https://github.com/apache/hadoop/pull/4560#issuecomment-1183960852
@tomscut Are you interested in reviewing this bug about selecting
EditLogInputStreams?
Issue Time Tracking
-------------------
Worklog Id: (was: 790699)
Time Spent: 0.5h (was: 20m)
> JournalNode should throw CacheMissException if SinceTxId is bigger than
> HighestWrittenTxId
> ------------------------------------------------------------------------------------------
>
> Key: HDFS-16659
> URL: https://issues.apache.org/jira/browse/HDFS-16659
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Critical
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> JournalNode should throw `CacheMissException` if `sinceTxId` is bigger than
> `highestWrittenTxId`. Without this, EditLogTailer may be unable to tail
> edits, and the Observer NameNode may in turn be unable to handle requests
> from clients.
> Suppose there are 3 JournalNodes, JN0 ~ JN2.
> The corner case is as below:
> * JN0 runs into an abnormal condition while the Active NameNode is
> journaling edits starting at txId 11
> * The NameNode ignores the abnormal JN0 and continues to write edits to JN1
> and JN2
> * JN0 returns to health
> * The Observer NameNode tries to select EditLogInputStreams via RPC with
> start txId 21
> * JN1 runs into an abnormal condition that causes a slow RPC response
> The expected selection result is that the response contains 20 edits, from
> txId 21 to txId 40, served by JN1 and JN2. The Active NameNode successfully
> wrote these edits to JN1 and JN2 but failed to write them to JN0, so the
> cache of JN0 holds no edits from txId 21 to 40.
> In the current implementation, however, the response contains no edits,
> because the NameNode successfully got a response from JN0 that did not
> contain any edits.
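The scenario above can be modeled with a small sketch. It assumes a simplified rule that the issue text does not spell out: the NameNode stops waiting once a majority of JournalNodes have answered successfully, and it only returns as many edits as every response in that majority can serve. The class and method names are hypothetical and are not taken from the Hadoop code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class QuorumSelectSketch {
    // Hypothetical simplification: given the edit counts reported by the
    // JournalNodes that answered successfully, return how many edits a
    // majority of JournalNodes can serve.
    static int maxAllowedTxns(List<Integer> responseCounts, int majoritySize) {
        if (responseCounts.size() < majoritySize) {
            throw new IllegalStateException("no quorum of successful responses");
        }
        List<Integer> sorted = new ArrayList<>(responseCounts);
        Collections.sort(sorted);
        // At least majoritySize responses report this count or more.
        return sorted.get(sorted.size() - majoritySize);
    }

    public static void main(String[] args) {
        int majority = 2; // 3 JournalNodes

        // Current behavior: JN0 answers with 0 edits, JN2 with 20, JN1 is slow.
        System.out.println(maxAllowedTxns(List.of(0, 20), majority));

        // If JN0 failed instead, the caller would wait for JN1 as well.
        System.out.println(maxAllowedTxns(List.of(20, 20), majority));
    }
}
```

Under this model, an empty but successful answer from JN0 counts toward the majority and drags the result down to zero, while a failed answer would not.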
> The buggy code is as below:
> {code:java}
> if (sinceTxId > getHighestWrittenTxId()) {
>   // Requested edits that don't exist yet; short-circuit the cache here
>   metrics.rpcEmptyResponses.incr();
>   return GetJournaledEditsResponseProto.newBuilder().setTxnCount(0).build();
> }
> {code}
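A minimal sketch of the behavior the issue title asks for, as opposed to the empty response shown above. This is not the patch from PR #4560. `CacheMissException` here is a local stand-in for the real exception type, `availableEdits` is a hypothetical helper, and the txId values in `main` are illustrative.

```java
public class SinceTxIdCheckSketch {
    // Local stand-in for the exception type named in the issue title.
    static class CacheMissException extends Exception {
        CacheMissException(String message) {
            super(message);
        }
    }

    // Hypothetical helper: returns how many edits starting at sinceTxId this
    // JournalNode could serve, or fails when it has not written that far.
    static long availableEdits(long sinceTxId, long highestWrittenTxId)
            throws CacheMissException {
        if (sinceTxId > highestWrittenTxId) {
            // Fail instead of answering with an empty, successful response.
            throw new CacheMissException("Requested txId " + sinceTxId
                + " is beyond highest written txId " + highestWrittenTxId);
        }
        return highestWrittenTxId - sinceTxId + 1;
    }

    public static void main(String[] args) {
        try {
            // A JournalNode that holds edits up to txId 40.
            System.out.println(availableEdits(21, 40));
            // A lagging JournalNode, assumed here to have stopped at txId 10.
            System.out.println(availableEdits(21, 10));
        } catch (CacheMissException e) {
            System.out.println("cache miss: " + e.getMessage());
        }
    }
}
```

With an exception, the lagging JournalNode's answer is treated as a failure, so the caller can rely on the other JournalNodes rather than accepting an empty result.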
--
This message was sent by Atlassian Jira
(v8.20.10#820010)