[ 
https://issues.apache.org/jira/browse/HDFS-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18034912#comment-18034912
 ] 

ASF GitHub Bot commented on HDFS-16793:
---------------------------------------

github-actions[bot] closed pull request #4971: HDFS-16793. [SBN read] 
ObserverNN failed to select streaming inputStream from JournalNode
URL: https://github.com/apache/hadoop/pull/4971




> ObserverNameNode fails to select streaming inputStream with a timeout 
> exception 
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-16793
>                 URL: https://issues.apache.org/jira/browse/HDFS-16793
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>
> In our production environment, we encountered a case where the observer 
> NameNode failed to select streaming input streams with a timeout exception. 
> The related code is as follows:
> {code:java}
> @Override
> public void selectInputStreams(Collection<EditLogInputStream> streams,
>     long fromTxnId, boolean inProgressOk,
>     boolean onlyDurableTxns) throws IOException { 
>   if (inProgressOk && inProgressTailingEnabled) {
>     ...
>   }
>   // Timeout here.
>   selectStreamingInputStreams(streams, fromTxnId, inProgressOk,
>       onlyDurableTxns);
> } {code}
> After looking into the code, we found that the JournalNode performs one very 
> expensive and redundant operation: it scans all edits of the last 
> in-progress segment, which requires IO. The related code is as follows:
> {code:java}
> public List<RemoteEditLog> getRemoteEditLogs(long firstTxId,
>     boolean inProgressOk) throws IOException {
>   File currentDir = sd.getCurrentDir();
>   List<EditLogFile> allLogFiles = matchEditLogs(currentDir);
>   List<RemoteEditLog> ret = Lists.newArrayListWithCapacity(
>       allLogFiles.size());
>   for (EditLogFile elf : allLogFiles) {
>     if (elf.hasCorruptHeader() || (!inProgressOk && elf.isInProgress())) {
>       continue;
>     }
>     // Here.
>     if (elf.isInProgress()) {
>       try {
>         elf.scanLog(getLastReadableTxId(), true);
>       } catch (IOException e) {
>         LOG.error("got IOException while trying to validate header of " +
>             elf + ".  Skipping.", e);
>         continue;
>       }
>     }
>     if (elf.getFirstTxId() >= firstTxId) {
>       ret.add(new RemoteEditLog(elf.firstTxId, elf.lastTxId,
>           elf.isInProgress()));
>     } else if (elf.getFirstTxId() < firstTxId
>         && firstTxId <= elf.getLastTxId()) {
>       // If the firstTxId is in the middle of an edit log segment. Return this
>       // anyway and let the caller figure out whether it wants to use it.
>       ret.add(new RemoteEditLog(elf.firstTxId, elf.lastTxId,
>           elf.isInProgress()));
>     }
>   }
>   
>   Collections.sort(ret);
>   
>   return ret;
> } {code}
> Expensive:
>  * The scan reads every edit in the in-progress segment, which requires IO.
> Redundant:
>  * The scan only finds the lastTxId of the in-progress segment.
>  * However, the caller, getEditLogManifest(long sinceTxId, boolean 
> inProgressOk) in Journal.java, ignores the lastTxId of the in-progress 
> segment and instead returns getHighestWrittenTxId() to the NameNode as the 
> segment's lastTxId.
>  * So the scan operation is redundant.
> If end users enable the Observer Read feature, the latency of tailing edits 
> from the JournalNode is very important, whether in the normal path or the 
> fallback path. 
> Neither the code nor HDFS-6634, which added this logic, contains any 
> further comments explaining the scan.
> The only effect I can identify is that it checks the in-progress segment for 
> corruption. But the NameNode can already handle a corrupted in-progress 
> segment.
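
The redundancy argument in the description can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `ManifestSketch`, `Segment`, `buildManifest`, and the `UNKNOWN_TXID` sentinel are hypothetical names invented for illustration and are not the real Hadoop classes. It models only the step the description attributes to getEditLogManifest, where the lastTxId of the in-progress segment is replaced by the highest written txid.

```java
import java.util.ArrayList;
import java.util.List;

class ManifestSketch {
  // Hypothetical sentinel for "lastTxId not determined because no scan ran".
  static final long UNKNOWN_TXID = -1L;

  /** Hypothetical stand-in for one edit log segment; not the real EditLogFile. */
  static final class Segment {
    final long firstTxId;
    final long lastTxId;
    final boolean inProgress;

    Segment(long firstTxId, long lastTxId, boolean inProgress) {
      this.firstTxId = firstTxId;
      this.lastTxId = lastTxId;
      this.inProgress = inProgress;
    }

    @Override
    public String toString() {
      return "[" + firstTxId + "," + lastTxId + (inProgress ? ",inProgress]" : "]");
    }
  }

  /**
   * Models the caller step described in the issue: whatever lastTxId was
   * reported for an in-progress segment is discarded and replaced with the
   * highest written txid. Finalized segments pass through unchanged.
   */
  static List<Segment> buildManifest(List<Segment> listed, long highestWrittenTxId) {
    List<Segment> manifest = new ArrayList<>();
    for (Segment s : listed) {
      if (s.inProgress) {
        manifest.add(new Segment(s.firstTxId, highestWrittenTxId, true));
      } else {
        manifest.add(s);
      }
    }
    return manifest;
  }

  public static void main(String[] args) {
    long highestWrittenTxId = 250L;

    // Listing produced after an expensive scan found lastTxId = 240.
    List<Segment> scanned = List.of(
        new Segment(1, 100, false), new Segment(101, 240, true));
    // Listing produced without scanning the in-progress segment.
    List<Segment> unscanned = List.of(
        new Segment(1, 100, false), new Segment(101, UNKNOWN_TXID, true));

    String withScan = buildManifest(scanned, highestWrittenTxId).toString();
    String withoutScan = buildManifest(unscanned, highestWrittenTxId).toString();

    // Both manifests are identical: the scanned value never reaches the caller.
    System.out.println(withScan.equals(withoutScan));
  }
}
```

Note that this model covers only the manifest step. In the quoted getRemoteEditLogs code, the branch that compares firstTxId against elf.getLastTxId() still reads the scanned value, so any real change that drops the scan would need to handle that branch separately.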



