[
https://issues.apache.org/jira/browse/HDFS-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18034801#comment-18034801
]
ASF GitHub Bot commented on HDFS-16793:
---------------------------------------
github-actions[bot] commented on PR #4971:
URL: https://github.com/apache/hadoop/pull/4971#issuecomment-3476992871
We're closing this stale PR because it has been open for 100 days with no
activity. This isn't a judgement on the merit of the PR in any way. It's just a
way of keeping the PR queue manageable.
If you feel like this was a mistake, or you would like to continue working
on it, please feel free to re-open it and ask for a committer to remove the
stale tag and review again.
Thanks all for your contribution.
> ObserverNameNode fails to select streaming inputStream with a timeout
> exception
> --------------------------------------------------------------------------------
>
> Key: HDFS-16793
> URL: https://issues.apache.org/jira/browse/HDFS-16793
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: pull-request-available
>
> In our production environment, we encountered a case where the observer
> NameNode failed to select streaming input streams with a timeout exception.
> The related code is below:
> {code:java}
> @Override
> public void selectInputStreams(Collection<EditLogInputStream> streams,
>     long fromTxnId, boolean inProgressOk,
>     boolean onlyDurableTxns) throws IOException {
>   if (inProgressOk && inProgressTailingEnabled) {
>     ...
>   }
>   // Timeout here.
>   selectStreamingInputStreams(streams, fromTxnId, inProgressOk,
>       onlyDurableTxns);
> } {code}
> After looking into the code, we found that the JournalNode performs one very
> expensive and redundant operation: it scans all of the edits in the last
> in-progress segment, which incurs IO. The related code is below:
> {code:java}
> public List<RemoteEditLog> getRemoteEditLogs(long firstTxId,
>     boolean inProgressOk) throws IOException {
>   File currentDir = sd.getCurrentDir();
>   List<EditLogFile> allLogFiles = matchEditLogs(currentDir);
>   List<RemoteEditLog> ret = Lists.newArrayListWithCapacity(
>       allLogFiles.size());
>   for (EditLogFile elf : allLogFiles) {
>     if (elf.hasCorruptHeader() || (!inProgressOk && elf.isInProgress())) {
>       continue;
>     }
>     // Here.
>     if (elf.isInProgress()) {
>       try {
>         elf.scanLog(getLastReadableTxId(), true);
>       } catch (IOException e) {
>         LOG.error("got IOException while trying to validate header of " +
>             elf + ". Skipping.", e);
>         continue;
>       }
>     }
>     if (elf.getFirstTxId() >= firstTxId) {
>       ret.add(new RemoteEditLog(elf.firstTxId, elf.lastTxId,
>           elf.isInProgress()));
>     } else if (elf.getFirstTxId() < firstTxId && firstTxId <=
>         elf.getLastTxId()) {
>       // If the firstTxId is in the middle of an edit log segment. Return this
>       // anyway and let the caller figure out whether it wants to use it.
>       ret.add(new RemoteEditLog(elf.firstTxId, elf.lastTxId,
>           elf.isInProgress()));
>     }
>   }
>
>   Collections.sort(ret);
>
>   return ret;
> } {code}
> Expensive:
> * The scan reads every edit in the in-progress segment, which incurs IO.
> Redundant:
> * The scan only determines the lastTxId of the in-progress segment.
> * But the caller, getEditLogManifest(long sinceTxId, boolean inProgressOk)
> in Journal.java, ignores that lastTxId and instead returns
> getHighestWrittenTxId() to the NameNode as the lastTxId of the in-progress
> segment.
> * So the scan operation is redundant.
> If the end user enables the Observer Read feature, the latency of tailing
> edits from the JournalNode is very important, in both the normal path and
> the fallback path.
> Neither the code nor HDFS-6634, which added this logic, contains any comment
> explaining the scan.
> The only effect I can find is that it checks the in-progress segment for
> corruption. But the NameNode can already handle a corrupted in-progress
> segment.