[
https://issues.apache.org/jira/browse/HDFS-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148153#comment-14148153
]
Chris Nauroth commented on HDFS-7131:
-------------------------------------
Hi Jing. This is a nice find. I have just a few minor suggestions.
# Instead of {{IOUtils#closeStream}}, I recommend using {{IOUtils#cleanup}} and
passing in the {{LOG}} instance. If close fails, then logging the details
might help with troubleshooting.
# Let's close {{prevCommittedTxnId}} in a finally block. There are a few I/O
operations between opening the file and closing it. If one of those operations
gets an I/O error, we wouldn't want to leak the file descriptor.
# I don't think rollback needs to reinitialize {{committedTxnId}}. On the next
access, the existing file would get reopened by
{{BestEffortLongFile#lazyOpen}}. Since we just rolled back, I'd expect this to
be the old file containing the correct transaction ID from before the upgrade.
I tried commenting out this part of the patch, and {{TestDFSUpgradeWithHA}}
still passed. Let me know if you think I missed something here.
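To illustrate points 1 and 2 together, here is a minimal, self-contained sketch of the pattern I have in mind. It is not the patch itself: {{CleanupSketch}}, {{TrackedFile}}, and {{copyTxnId}} are hypothetical names, and the local {{cleanup}} helper is only a stand-in written with {{java.util.logging}} to show the intended "close, log on failure, never throw" behavior. The real change would presumably pass the existing {{LOG}} to {{IOUtils#cleanup}} instead.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class CleanupSketch {
    private static final Logger LOG = Logger.getLogger(CleanupSketch.class.getName());

    /** Hypothetical stand-in helper: close each resource, log any failure, never throw. */
    static void cleanup(Logger log, Closeable... closeables) {
        for (Closeable c : closeables) {
            if (c == null) {
                continue;
            }
            try {
                c.close();
            } catch (IOException e) {
                if (log != null) {
                    log.log(Level.WARNING, "Exception in closing " + c, e);
                }
            }
        }
    }

    /** Hypothetical resource standing in for the previous committed-txid file. */
    static class TrackedFile implements Closeable {
        final boolean failOnRead;
        final boolean failOnClose;
        boolean closeAttempted = false;

        TrackedFile(boolean failOnRead, boolean failOnClose) {
            this.failOnRead = failOnRead;
            this.failOnClose = failOnClose;
        }

        long read() throws IOException {
            if (failOnRead) {
                throw new IOException("simulated read failure");
            }
            return 42L;
        }

        @Override
        public void close() throws IOException {
            closeAttempted = true;
            if (failOnClose) {
                throw new IOException("simulated close failure");
            }
        }
    }

    /** Reads the value and always releases the file, even if the read throws. */
    static long copyTxnId(TrackedFile prev) throws IOException {
        try {
            return prev.read();
        } finally {
            // A close failure is logged here rather than masking the read result or error.
            cleanup(LOG, prev);
        }
    }

    public static void main(String[] args) throws IOException {
        TrackedFile ok = new TrackedFile(false, false);
        System.out.println("value=" + copyTxnId(ok) + ", closeAttempted=" + ok.closeAttempted);

        TrackedFile bad = new TrackedFile(true, false);
        try {
            copyTxnId(bad);
        } catch (IOException e) {
            System.out.println("read failed, closeAttempted=" + bad.closeAttempted);
        }
    }
}
```

The point of the {{finally}} block is that the descriptor is released on both the success path and the I/O error path, and the logging helper keeps a failed close from hiding the original exception.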
> During HA upgrade, JournalNode should create a new committedTxnId file in the
> current directory
> -----------------------------------------------------------------------------------------------
>
> Key: HDFS-7131
> URL: https://issues.apache.org/jira/browse/HDFS-7131
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.4.0
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-7131.000.patch
>
>
> Currently, during an HA upgrade, we do not create a new committedTxnId file
> in the new current directory of the JournalNode. Before the fix in
> HDFS-7042, since the file channel was never closed, any new journal was
> actually updating the committedTxnId file in the previous directory. This can
> cause the NN to fail to start during rollback.
> HDFS-7042 fixes the main part of the issue: the file channel inside the
> committedTxnId object gets closed, so a new file can later be created in the
> current directory. But it may still be better to copy the file's content
> during the upgrade so that we can always use it for a sanity check.