[
https://issues.apache.org/jira/browse/HDFS-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523562#comment-14523562
]
Ming Ma commented on HDFS-7609:
-------------------------------
A small retryCache size can also impact correctness, since the result of a successful
call might have been evicted from the cache by the time the client sends a retry of
the same call to the new active NN.
Regarding the earlier call stack showing that an entry with the same client id and
call id already exists in the retryCache during edit log replay, it could be due to
the following scenario.
1. A delete op is processed by nn1 successfully and is therefore logged to the edit log.
2. The client doesn't get the response, and nn1 fails over to nn2.
3. The client retries the same call on nn2. Even though nn2 is still tailing the
edit log and isn't active yet, a new cache entry is added, because the
following code can run even when nn2 isn't active yet:
{noformat}
public boolean delete(String src, boolean recursive) throws IOException {
  checkNNStartup();
  if (stateChangeLog.isDebugEnabled()) {
    stateChangeLog.debug("*DIR* Namenode.delete: src=" + src
        + ", recursive=" + recursive);
  }
  CacheEntry cacheEntry = RetryCache.waitForCompletion(retryCache);
  if (cacheEntry != null && cacheEntry.isSuccess()) {
    return true; // Return previous response
  }
  ...
{noformat}
4. By the time nn2 replays the same op from the edit log and tries to add it to the
retry cache (sketched below), the cache entry is already there.
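For context, when nn2 later replays the logged delete, the edit log loader records it
in the retry cache keyed by the same (clientId, callId). A simplified sketch of that
replay path, not verbatim source, looks like this:
{noformat}
// Simplified sketch of the FSEditLogLoader replay path, not verbatim source.
case OP_DELETE: {
  DeleteOp deleteOp = (DeleteOp) op;
  // ... apply the delete to the namespace ...
  if (toAddRetryCache) {
    // Record the original RPC's (clientId, callId) in the retry cache so a
    // client retry of the same call can be answered from the cache. This is
    // the insert that finds an entry already created in step 3 above.
    fsNamesys.addCacheEntry(deleteOp.rpcClientId, deleteOp.rpcCallId);
  }
  break;
}
{noformat}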
One way to fix this is to modify waitForCompletion so it does not create a cache entry
while the NN is in standby. Instead, when the NN is in standby, it only needs to check
whether a successful cache entry exists: if one does, return the result; if not,
throw a StandbyException.
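A minimal sketch of that behavior, assuming the standby state can be passed in from
the HA context; the lookupOnly and waitForCompletionInternal helpers below are
hypothetical and not part of the existing RetryCache API:
{noformat}
// Illustrative sketch only, not the actual patch. The isStandby flag and the
// lookupOnly/waitForCompletionInternal helpers are hypothetical.
public static CacheEntry waitForCompletion(RetryCache cache, boolean isStandby)
    throws StandbyException {
  if (cache == null) {
    return null;  // retry cache disabled
  }
  byte[] clientId = Server.getClientId();
  int callId = Server.getCallId();
  if (isStandby) {
    // Standby: look up only, never insert, so the later edit log replay of the
    // same (clientId, callId) does not collide with an entry created here.
    CacheEntry entry = cache.lookupOnly(clientId, callId);
    if (entry != null && entry.isSuccess()) {
      return entry;  // answer the retried client with the recorded result
    }
    throw new StandbyException("Retry cache miss on standby NN");
  }
  // Active: keep the existing behavior of creating the entry if absent and
  // waiting for an in-progress call with the same id to complete.
  return cache.waitForCompletionInternal(clientId, callId);
}
{noformat}
With something along these lines, the standby would never create new cache entries of
its own; only edit log replay would populate the cache until the NN transitions to active.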
> startup used too much time to load edits
> ----------------------------------------
>
> Key: HDFS-7609
> URL: https://issues.apache.org/jira/browse/HDFS-7609
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 2.2.0
> Reporter: Carrey Zhan
> Attachments: HDFS-7609-CreateEditsLogWithRPCIDs.patch,
> recovery_do_not_use_retrycache.patch
>
>
> One day my namenode crashed because two journal nodes timed out at the same
> time under very high load, leaving behind about 100 million transactions in
> the edits log. (I still have no idea why they were not rolled into the fsimage.)
> I tried to restart the namenode, but it showed that almost 20 hours would be
> needed to finish, and it was loading fsedits most of the time. I also tried
> to restart the namenode in recovery mode, but the loading speed was no different.
> I looked into the stack trace and judged that it was caused by the retry cache.
> So I set dfs.namenode.enable.retrycache to false, and the restart process
> finished in half an hour.
> I think the retry cache is useless during startup, at least during the
> recovery process.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)