[
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841792#comment-13841792
]
Nicolas Liochon commented on HBASE-10000:
-----------------------------------------
{code}
+ if (!exceptions.isEmpty()) {
+ LOG.debug("Encountered " + exceptions.size() + " exceptions");
+ throw exceptions.get(0);
+ }
{code}
=> Should be info or warning
{code}
+ public static final int LEASE_RECOVERY_UNREQUESTED = 0;
{code}
=> should be a long, no?
{code}
+ if (findIsFileClosedMeth) {
+ try {
+ isFileClosedMeth = dfs.getClass().getMethod("isFileClosed",
+ new Class[]{ Path.class });
+ } catch (NoSuchMethodException nsme) {
+ LOG.debug("isFileClosed not available");
+ } finally {
+ findIsFileClosedMeth = false;
+ }
+ }
{code}
=> findIsFileClosedMeth seems to be always true at the beginning, and not read
later (i.e. the finally clause is not needed)
The code in this method is very complex to read (that's not your fault :-) ),
I think if you change it you need to restructure it as well.
{code}
+ if (ts == HConstants.LEASE_RECOVERY_UNREQUESTED) {
+ startWaiting = EnvironmentEdgeManager.currentTimeMillis();
+ recovered = recoverLease(dfs, nbAttempt, p, startWaiting);
+ } else {
+ startWaiting = ts;
+ }
{code}
=> this seems wrong (for each loop, we will reset "startWaiting = ts") or may
be it should be outside of the loop
I'm not sure of the previous version, but I think we must be ready to do
multiple calls to recoverLease, in case the namenode crashed at the wrong time
or something alike. With this version, it seems it won't be the case if the
calls was made by the master. If I read correctly, the previous versions was
doing multiple calls.
As well, if I'm not wrong, if the master initiated the recovey but
ifFileClosed is not available, we will never succeed. If this case is not
covered voluntary this should be documented.
{code}
// On the first time through wait the short 'firstPause'.
if (nbAttempt == 0) {
Thread.sleep(firstPause);
{code}
=> Should this be changed if the master initiated the recoverLease? No need to
wait 4s.
=> This code is not needed when isFileClosed is available (as it's cheap, we
don't want to wait: we prefer to do the call sooner)
What's the target version? Could we say that 'isFileClosed' will be mandatory
there? This would simplify the code.
(I haven't reviewed everything in details, I looked especially at FSHDFSUtils).
> Initiate lease recovery for outstanding WAL files at the very beginning of
> recovery
> -----------------------------------------------------------------------------------
>
> Key: HBASE-10000
> URL: https://issues.apache.org/jira/browse/HBASE-10000
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Yu
> Assignee: Ted Yu
> Fix For: 0.98.1
>
> Attachments: 10000-recover-ts-with-pb-2.txt,
> 10000-recover-ts-with-pb-3.txt, 10000-v1.txt, 10000-v4.txt, 10000-v5.txt,
> 10000-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is
> closed.
> Thanks to Nicolas Liochon and Jeffery discussion with whom gave rise to this
> idea.
--
This message was sent by Atlassian JIRA
(v6.1#6144)