[ 
https://issues.apache.org/jira/browse/HDFS-15704?focusedWorklogId=524104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-524104
 ]

ASF GitHub Bot logged work on HDFS-15704:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Dec/20 20:50
            Start Date: 14/Dec/20 20:50
    Worklog Time Spent: 10m 
      Work Description: jbrennan333 commented on a change in pull request #2511:
URL: https://github.com/apache/hadoop/pull/2511#discussion_r542744744



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java
##########
@@ -92,21 +91,15 @@
   private long lastHolderUpdateTime;
   private String internalLeaseHolder;
 
+  //
   // Used for handling lock-leases
   // Mapping: leaseHolder -> Lease
-  private final SortedMap<String, Lease> leases = new TreeMap<>();
-  // Set of: Lease
-  private final NavigableSet<Lease> sortedLeases = new TreeSet<>(
-      new Comparator<Lease>() {
-        @Override
-        public int compare(Lease o1, Lease o2) {
-          if (o1.getLastUpdate() != o2.getLastUpdate()) {
-            return Long.signum(o1.getLastUpdate() - o2.getLastUpdate());
-          } else {
-            return o1.holder.compareTo(o2.holder);
-          }
-        }
-  });
+  // TreeMap has O(log(n)) complexity but it is more space efficient
+  //             compared to HashMap. Therefore, replacing TreeMap with a
+  //             HashMap can be considered to get faster O(1) time complexity
+  //             on the expense of 30% memory waste.
+  //

Review comment:
       This explanation belongs in the Jira, not in a comment in the code.  
When looking at current code, it's not really clear why you are talking about 
TreeMap at all.
   
   Also, I think this comment misses the point of the TreeMap.  The reason a 
TreeMap was used here was to maintain a sorted order, which allowed the 
checkLeases() to exit the while loop as soon as it hit an unexpired lease. 
   
   The new design removes the need for the TreeMap by pruning the list it 
passes to checkLeases().

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java
##########
@@ -541,10 +550,10 @@ public void run() {
               fsnamesystem.getEditLog().logSync();
             }
           }
-  
-          Thread.sleep(fsnamesystem.getLeaseRecheckIntervalMs());
         } catch(InterruptedException ie) {
-          LOG.debug("{} is interrupted", name, ie);
+          if (LOG.isDebugEnabled()) {

Review comment:
       This LOG.isDebugEnabled() check is not needed anymore.

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java
##########
@@ -515,6 +508,13 @@ public void setLeasePeriod(long softLimit, long hardLimit) 
{
     this.softLimit = softLimit;
     this.hardLimit = hardLimit; 
   }
+
+  private synchronized Collection<Lease> getExpiredCandidateLeases() {
+    final long now = Time.monotonicNow();
+    return leases.values().stream()
+        .filter(lease -> lease.expiredHardLimit(now))
+        .collect(Collectors.toCollection(HashSet::new));
+  }

Review comment:
       I much prefer the loop in @daryn-sharp's original code.
   
       Collection<Lease> expired = new HashSet<>();
       for (Lease lease : leases) {
         if (lease.expiredHardLimit(now)) {
           expired.add(lease);
         }
       }
   This streams code will have to change if we want to pull this back to branch 
2.
   I think @daryn-sharp  also said that stream()'s are more expensive.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 524104)
    Time Spent: 1h  (was: 50m)

> Mitigate lease monitor's rapid infinite loop
> --------------------------------------------
>
>                 Key: HDFS-15704
>                 URL: https://issues.apache.org/jira/browse/HDFS-15704
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> [~daryn] reported that the lease monitor goes into a rapid infinite loop if 
> an exception occurs during a lease recovery.  The two main issues are:
> # lease monitor thread does not sleep if an exception occurs before looping 
> again
> # the loop peeks at the first element of a sorted tree set so when an 
> exception occurs, the "bad" lease remains as the first element preventing 
> recovery of other leases.
> This jira is not intended to fix the underlying issues causing the exception 
> during recovery but merely to mitigate the cited issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to