[ 
https://issues.apache.org/jira/browse/HDFS-14694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190977#comment-17190977
 ] 

Ayush Saxena commented on HDFS-14694:
-------------------------------------

Thanx [~leosun08] for the update, Couple  of comments :

{code:java}
+      if (recoverOnCloseException) {
+        try {
+          dfsClient.endFileLease(fileId);
+          dfsClient.recoverLease(src);
+          leaseRecovered = true;
+        } catch (Exception e) {
+          // Ignore exception rendered by recoverLease. Throw original
+          // exception
+        }
+      }
{code}

* This can be refactored into a single method and can be used at both places, 
You can extend a javadoc for the new method, explaining what it does.


{code:java}
-  private void completeFile() throws IOException {
+  protected void completeFile() throws IOException {
{code}

* Add {{@VisibleForTesting}} and I think default access, should be enough?

{code:java}
+  public boolean isLeaseRecovered() {
{code}
* Similarly here.

{code:java}
+    String RECOVER_ON_CLOSE_EXCEPTION_KEY =
+        PREFIX + "recover.on.close.exception";
{code}

* Can we change to {{recover.lease.on.close.exception}}?

{code:java}
GenericTestUtils.waitFor(() -> {
          boolean closed;
          try {
            closed = cluster.getFileSystem().isFileClosed(
                new Path("/testExceptionInCloseWithoutRecoverLease"));
          } catch (IOException e) {
            return false;
          }
          return closed;
        }, 1000, 5000);
      } 
{code}

* The wait can be refactored to a method, and you can use in both the tests.

Nit: 
{code:java}
+      Assert.assertTrue(cluster.getFileSystem()

+        Assert.assertFalse(cluster.getFileSystem().isFileClosed(
{code}
* No need for {{Assert.}} {{assertTrue}} and {{assertFalse}} are already 
imported.




> Call recoverLease on DFSOutputStream close exception
> ----------------------------------------------------
>
>                 Key: HDFS-14694
>                 URL: https://issues.apache.org/jira/browse/HDFS-14694
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Chen Zhang
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14694.001.patch, HDFS-14694.002.patch, 
> HDFS-14694.003.patch, HDFS-14694.004.patch, HDFS-14694.005.patch, 
> HDFS-14694.006.patch, HDFS-14694.007.patch, HDFS-14694.008.patch, 
> HDFS-14694.009.patch, HDFS-14694.010.patch, HDFS-14694.011.patch
>
>
> HDFS uses file-lease to manage opened files, when a file is not closed 
> normally, NN will recover lease automatically after hard limit exceeded. But 
> for a long running service(e.g. HBase), the hdfs-client will never die and NN 
> don't have any chances to recover the file.
> Usually client program needs to handle exceptions by themself to avoid this 
> condition(e.g. HBase automatically call recover lease for files that not 
> closed normally), but in our experience, most services (in our company) don't 
> process this condition properly, which will cause lots of files in abnormal 
> status or even data loss.
> This Jira propose to add a feature that call recoverLease operation 
> automatically when DFSOutputSteam close encounters exception. It should be 
> disabled by default, but when somebody builds a long-running service based on 
> HDFS, they can enable this option.
> We've add this feature to our internal Hadoop distribution for more than 3 
> years, it's quite useful according our experience.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to