[ 
https://issues.apache.org/jira/browse/HADOOP-18781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739227#comment-17739227
 ] 

ASF GitHub Bot commented on HADOOP-18781:
-----------------------------------------

mukund-thakur commented on code in PR #5780:
URL: https://github.com/apache/hadoop/pull/5780#discussion_r1248245079


##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsOutputStream.java:
##########
@@ -90,36 +93,51 @@ public void testMaxRequestsAndQueueCapacity() throws Exception {
 
   /**
    * Verify the passing of AzureBlobFileSystem reference to AbfsOutputStream
-   * to make sure that the FS instance is not eligible for GC.
-   *
+   * to make sure that the FS instance is not eligible for GC while writing.
    */
-  @Test
+  @Test(timeout = TEST_EXECUTION_TIMEOUT)
   public void testAzureBlobFileSystemBackReferenceInOutputStream()
       throws Exception {
-    AzureBlobFileSystem fs1 = new AzureBlobFileSystem();
-    fs1.initialize(new URI(getTestUrl()), getRawConfiguration());
-    Path pathFs1 = path(getMethodName() + "1");
 
-    AzureBlobFileSystem fs2 = new AzureBlobFileSystem();
-    fs2.initialize(new URI(getTestUrl()), getRawConfiguration());
-    Path pathFs2 = path(getMethodName() + "2");
-
-    try(AbfsOutputStream out1 = createAbfsOutputStreamWithFlushEnabled(fs1,
-        pathFs1)) {
-      Assert.assertFalse("BackReference in output stream should not be null",
-          out1.getFsBackRef().isNull());
-      Assert.assertEquals("Mismatch in Filesystem reference this outputStream"
-              + " should have",
-          fs1, out1.getFsBackRef().getReference());
+    byte[] testBytes = new byte[5 * 1024];
+    // Creating an output stream using a FS in a separate method to make the
+    // FS instance used eligible for GC. Since when a method is popped from
+    // the stack frame, it's variables become anonymous, this creates higher
+    // chance of getting Garbage collected.
+    try (AbfsOutputStream out = getStream()) {
+
+      // Every 5KB block written is flushed and a GC is hinted, if the
+      // executor service is shut down in between, the test should fail
+      // indicating premature shutdown while writing.
+      for (int i = 0; i < 5; i++) {
+        out.write(testBytes);
+        out.flush();
+        System.gc();
+        Assertions.assertThat(

Review Comment:
   Should we move the assertions out of the for loop?





> ABFS Output stream thread pools getting shutdown during GC.
> -----------------------------------------------------------
>
>                 Key: HADOOP-18781
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18781
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>            Reporter: Mehakmeet Singh
>            Assignee: Mehakmeet Singh
>            Priority: Major
>              Labels: pull-request-available
>
> Applications use AzureBlobFileSystem to create an AbfsOutputStream and then 
> write through it. However, the output stream holds no reference to the FS 
> instance that created it, so the FS instance can become eligible for GC 
> while the stream is still in use. When that happens, AzureBlobFileSystem's 
> `finalize()` method is called, which closes the FS and, in turn, the 
> AzureBlobFileSystemStore. The store uses the same thread pool as the 
> AbfsOutputStream, so the pool is shut down while writes are still running 
> in the background, and the write hangs.
>  
> *Solution:*
> Pass a back-reference to the AzureBlobFileSystem into 
> AzureBlobFileSystemStore and AbfsOutputStream.
>  
> The same should be done for AbfsInputStream.
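
For readers who want the idea behind the fix in isolation, here is a minimal sketch of the back-reference pattern described in the issue. It is not the ABFS code: `BackRefSketch`, `OwnerFs`, `OutStream`, `openStream()` and `poolUsable()` are hypothetical stand-ins, and no finalizer is modelled. The only point shown is that a stream holding a strong reference to its creator keeps the creator, and therefore the shared thread pool, reachable for as long as the stream is in use.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical names throughout; these are NOT the real ABFS classes.
public class BackRefSketch {

  /** Stand-in for the file system that owns the shared thread pool. */
  static class OwnerFs implements AutoCloseable {
    final ExecutorService pool = Executors.newFixedThreadPool(2);

    OutStream create() {
      // Hand `this` to the stream so the stream keeps the owner reachable.
      return new OutStream(this);
    }

    @Override
    public void close() {
      pool.shutdown();
    }
  }

  /** Stand-in for an output stream that does its work on the owner's pool. */
  static class OutStream implements AutoCloseable {
    private OwnerFs fsBackRef; // the back-reference

    OutStream(OwnerFs fs) {
      this.fsBackRef = fs;
    }

    OwnerFs getFsBackRef() {
      return fsBackRef;
    }

    boolean poolUsable() {
      return fsBackRef != null && !fsBackRef.pool.isShutdown();
    }

    /** Runs the "upload" on the shared pool and returns the byte count. */
    int write(byte[] data) throws Exception {
      return fsBackRef.pool.submit(() -> data.length).get();
    }

    @Override
    public void close() {
      // Drop the back-reference once the stream is done with the pool.
      fsBackRef = null;
    }
  }

  /** Creates the owner in a separate method so the caller never holds it. */
  static OutStream openStream() {
    return new OwnerFs().create();
  }

  /** Returns true if the pool stayed usable across every write and GC hint. */
  public static boolean demo() throws Exception {
    OutStream out = openStream();
    boolean ok = true;
    try {
      for (int i = 0; i < 5; i++) {
        out.write(new byte[5 * 1024]);
        System.gc(); // only a hint; the JVM may ignore it
        ok &= out.poolUsable();
      }
    } finally {
      // Explicit cleanup for this sketch so the pool threads exit.
      OwnerFs fs = out.getFsBackRef();
      out.close();
      fs.close();
    }
    return ok;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("pool usable throughout: " + demo());
  }
}
```

In this sketch the stream clears its back-reference on `close()`, so the owner becomes collectable only after the stream has finished. Whether the real classes release the reference at that point is an assumption here, not something stated in the issue.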



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
