[
https://issues.apache.org/jira/browse/HADOOP-18781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739226#comment-17739226
]
ASF GitHub Bot commented on HADOOP-18781:
-----------------------------------------
mukund-thakur commented on code in PR #5780:
URL: https://github.com/apache/hadoop/pull/5780#discussion_r1248244590
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsOutputStream.java:
##########
@@ -90,36 +93,51 @@ public void testMaxRequestsAndQueueCapacity() throws Exception {
/**
* Verify the passing of AzureBlobFileSystem reference to AbfsOutputStream
- * to make sure that the FS instance is not eligible for GC.
- *
+ * to make sure that the FS instance is not eligible for GC while writing.
*/
- @Test
+ @Test(timeout = TEST_EXECUTION_TIMEOUT)
public void testAzureBlobFileSystemBackReferenceInOutputStream()
throws Exception {
- AzureBlobFileSystem fs1 = new AzureBlobFileSystem();
- fs1.initialize(new URI(getTestUrl()), getRawConfiguration());
- Path pathFs1 = path(getMethodName() + "1");
- AzureBlobFileSystem fs2 = new AzureBlobFileSystem();
- fs2.initialize(new URI(getTestUrl()), getRawConfiguration());
- Path pathFs2 = path(getMethodName() + "2");
-
- try(AbfsOutputStream out1 = createAbfsOutputStreamWithFlushEnabled(fs1,
- pathFs1)) {
- Assert.assertFalse("BackReference in output stream should not be null",
- out1.getFsBackRef().isNull());
- Assert.assertEquals("Mismatch in Filesystem reference this outputStream"
- + " should have",
- fs1, out1.getFsBackRef().getReference());
+ byte[] testBytes = new byte[5 * 1024];
+ // Create the output stream in a separate method so that the FS instance
+ // used becomes eligible for GC. Once that method returns and its stack
+ // frame is popped, its local variables are no longer reachable, which
+ // makes it more likely that the FS instance is garbage collected.
Review Comment:
interesting
> ABFS Output stream thread pools getting shutdown during GC.
> -----------------------------------------------------------
>
> Key: HADOOP-18781
> URL: https://issues.apache.org/jira/browse/HADOOP-18781
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/azure
> Reporter: Mehakmeet Singh
> Assignee: Mehakmeet Singh
> Priority: Major
> Labels: pull-request-available
>
> Applications that use AzureBlobFileSystem to create an AbfsOutputStream can
> write through that stream, but the stream holds no reference to the FS
> instance that created it, so the FS instance can become eligible for GC while
> the stream is still in use. When that happens, AzureBlobFileSystem's
> `finalize()` method is called and closes the FS, which in turn closes the
> AzureBlobFileSystemStore. The store uses the same thread pool as the
> AbfsOutputStream, so the pool is shut down while writes are still running in
> the background, and the write hangs.
>
> *Solution:*
> Pass a back-reference to the AzureBlobFileSystem into both
> AzureBlobFileSystemStore and AbfsOutputStream.
>
> The same should be done for AbfsInputStream.
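
For readers following the thread, the back-reference idea in the issue description can be sketched roughly as below. This is a minimal illustration, not the code in the PR: `BackReferenceSketch`, `OwnerFs`, `SketchStream` and `writeAndWait` are hypothetical stand-ins, and the `BackReference` class here is a simplified assumption that only mirrors the `isNull()` / `getReference()` calls visible in the test diff above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; none of these types are the real hadoop-azure classes.
class BackReferenceSketch {

    /** Holder whose only job is to keep its target strongly reachable. */
    static final class BackReference {
        private final Object reference;

        BackReference(Object reference) {
            this.reference = reference;
        }

        boolean isNull() {
            return reference == null;
        }

        Object getReference() {
            return reference;
        }
    }

    /** Stand-in for the FS: owns the shared pool and shuts it down on close. */
    static final class OwnerFs implements AutoCloseable {
        private final ExecutorService pool = Executors.newFixedThreadPool(2);

        SketchStream create() {
            // The stream receives the shared pool plus a back-reference to this owner.
            return new SketchStream(pool, new BackReference(this));
        }

        @Override
        public void close() {
            pool.shutdown();
        }
    }

    /** Stand-in for the output stream: writes run on the owner's pool. */
    static final class SketchStream {
        private final ExecutorService pool;
        // While the stream is reachable, so is the owner that created it.
        private final BackReference fsBackRef;

        SketchStream(ExecutorService pool, BackReference fsBackRef) {
            this.pool = pool;
            this.fsBackRef = fsBackRef;
        }

        BackReference getFsBackRef() {
            return fsBackRef;
        }

        /** Submits a fake "write" to the shared pool and waits for it to finish. */
        int writeAndWait(byte[] data) {
            Callable<Integer> task = () -> data.length;
            try {
                return pool.submit(task).get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    /** Creates the owner in its own method, so no caller-side local refers to it. */
    static SketchStream openStream() {
        return new OwnerFs().create();
    }

    public static void main(String[] args) {
        SketchStream out = openStream();
        System.out.println("back-reference null? " + out.getFsBackRef().isNull());
        System.out.println("bytes written: " + out.writeAndWait(new byte[5 * 1024]));
        // Close the owner explicitly once writing is done.
        ((OwnerFs) out.getFsBackRef().getReference()).close();
    }
}
```

The sketch does not try to trigger GC, since collection timing is not deterministic. It only shows the reachability chain: as long as the stream is reachable, its back-reference keeps the owner reachable, so the owner cannot be finalized and shut the pool down while a write is in flight.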