[ 
https://issues.apache.org/jira/browse/HADOOP-18781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740192#comment-17740192
 ] 

ASF GitHub Bot commented on HADOOP-18781:
-----------------------------------------

steveloughran commented on code in PR #5780:
URL: https://github.com/apache/hadoop/pull/5780#discussion_r1253062630


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java:
##########
@@ -493,6 +494,12 @@ public synchronized void close() throws IOException {
     }
 
     try {
+      // Check if Executor Service got shutdown before the writes could be
+      // completed.
+      if (hasActiveBlockDataToUpload() && executorService.isShutdown()) {
+        throw new IOException("Executor Service closed before writes could be"

Review Comment:
   throw a PathIOException with the output stream path; gives better 
diagnostics about what failed to be written. whoever fields support calls will 
appreciate this. I think L515 needs the same treatment.
   
   now, when this is thrown there's no need to catch and wrap it as it is not 
from flushInternal, so need a way to avoid this. maybe add some boolean on L495 
"exceptionCreated", which, if true, allows you to throw the exception on L509 
without wrapping.





> ABFS Output stream thread pools getting shutdown during GC.
> -----------------------------------------------------------
>
>                 Key: HADOOP-18781
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18781
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>            Reporter: Mehakmeet Singh
>            Assignee: Mehakmeet Singh
>            Priority: Major
>              Labels: pull-request-available
>
> Applications using AzureBlobFileSystem to create the AbfsOutputStream can use 
> the AbfsOutputStream for the purpose of writing, however, the OutputStream 
> doesn't hold any reference to the fs instance that created it, which can make 
> the FS instance eligible for GC, when this occurs, AzureblobFileSystem's 
> `finalize()` method gets called which in turn closes the FS, and in turn call 
> the close for AzureBlobFileSystemStore, which uses the same Threadpool that 
> is used by the AbfsOutputStream. This leads to the closing of the thread pool 
> while the writing is happening in the background and leads to hanging while 
> writing.
>  
> *Solution:*
> Pass a backreference of AzureBlobFileSystem into AzureBlobFileSystemStore and 
> AbfsOutputStream as well.
>  
> Same should be done for AbfsInputStream as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to