[ 
https://issues.apache.org/jira/browse/HADOOP-18636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18636:
------------------------------------
    Description: 
The s3a and abfs clients use LocalDirAllocator for allocating files in local 
(temporary) storage for buffering blocks to write, and, for the s3a staging 
committer, files being staged. 
When initialized (or when the configuration key value is updated) 
LocalDirAllocator enumerates all directories in the list and calls {{mkdirs()}} 
to create them.

When you actually ask for a file, it will look for the parent dir, and will 
again call {{mkdirs()}}. 

But before it does that, it checks whether the dir has any space; if not, the 
dir is excluded from the list of directories with room for data.

And guess what: directories which don't exist report as having no space. So 
they get excluded, and the recreation code doesn't get a chance to run.
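
A minimal sketch of that failure mode, in plain Java rather than the actual 
LocalDirAllocator code. It assumes the free-space probe behaves like 
{{java.io.File.getUsableSpace()}}, which returns 0 for a path that does not 
exist. The class and method names ({{MissingDirSpaceDemo}}, {{withRoom}}, 
{{withRoomAfterMkdirs}}) are hypothetical and only illustrate the ordering 
problem: probing space before recreating the directory filters the directory 
out, while recreating it first lets it stay in the candidate list.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MissingDirSpaceDemo {

  /** Returns a path under a fresh temp dir whose directory tree does not exist. */
  public static File missingDir() {
    try {
      File base = Files.createTempDirectory("lda-demo").toFile();
      return new File(base, "s3a/buffer");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Space check only: a missing directory reports 0 bytes and is dropped. */
  public static List<File> withRoom(List<File> dirs, long needed) {
    List<File> result = new ArrayList<>();
    for (File dir : dirs) {
      if (dir.getUsableSpace() >= needed) {
        result.add(dir);
      }
    }
    return result;
  }

  /** Hypothetical alternative: recreate the directory, then check its space. */
  public static List<File> withRoomAfterMkdirs(List<File> dirs, long needed) {
    List<File> result = new ArrayList<>();
    for (File dir : dirs) {
      boolean present = dir.isDirectory() || dir.mkdirs();
      if (present && dir.getUsableSpace() >= needed) {
        result.add(dir);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    File dir = missingDir();
    List<File> dirs = Collections.singletonList(dir);
    System.out.println("usable space of missing dir: " + dir.getUsableSpace());
    System.out.println("space check only:   " + withRoom(dirs, 1L).size());
    System.out.println("mkdirs, then check: " + withRoomAfterMkdirs(dirs, 1L).size());
  }
}
```

The second method is one possible ordering, not a description of the patch 
attached to this issue.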




  was:
The s3a and abfs clients use LocalDirAllocator for allocating files in local 
(temporary) storage for buffering blocks to write, and, for the s3a staging 
committer, files being staged. 
When initialized (or when the configuration key value is updated) 
LocalDirAllocator enumerates all directories in the list and calls {{mkdirs()}} 
to create them.

When you actually ask for a file, it will look for the parent dir, but it calls 
{{mkdir()}}, rather than {{mkdirs()}}.

This means it will recreate a missing parent directory but cannot recover from 
a missing grandparent. If the temp directory is cleaned up during the life of 
an application, the application can fail.

Fix: add an "s" in the right place in the production code, plus a new test.
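
For reference, the {{mkdir()}} / {{mkdirs()}} difference this earlier 
description relied on can be shown with plain {{java.io.File}}. This is a 
sketch with a hypothetical class name ({{MkdirVsMkdirs}}), not the Hadoop 
code path itself: {{mkdir()}} creates only the last path element and fails 
when an ancestor is missing, while {{mkdirs()}} creates the missing ancestors 
too.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

public class MkdirVsMkdirs {

  /** Returns {mkdir result, mkdirs result} for a path whose parent is missing. */
  public static boolean[] demo() {
    try {
      File base = Files.createTempDirectory("mkdir-demo").toFile();
      // "parent" does not exist, so "child" has a missing ancestor.
      File leaf = new File(base, "parent/child");
      boolean single = leaf.mkdir();     // cannot create the missing parent
      boolean recursive = leaf.mkdirs(); // creates parent and child
      return new boolean[] {single, recursive};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] r = demo();
    System.out.println("mkdir=" + r[0] + " mkdirs=" + r[1]); // mkdir=false mkdirs=true
  }
}
```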



> LocalDirAllocator cannot recover from directory tree deletion during the life 
> of a filesystem client
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-18636
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18636
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs, fs/azure, fs/s3
>    Affects Versions: 3.3.4
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>              Labels: pull-request-available



