[
https://issues.apache.org/jira/browse/HADOOP-18793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17741189#comment-17741189
]
Emanuel Velzi edited comment on HADOOP-18793 at 7/7/23 8:52 PM:
----------------------------------------------------------------
Hi [[email protected]], I have a similar issue related to the method
*cleanupStagingDirs()* when using *magic committers*.
We have two Spark jobs writing to the same s3a directory. We set the property
*spark.hadoop.fs.s3a.committer.abort.pending.uploads=false*
So we see this line in the logs:
DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up
pending uploads to s3a ...
(from
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952)
But we also see in the logs that when the first job finishes, *the __magic
directory is deleted*:
INFO [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting
magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s
(from
[https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137])
I'm not sure, but I think this is affecting the second job, which is still
running.
Could the fact that /__magic *is deleted recursively* (including all
subdirectories such as /__magic/job-1, /__magic/job-2, ...) *be a problem?*
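
To make the concern concrete, here is a minimal sketch that uses a local temporary directory as a stand-in for the s3a table path. It is not the committer's code: the class name, the helper methods, and the file name part-0000.pending are all hypothetical, and a local filesystem only approximates S3A semantics. It compares deleting the shared __magic directory with deleting only the finishing job's own subdirectory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical illustration only; a local temp dir stands in for s3a://my-bucket/my-table.
class MagicDirCleanupSketch {

    /** Recursively delete a tree; reverse ordering removes children before parents. */
    static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Create <table>/__magic/job-1 and <table>/__magic/job-2, each holding one pending file. */
    static Path createTableWithTwoJobs() {
        try {
            Path table = Files.createTempDirectory("my-table");
            for (String job : new String[] {"job-1", "job-2"}) {
                Path jobDir = Files.createDirectories(table.resolve("__magic").resolve(job));
                Files.createFile(jobDir.resolve("part-0000.pending")); // assumed file name
            }
            return table;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Job 1 finishes and deletes the whole __magic directory; is job 2's data still there? */
    static boolean job2SurvivesSharedCleanup() {
        Path table = createTableWithTwoJobs();
        deleteRecursively(table.resolve("__magic"));
        boolean survives = Files.exists(table.resolve("__magic/job-2/part-0000.pending"));
        deleteRecursively(table);
        return survives;
    }

    /** Job 1 finishes and deletes only __magic/job-1; is job 2's data still there? */
    static boolean job2SurvivesPerJobCleanup() {
        Path table = createTableWithTwoJobs();
        deleteRecursively(table.resolve("__magic/job-1"));
        boolean survives = Files.exists(table.resolve("__magic/job-2/part-0000.pending"));
        deleteRecursively(table);
        return survives;
    }

    public static void main(String[] args) {
        System.out.println("shared cleanup keeps job-2 data:  " + job2SurvivesSharedCleanup());
        System.out.println("per-job cleanup keeps job-2 data: " + job2SurvivesPerJobCleanup());
    }
}
```

In this simulation the second job's pending file only survives when cleanup is scoped to the finishing job's own subdirectory, which is the behaviour the question above is asking about.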
> S3A StagingCommitter does not clean up staging-uploads directory
> ----------------------------------------------------------------
>
> Key: HADOOP-18793
> URL: https://issues.apache.org/jira/browse/HADOOP-18793
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.2.2
> Reporter: Harunobu Daikoku
> Priority: Minor
> Labels: pull-request-available
>
> When setting up StagingCommitter and its internal FileOutputCommitter, a
> temporary directory that holds MPU information is created on the default
> FS; by default this is
> /user/${USER}/tmp/staging/${USER}/${UUID}/staging-uploads.
> On a successful job commit, its child directory (_temporary) will be [cleaned
> up|https://github.com/apache/hadoop/blob/a36d8adfd18e88f2752f4387ac4497aadd3a74e7/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/staging/StagingCommitter.java#L516]
> properly, but ${UUID}/staging-uploads will remain.
> This will result in having too many empty ${UUID}/staging-uploads directories
> under /user/${USER}/tmp/staging/${USER}, and will eventually cause an issue
> in an environment where the max number of items in a directory is capped
> (e.g. by dfs.namenode.fs-limits.max-directory-items in HDFS).
> {noformat}
> The directory item limit of /user/${USER}/tmp/staging/${USER} is exceeded:
> limit=1048576 items=1048576
> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:1205)
> {noformat}
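
The accumulation described in the issue can be reproduced in miniature with the sketch below. It is an illustration under stated assumptions, not StagingCommitter's implementation: a local temporary directory stands in for /user/${USER}/tmp/staging/${USER}, and the class and method names are hypothetical. Each simulated job creates ${UUID}/staging-uploads/_temporary and then cleans up either only _temporary (the reported behaviour) or the whole ${UUID} directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

// Hypothetical illustration only; a local temp dir stands in for the staging root on the default FS.
class StagingDirLeakSketch {

    /** Recursively delete a tree; reverse ordering removes children before parents. */
    static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Count the direct children of a directory. */
    static long countChildren(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Simulate the given number of job commits and return how many entries
     * remain directly under the staging root afterwards.
     */
    static long leftoverAfterJobs(int jobs, boolean removeUuidDir) {
        try {
            Path stagingRoot = Files.createTempDirectory("staging");
            for (int i = 0; i < jobs; i++) {
                Path uuidDir = stagingRoot.resolve(UUID.randomUUID().toString());
                Path temporary = uuidDir.resolve("staging-uploads").resolve("_temporary");
                Files.createDirectories(temporary);
                // Reported behaviour: only the _temporary child is cleaned up on commit.
                deleteRecursively(temporary);
                if (removeUuidDir) {
                    // Assumed alternative: also remove the per-job ${UUID} directory.
                    deleteRecursively(uuidDir);
                }
            }
            long leftover = countChildren(stagingRoot);
            deleteRecursively(stagingRoot);
            return leftover;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("leftover, _temporary only: " + leftoverAfterJobs(5, false));
        System.out.println("leftover, whole UUID dir:  " + leftoverAfterJobs(5, true));
    }
}
```

With cleanup limited to _temporary, one empty ${UUID}/staging-uploads directory is left behind per job, so the item count under the staging root grows with the number of jobs until a limit such as dfs.namenode.fs-limits.max-directory-items is reached.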