[
https://issues.apache.org/jira/browse/HADOOP-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944624#comment-16944624
]
Steve Loughran commented on HADOOP-16632:
-----------------------------------------
This test is failing because the successful job should have deleted the __magic
directory, but the assertion finds the output of a task attempt there. That is
potentially a sign of something going very wrong.
As it is, all is good except for the cleanup, which is beyond our control.
For some reason this task was speculatively executed, and the second attempt
`_0003_m_000008_1` was not committed before the job completed and the AM
terminated. When the worker requested permission to commit, it couldn't connect
over the (now broken) RPC channel, retried a bit and gave up.
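For reference, the worker-side commit handshake amounts to roughly the
following (a simplified sketch of what org.apache.hadoop.mapred.Task does; the
real code adds progress reporting, retry and interrupt handling):
{code:java}
// Sketch only: ask the committer whether there is anything to commit,
// tell the AM a commit is pending, then poll for permission. The
// commitPending() RPC is the call which failed here with the
// EOFException once the AM had terminated.
void commitIfNeeded(TaskUmbilicalProtocol umbilical,
    TaskAttemptID taskId, TaskStatus status,
    OutputCommitter committer, TaskAttemptContext context)
    throws IOException, InterruptedException {
  if (committer.needsTaskCommit(context)) {
    umbilical.commitPending(taskId, status);
    while (!umbilical.canCommit(taskId)) {
      Thread.sleep(1000);            // poll until the AM grants the commit
    }
    committer.commitTask(context);
  }
}
{code}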
h3. task attempt_1570197469968_0003_m_000008_1
{code:java}
2019-10-04 15:02:34,237 INFO [main]
org.apache.hadoop.fs.s3a.commit.magic.MagicCommitTracker: File
s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks/attempt_1570197469968_0003_m_000008_1/__base/part-m-00008
is written as magic file to path
fork-0001/test/ITestS3ACommitterMRJob-execute-magic/part-m-00008
2019-10-04 15:02:35,717 INFO [main]
org.apache.hadoop.fs.s3a.commit.magic.MagicCommitTracker: Uncommitted data
pending to file
s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks/attempt_1570197469968_0003_m_000008_1/__base/part-m-00008;
commit metadata for 1 parts in
fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks/attempt_1570197469968_0003_m_000008_1/__base/part-m-00008.pending.
sixe: 4690 byte(s)
2019-10-04 15:02:36,877 INFO [main]
org.apache.hadoop.fs.s3a.S3ABlockOutputStream: File
fork-0001/test/ITestS3ACommitterMRJob-execute-magic/part-m-00008 will be
visible when the job is committed
2019-10-04 15:02:36,888 INFO [main] org.apache.hadoop.mapred.Task:
Task:attempt_1570197469968_0003_m_000008_1 is done. And is in the process of
committing
2019-10-04 15:02:36,889 INFO [main]
org.apache.hadoop.fs.s3a.commit.magic.MagicS3GuardCommitter: Starting:
needsTaskCommit task attempt_1570197469968_0003_m_000008_1
2019-10-04 15:02:39,118 INFO [main]
org.apache.hadoop.fs.s3a.commit.magic.MagicS3GuardCommitter: needsTaskCommit
task attempt_1570197469968_0003_m_000008_1: duration 0:02.230s
2019-10-04 15:02:39,124 WARN [main] org.apache.hadoop.mapred.Task: Failure
sending commit pending: java.io.EOFException: End of File Exception between
local host is: "HW13176-2.local/192.168.1.6"; destination host is:
"localhost":57166; : java.io.EOFException; For more details see:
http://wiki.apache.org/hadoop/EOFException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:791)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1499)
at org.apache.hadoop.ipc.Client.call(Client.java:1396)
at
org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:251)
at com.sun.proxy.$Proxy8.commitPending(Unknown Source)
at org.apache.hadoop.mapred.Task.done(Task.java:1253)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:351)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at
org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1872)
at
org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1182)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1078)
{code}
h3. task attempt_1570197469968_0003_m_000008_1
And it looks like MR task failure is a no-warning System.exit(-1) call -the
workers do not get a chance to do any cleanup themselves.
{code:java}
2019-10-04 15:03:39,315 WARN [communication thread]
org.apache.hadoop.mapred.Task: Last retry, killing
attempt_1570197469968_0003_m_000008_1
{code}
Another attempt at the same task had already been committed. Thus: the
output of this attempt was not needed - all is good.
h3. AM
{code:java}
2019-10-04 15:02:29,487 [AsyncDispatcher event handler] INFO
impl.TaskAttemptImpl (TaskAttemptImpl.java:handle(1383)) -
attempt_1570197469968_0003_m_000008_0 TaskAttempt Transitioned from
COMMIT_PENDING to SUCCESS_FINISHING_CONTAINER
2019-10-04 15:02:29,488 [AsyncDispatcher event handler] INFO impl.TaskImpl
(TaskImpl.java:sendTaskSucceededEvents(759)) - Task succeeded with attempt
attempt_1570197469968_0003_m_000008_0
2019-10-04 15:02:29,488 [AsyncDispatcher event handler] INFO impl.TaskImpl
(TaskImpl.java:transition(963)) - Issuing kill to other attempt
attempt_1570197469968_0003_m_000008_1
2019-10-04 15:02:29,488 [AsyncDispatcher event handler] INFO impl.TaskImpl
(TaskImpl.java:handle(665)) - task_1570197469968_0003_m_000008 Task
Transitioned from RUNNING to SUCCEEDED
2019-10-04 15:02:29,489 [AsyncDispatcher event handler] INFO impl.JobImpl
(JobImpl.java:transition(1979)) - Num completed Tasks: 9
2019-10-04 15:02:29,490 [AsyncDispatcher event handler] INFO
impl.TaskAttemptImpl (TaskAttemptImpl.java:handle(1383)) -
attempt_1570197469968_0003_m_000008_1 TaskAttempt Transitioned from RUNNING to
KILL_CONTAINER_CLEANUP
2019-10-04 15:02:29,490 [ContainerLauncher #1] INFO
launcher.ContainerLauncherImpl (ContainerLauncherImpl.java:run(382)) -
Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container
container_1570197469968_0003_01_000014 taskAttempt
attempt_1570197469968_0003_m_000008_1
2019-10-04 15:02:29,490 [ContainerLauncher #1] INFO
launcher.ContainerLauncherImpl (ContainerLauncherImpl.java:kill(208)) - KILLING
attempt_1570197469968_0003_m_000008_1
{code}
But it left its output directory behind. Which is interesting, as the
MagicS3GuardCommitter task abort code did delete the directory (see the sketch
after the log below):
h3. AM
{code:java}
2019-10-04 15:02:29,502 [AsyncDispatcher event handler] INFO
impl.TaskAttemptImpl (TaskAttemptImpl.java:handle(1383)) -
attempt_1570197469968_0003_m_000008_1 TaskAttempt Transitioned from
KILL_CONTAINER_CLEANUP to KILL_TASK_CLEANUP
2019-10-04 15:02:29,503 [CommitterEvent Processor #2] INFO
commit.CommitterEventHandler (CommitterEventHandler.java:run(232)) - Processing
the event EventType: TASK_ABORT
2019-10-04 15:02:29,504 [CommitterEvent Processor #2] INFO
magic.MagicS3GuardCommitter (DurationInfo.java:<init>(72)) - Starting: Abort
task attempt_1570197469968_0003_m_000008_1
2019-10-04 15:02:29,513 [IPC Server handler 25 on default port 57166] INFO
mapred.TaskAttemptListenerImpl (TaskAttemptListenerImpl.java:statusUpdate(413))
- Progress of TaskAttempt attempt_1570197469968_0003_m_000009_0 is : 1.0
2019-10-04 15:02:29,516 [IPC Server handler 4 on default port 57166] INFO
mapred.TaskAttemptListenerImpl (TaskAttemptListenerImpl.java:done(273)) - Done
acknowledgment from attempt_1570197469968_0003_m_000009_0
2019-10-04 15:02:29,516 [AsyncDispatcher event handler] INFO
impl.TaskAttemptImpl (TaskAttemptImpl.java:handle(1383)) -
attempt_1570197469968_0003_m_000009_0 TaskAttempt Transitioned from
COMMIT_PENDING to SUCCESS_FINISHING_CONTAINER
2019-10-04 15:02:29,517 [AsyncDispatcher event handler] INFO impl.TaskImpl
(TaskImpl.java:sendTaskSucceededEvents(759)) - Task succeeded with attempt
attempt_1570197469968_0003_m_000009_0
2019-10-04 15:02:29,520 [AsyncDispatcher event handler] INFO impl.TaskImpl
(TaskImpl.java:handle(665)) - task_1570197469968_0003_m_000009 Task
Transitioned from RUNNING to SUCCEEDED
2019-10
{code}
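That abort path amounts to roughly this (a simplified sketch;
{{abortPendingUploadsUnderPath}} is a hypothetical stand-in for the real
commit-operations helper):
{code:java}
// Sketch only: abort any multipart uploads recorded in .pending files
// under the attempt's directory, then delete the directory itself.
public void abortTask(TaskAttemptContext context) throws IOException {
  Path attemptPath = getTaskAttemptPath(context);
  FileSystem fs = attemptPath.getFileSystem(context.getConfiguration());
  try {
    // every .pending file already written is an uncommitted upload
    abortPendingUploadsUnderPath(attemptPath);   // hypothetical helper
  } finally {
    fs.delete(attemptPath, true);   // removes __magic/.../attempt_*/
  }
}
{code}
Here the attempt's .pending file did not yet exist when the abort ran, so
there was nothing for it to find.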
And in job teardown, everything under __magic is destroyed, which is logged via
the DDB operations trace:
{code:java}
2019-10-04 15:02:34,923 [CommitterEvent Processor #3] INFO
commit.AbstractS3ACommitter (DurationInfo.java:close(87)) - Aborting all
pending commits under
s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic:
duration 0:00.245s
2019-10-04 15:02:34,923 [CommitterEvent Processor #3] INFO
commit.AbstractS3ACommitter (DurationInfo.java:close(87)) - Cleanup job
job_1570197469968_0003: duration 0:00.245s
2019-10-04 15:02:34,924 [CommitterEvent Processor #3] INFO
commit.AbstractS3ACommitter (HadoopExecutors.java:shutdown(118)) - Gracefully
shutting down executor service. Waiting max 30 SECONDS
2019-10-04 15:02:34,924 [CommitterEvent Processor #3] INFO
commit.AbstractS3ACommitter (HadoopExecutors.java:shutdown(129)) - Succesfully
shutdown executor service
2019-10-04 15:02:34,925 [CommitterEvent Processor #3] INFO
magic.MagicS3GuardCommitter (DurationInfo.java:<init>(72)) - Starting: Deleting
magic directory
s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic
2019-10-04 15:02:35,492 [s3a-transfer-shared-pool3-t3] DEBUG s3guard.Operations
(DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009) TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000000.pendingset
2019-10-04 15:02:35,492 [s3a-transfer-shared-pool3-t4] DEBUG s3guard.Operations
(DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009) TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000001.pendingset
2019-10-04 15:02:35,492 [s3a-transfer-shared-pool3-t5] DEBUG s3guard.Operations
(DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009) TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000002.pendingset
2019-10-04 15:02:35,493 [s3a-transfer-shared-pool3-t6] DEBUG s3guard.Operations
(DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009) TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000003.pendingset
2019-10-04 15:02:35,493 [s3a-transfer-shared-pool3-t7] DEBUG s3guard.Operations
(DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009) TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000004.pendingset
2019-10-04 15:02:35,493 [s3a-transfer-shared-pool3-t8] DEBUG s3guard.Operations
(DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009) TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000005.pendingset
2019-10-04 15:02:35,493 [s3a-transfer-shared-pool3-t9] DEBUG s3guard.Operations
(DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009) TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000006.pendingset
2019-10-04 15:02:35,495 [s3a-transfer-shared-pool3-t10] DEBUG
s3guard.Operations (DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009)
TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000007.pendingset
2019-10-04 15:02:35,496 [s3a-transfer-shared-pool3-t11] DEBUG
s3guard.Operations (DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009)
TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000008.pendingset
2019-10-04 15:02:35,497 [s3a-transfer-shared-pool3-t12] DEBUG
s3guard.Operations (DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009)
TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/task_1570197469968_0003_m_000009.pendingset
2019-10-04 15:02:35,497 [s3a-transfer-shared-pool3-t13] DEBUG
s3guard.Operations (DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009)
TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks/attempt_1570197469968_0003_m_000008_1
2019-10-04 15:02:35,844 [s3a-transfer-shared-pool3-t14] DEBUG
s3guard.Operations (DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009)
TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic
2019-10-04 15:02:35,913 [s3a-transfer-shared-pool3-t15] DEBUG
s3guard.Operations (DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009)
TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001
2019-10-04 15:02:35,981 [s3a-transfer-shared-pool3-t16] DEBUG
s3guard.Operations (DynamoDBMetadataStore.java:logPut(2432)) - #(Delete-0009)
TOMBSTONE
s3a:///hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks
2019-10-04 15:02:36,050 [CommitterEvent Processor #3] INFO
magic.MagicS3GuardCommitter (DurationInfo.java:close(87)) - Deleting magic
directory
s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic:
duration 0:01.125s
2019-10-04 15:02:36,058 [CommitterEvent Processor #3] INFO
commit.AbstractS3ACommitter (DurationInfo.java:close(87)) - Task committer
attempt_1570197469968_0003_m_000000_0: commitJob(job_1570197469968_0003):
duration 0:05.989s
2019-10-04 15:02:36,107 [AsyncDispatcher event handler] INFO impl.JobImpl
(JobImpl.java:logJobHistoryFinishedEvent(1104)) - Calling handler for
JobFinishedEvent
2019-10-04 15:02:36,108 [AsyncDispatcher event handler] INFO impl.JobImpl
(JobImpl.java:handle(1020)) - job_1570197469968_0003Job Transitioned from
COMMITTING to SUCCEEDED
{code}
So why is it there?
Because the job completed (and did its cleanup) *before the task finished
its write*. It was only when the write finished and the file was closed that
the .pending file came into existence … which is why the assertion found it.
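The sequencing matters because the magic output stream only persists its
commit metadata at stream close. A stripped-down sketch of that close-time
behaviour (signature and field names approximate, based on
MagicCommitTracker):
{code:java}
// Sketch only: rather than completing the multipart upload, save the
// commit metadata as a .pending file. This PUT happens at close() -
// here, after the job committer had already purged __magic.
public boolean aboutToComplete(String uploadId,
    List<PartETag> parts, long bytesWritten) throws IOException {
  SinglePendingCommit commit = new SinglePendingCommit();
  commit.setUploadId(uploadId);
  commit.setDestinationKey(destKey);        // final path, outside __magic
  commit.bindCommitData(parts);             // part etags for a later commit
  commit.save(fs, pendingPartPath, true);   // writes the .pending file
  return false;  // false: leave the multipart upload uncompleted
}
{code}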
We now have a trace of what happens when a task is partitioned and carries on
its execution after the job completes. And it does what is required: none of
its output becomes manifest in the destination. The .pending file was never
promoted into the .pendingset file of all commits from that job, and the
pending uploads were not committed.
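That promotion is the task-commit step this attempt never reached; roughly (a
sketch with approximate names, using the PendingSet/SinglePendingCommit
classes from org.apache.hadoop.fs.s3a.commit.files):
{code:java}
// Sketch only: task commit gathers the attempt's .pending files into a
// single PendingSet and saves it where job commit will look for it.
PendingSet pendingSet = new PendingSet();
for (FileStatus st : fs.listStatus(attemptPath, PENDING_FILTER)) {
  // PENDING_FILTER: hypothetical PathFilter matching "*.pending"
  pendingSet.add(SinglePendingCommit.load(fs, st.getPath()));
}
pendingSet.save(fs,
    new Path(jobAttemptPath, taskId + ".pendingset"), true);
{code}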
There is one possible (and literal) cost to this: if the pending uploads of
the task are not aborted, then users will be billed for the outstanding data
until the uploads are explicitly terminated (the {{hadoop s3guard uploads}}
command; the config option to do this on S3A startup) or implicitly cleaned up
by a lifecycle rule which aborts all pending uploads older than a certain age
(which is what we recommend). I did invoke {{s3guard uploads}} to see what it
found -I couldn't see anything... But the tests may be doing some extra
cleanup.
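For completeness, the recommended lifecycle rule can be set up like this (a
sketch using the AWS SDK for Java v1; the bucket name and one-day age are
illustrative):
{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.AbortIncompleteMultipartUpload;
import com.amazonaws.services.s3.model.BucketLifecycleConfiguration;
import com.amazonaws.services.s3.model.lifecycle.LifecycleFilter;
import java.util.Arrays;

public class AbortStaleUploads {
  public static void main(String[] args) {
    // Abort incomplete multipart uploads one day after initiation,
    // across the whole bucket (empty LifecycleFilter = no restriction).
    BucketLifecycleConfiguration.Rule rule =
        new BucketLifecycleConfiguration.Rule()
            .withId("abort-stale-uploads")
            .withFilter(new LifecycleFilter())
            .withAbortIncompleteMultipartUpload(
                new AbortIncompleteMultipartUpload()
                    .withDaysAfterInitiation(1))
            .withStatus(BucketLifecycleConfiguration.ENABLED);

    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    s3.setBucketLifecycleConfiguration("hwdev-steve-ireland-new",
        new BucketLifecycleConfiguration().withRules(Arrays.asList(rule)));
  }
}
{code}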
Anyway, I view this failure of the test as a chance to look at the logs and see
what happens on a task partition. And yes, we are good: no data from the
uncommitted task was manifest in the final job. I am happy.
The only thing we could look at is: could we do more in the task to clean up?
That is: why didn't it get told to clean up its directory?
I don't think the MR protocol calls any worker-side abort operation before it
exits: there is nothing we can do.
It will be interesting to see what the other execution engines do here.
> Partitioned S3A magic committers can leave pending files under __magic, maybe
> uploads
> -------------------------------------------------------------------------------------
>
> Key: HADOOP-16632
> URL: https://issues.apache.org/jira/browse/HADOOP-16632
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.2.1, 3.1.3
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
>
> Partitioned S3A magic committers can leave pending files, maybe upload data
> This surfaced in an assertion failure on a parallel test run.
> I thought it was actually a test failure, but with HADOOP-16207 all the docs
> are preserved in the local FS and I can understand what happened.
> h3. Junit process
> {code}
> [INFO]
> [ERROR] Failures:
> [ERROR]
> ITestS3ACommitterMRJob.test_200_execute:344->customPostExecutionValidation:356
> Expected a java.io.FileNotFoundException to be thrown, but got the result: :
> "Found magic dir which should have been deleted at
> S3AFileStatus{path=s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic;
> isDirectory=true; modification_time=0; access_time=0; owner=stevel;
> group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false;
> isEncrypted=true; isErasureCoded=false} isEmptyDirectory=UNKNOWN eTag=null
> versionId=null
> [s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks/attempt_1570197469968_0003_m_000008_1/__base/part-m-00008
> s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks/attempt_1570197469968_0003_m_000008_1/__base/part-m-00008.pending
> {code}
> Full details to follow in the comment as they are, well, detailed.
>
> Key point: AM-side job and task cleanup can happen before the worker task
> finishes its writes. This will result in files under __magic. It may result
> in pending uploads too -but only if the write began after the AM job cleanup
> did a list + abort of all pending uploads under the destination directory