[
https://issues.apache.org/jira/browse/HADOOP-19734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032518#comment-18032518
]
Steve Loughran commented on HADOOP-19734:
-----------------------------------------
{code}
[ERROR] ITestS3AHugeMagicCommits.test_030_postCreationAssertions:192 »
AWSBadRequest Completing multipart upload on
job-00/test/tests3ascale/ITestS3AHugeMagicCommits/commit/commit.bin:
software.amazon.awssdk.services.s3.model.S3Exception: One or more of the
specified parts could not be found. The part may not have been uploaded, or
the specified entity tag may not match the part's entity tag. (Service: S3,
Status Code: 400, Request ID: JAEYPCZ4P3JYGMTD, Extended Request ID:
O/135mw9Xd2aEuFUh0ICWYc8DLXSpBUWaVGkEgEFGf0xO8o+XlZXY0hI+mvennOGt+C/UI7mNrQ=)
(SDK Attempt Count: 1):InvalidPart: One or more of the specified parts could
not be found. The part may not have been uploaded, or the specified entity tag
may not match the part's entity tag. (Service: S3, Status Code: 400, Request
ID: JAEYPCZ4P3JYGMTD, Extended Request ID:
O/135mw9Xd2aEuFUh0ICWYc8DLXSpBUWaVGkEgEFGf0xO8o+XlZXY0hI+mvennOGt+C/UI7mNrQ=)
(SDK Attempt Count: 1)
[ERROR]
ITestS3AHugeMagicCommits>AbstractSTestS3AHugeFiles.test_045_vectoredIOHugeFile:538->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/ITestS3AHugeMagicCommits/commit/commit.bin
in
s3a://stevel-london/job-00/test/tests3ascale/ITestS3AHugeMagicCommits/commit
[ERROR]
ITestS3AHugeFilesArrayBlocks>AbstractSTestS3AHugeFiles.test_010_CreateHugeFile:276
» AWSBadRequest Completing multipart upload on
job-00/test/tests3ascale/array/src/hugefile:
software.amazon.awssdk.services.s3.model.S3Exception: One or more of the
specified parts could not be found. The part may not have been uploaded, or
the specified entity tag may not match the part's entity tag. (Service: S3,
Status Code: 400, Request ID: 1NNBCSX4NCDN7G9X, Extended Request ID:
8vMmeyt1GfjGrf3UL9AN8vlwWSn9860f1gdeIBC3drmcjeQwC6wOPinMD8MSO6ggGw9ywwdcXroGTdVSFLYq0S0VdM/5bYfanDXJ43Eb4QU=)
(SDK Attempt Count: 1):InvalidPart: One or more of the specified parts could
not be found. The part may not have been uploaded, or the specified entity tag
may not match the part's entity tag. (Service: S3, Status Code: 400, Request
ID: 1NNBCSX4NCDN7G9X, Extended Request ID:
8vMmeyt1GfjGrf3UL9AN8vlwWSn9860f1gdeIBC3drmcjeQwC6wOPinMD8MSO6ggGw9ywwdcXroGTdVSFLYq0S0VdM/5bYfanDXJ43Eb4QU=)
(SDK Attempt Count: 1)
[ERROR]
ITestS3AHugeFilesArrayBlocks>AbstractSTestS3AHugeFiles.test_030_postCreationAssertions:433
» FileNotFound Huge file: not found
s3a://stevel-london/job-00/test/tests3ascale/array/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/array/src
[ERROR]
ITestS3AHugeFilesArrayBlocks>AbstractSTestS3AHugeFiles.test_040_PositionedReadHugeFile:478->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/array/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/array/src
[ERROR]
ITestS3AHugeFilesArrayBlocks>AbstractSTestS3AHugeFiles.test_045_vectoredIOHugeFile:538->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/array/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/array/src
[ERROR]
ITestS3AHugeFilesArrayBlocks>AbstractSTestS3AHugeFiles.test_050_readHugeFile:624->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/array/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/array/src
[ERROR]
ITestS3AHugeFilesArrayBlocks>AbstractSTestS3AHugeFiles.test_100_renameHugeFile:679->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/array/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/array/src
[ERROR]
ITestS3AHugeFilesByteBufferBlocks>AbstractSTestS3AHugeFiles.test_010_CreateHugeFile:276
» AWSBadRequest Completing multipart upload on
job-00/test/tests3ascale/bytebuffer/src/hugefile:
software.amazon.awssdk.services.s3.model.S3Exception: One or more of the
specified parts could not be found. The part may not have been uploaded, or
the specified entity tag may not match the part's entity tag. (Service: S3,
Status Code: 400, Request ID: K0K75V8AH7SVBHS3, Extended Request ID:
kDosbp+Z2PLZn9tVtRF9QfOqh1MgLbIKYaYFn2JeIptXlBV4v1a/wFukoXnaF7fCp6zx3vR8feE0fScUJEw+WhNW9lzu9dBxssOA62UA2kg=)
(SDK Attempt Count: 1):InvalidPart: One or more of the specified parts could
not be found. The part may not have been uploaded, or the specified entity tag
may not match the part's entity tag. (Service: S3, Status Code: 400, Request
ID: K0K75V8AH7SVBHS3, Extended Request ID:
kDosbp+Z2PLZn9tVtRF9QfOqh1MgLbIKYaYFn2JeIptXlBV4v1a/wFukoXnaF7fCp6zx3vR8feE0fScUJEw+WhNW9lzu9dBxssOA62UA2kg=)
(SDK Attempt Count: 1)
[ERROR]
ITestS3AHugeFilesByteBufferBlocks>AbstractSTestS3AHugeFiles.test_030_postCreationAssertions:433
» FileNotFound Huge file: not found
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src
[ERROR]
ITestS3AHugeFilesByteBufferBlocks>AbstractSTestS3AHugeFiles.test_040_PositionedReadHugeFile:478->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src
[ERROR]
ITestS3AHugeFilesByteBufferBlocks>AbstractSTestS3AHugeFiles.test_045_vectoredIOHugeFile:538->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src
[ERROR]
ITestS3AHugeFilesByteBufferBlocks>AbstractSTestS3AHugeFiles.test_050_readHugeFile:624->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src
[ERROR]
ITestS3AHugeFilesByteBufferBlocks>AbstractSTestS3AHugeFiles.test_100_renameHugeFile:679->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/bytebuffer/src
[ERROR]
ITestS3AHugeFilesDiskBlocks>AbstractSTestS3AHugeFiles.test_010_CreateHugeFile:276
» AWSBadRequest Completing multipart upload on
job-00/test/tests3ascale/disk/src/hugefile:
software.amazon.awssdk.services.s3.model.S3Exception: One or more of the
specified parts could not be found. The part may not have been uploaded, or
the specified entity tag may not match the part's entity tag. (Service: S3,
Status Code: 400, Request ID: 73T4YAYRWE63WAW5, Extended Request ID:
6ucEY2heh2NsxE8dBrlZp9AE4Tb+hbvnyxea1/yp5H85BEvkQdYsfNlRH5XZM1g4hHPDSoGMVtM=)
(SDK Attempt Count: 1):InvalidPart: One or more of the specified parts could
not be found. The part may not have been uploaded, or the specified entity tag
may not match the part's entity tag. (Service: S3, Status Code: 400, Request
ID: 73T4YAYRWE63WAW5, Extended Request ID:
6ucEY2heh2NsxE8dBrlZp9AE4Tb+hbvnyxea1/yp5H85BEvkQdYsfNlRH5XZM1g4hHPDSoGMVtM=)
(SDK Attempt Count: 1)
[ERROR]
ITestS3AHugeFilesDiskBlocks>AbstractSTestS3AHugeFiles.test_030_postCreationAssertions:433
» FileNotFound Huge file: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesDiskBlocks>AbstractSTestS3AHugeFiles.test_040_PositionedReadHugeFile:478->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesDiskBlocks>AbstractSTestS3AHugeFiles.test_045_vectoredIOHugeFile:538->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesDiskBlocks>AbstractSTestS3AHugeFiles.test_050_readHugeFile:624->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesDiskBlocks>AbstractSTestS3AHugeFiles.test_100_renameHugeFile:679->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesSSECDiskBlocks>AbstractSTestS3AHugeFiles.test_010_CreateHugeFile:276
» AWSBadRequest Completing multipart upload on
job-00/test/tests3ascale/disk/src/hugefile:
software.amazon.awssdk.services.s3.model.S3Exception: One or more of the
specified parts could not be found. The part may not have been uploaded, or
the specified entity tag may not match the part's entity tag. (Service: S3,
Status Code: 400, Request ID: ZSY181YB49GQFR83, Extended Request ID:
FrPEfsXO3Gbhxi3m4ZmyYSiyfscQ1QSm/1lKjRPLHEbLWH5vtGked+fHvZl281Dm6u013/5VP6pj42h4XISftk7p9uEIDGw31E7Ymcoviq4=)
(SDK Attempt Count: 1):InvalidPart: One or more of the specified parts could
not be found. The part may not have been uploaded, or the specified entity tag
may not match the part's entity tag. (Service: S3, Status Code: 400, Request
ID: ZSY181YB49GQFR83, Extended Request ID:
FrPEfsXO3Gbhxi3m4ZmyYSiyfscQ1QSm/1lKjRPLHEbLWH5vtGked+fHvZl281Dm6u013/5VP6pj42h4XISftk7p9uEIDGw31E7Ymcoviq4=)
(SDK Attempt Count: 1)
[ERROR]
ITestS3AHugeFilesSSECDiskBlocks>AbstractSTestS3AHugeFiles.test_030_postCreationAssertions:433
» FileNotFound Huge file: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesSSECDiskBlocks>AbstractSTestS3AHugeFiles.test_040_PositionedReadHugeFile:478->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesSSECDiskBlocks>AbstractSTestS3AHugeFiles.test_045_vectoredIOHugeFile:538->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesSSECDiskBlocks>AbstractSTestS3AHugeFiles.test_050_readHugeFile:624->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesSSECDiskBlocks>AbstractSTestS3AHugeFiles.test_100_renameHugeFile:679->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/disk/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/disk/src
[ERROR]
ITestS3AHugeFilesStorageClass.test_010_CreateHugeFile:74->AbstractSTestS3AHugeFiles.test_010_CreateHugeFile:276
» AWSBadRequest Completing multipart upload on
job-00/test/tests3ascale/array/src/hugefile:
software.amazon.awssdk.services.s3.model.S3Exception: One or more of the
specified parts could not be found. The part may not have been uploaded, or
the specified entity tag may not match the part's entity tag. (Service: S3,
Status Code: 400, Request ID: APYCQNP1GY02DGDE, Extended Request ID:
lE0hQJ67sSwCYSMmO7tDEAvEIOCcpwIbLdfqqrNTpWT0bHIaacaIEzZusajj79rnFQlWudxsMHBIUXdS9ELiKR0T923lcULZy4Essx1LoTs=)
(SDK Attempt Count: 1):InvalidPart: One or more of the specified parts could
not be found. The part may not have been uploaded, or the specified entity tag
may not match the part's entity tag. (Service: S3, Status Code: 400, Request
ID: APYCQNP1GY02DGDE, Extended Request ID:
lE0hQJ67sSwCYSMmO7tDEAvEIOCcpwIbLdfqqrNTpWT0bHIaacaIEzZusajj79rnFQlWudxsMHBIUXdS9ELiKR0T923lcULZy4Essx1LoTs=)
(SDK Attempt Count: 1)
[ERROR]
ITestS3AHugeFilesStorageClass.test_030_postCreationAssertions:81->AbstractSTestS3AHugeFiles.test_030_postCreationAssertions:433
» FileNotFound Huge file: not found
s3a://stevel-london/job-00/test/tests3ascale/array/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/array/src
[ERROR]
ITestS3AHugeFilesStorageClass>AbstractSTestS3AHugeFiles.test_045_vectoredIOHugeFile:538->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/array/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/array/src
[ERROR]
ITestS3AHugeFilesStorageClass.test_100_renameHugeFile:108->AbstractSTestS3AHugeFiles.assumeHugeFileExists:404->AbstractSTestS3AHugeFiles.assumeFileExists:414
» FileNotFound huge file not created: not found
s3a://stevel-london/job-00/test/tests3ascale/array/src/hugefile in
s3a://stevel-london/job-00/test/tests3ascale/array/src
[INFO]
[ERROR] Tests run: 124, Failures: 1, Errors: 30, Skipped: 13
[INFO]
{code}
This has to be some transient issue with my S3 London bucket, as if in-progress
upload parts were not being retained. I have never seen this before; the expiry
time is set to 24h.
When these uploads fail we do leave incomplete uploads in progress:
{code}
Listing uploads under path ""
job-00-fork-0005/test/testCommitOperations
141OKG11JHhWF1GOnunHUd9ZzBJ8cUG9z0LsW_4wUGgCXCvDMQM3kRi5IOCUV8FdCHtg_w8SlipfubRtzCQoT5yEpOLv.cWOiOwjEaBzUjnuJORppfXuKy1piHpLnu98
job-00-fork-0005/test/testIfMatchTwoMultipartUploadsRaceConditionOneClosesFirst
yBJpm3zh4DjNQIDtyWgEmWVCk5sehVz5Vzn3QGr_tQT2iOonRp5ErXsQy24yIvnzRxBCZqVapy5VepLeu2udZBT5EXLnKRA3bchvzjtKDlipywSzYlL2N_xLUDCT359I
job-00-fork-0005/test/testIfNoneMatchTwoConcurrentMultipartUploads
AnspJPHUoPJqg61t28OvLfAogi6G9ocyx1Dm6XY2C.a_H_onklM0Nr0LIXaPiYlQjZIiH0fTsQ1e2KhEjS9pGxvSKOXq_4YibiGZmFC6rBolmfACMqIRpoeaqYDgzYW4
job-00-fork-0005/test/testMagicWriteRecovery/file.txt
KpvoTuVh85Wzm9XuU1EuxbATjb6D.Zv8vEj3z2S6AvJBHCBssy4iphxNhTkLDs7ceEwak4IPtdXED1vRf3geXT7MRMJn8d6feafvHVEgzbD31odpzTLmOaPrU_mFQXGV
job-00-fork-0005/test/testMagicWriteRecovery/file.txt
CnrbWU3pzgEGvjRuDuaP43Xcv1eBF5aLknqYaZA1vwO3b1QUIu9QJSiZjuLMYKT9GKw1QXwqoKo4iuxTY1a18bARx4XMEiL98kZBv0TPMaAfXE.70Olh8Q2kTyDlUCSh
job-00-fork-0005/test/testMagicWriteRecovery/file.txt
dEVGPBRsuOAzL5pGA02ve9qJhAlNK8lb8khF6laKjo9U0j_aG1xLkHEfPLrmcrcsLxC3R755Yv_uKbzY_Vnoc.nXCprvutM1TZmLLN_7LHrQ0tY0IjYSS6hVzDVlHbvC
job-00-fork-0006/test/restricted/testCommitEmptyFile/empty-commit.txt
NOCjVJqycZhkalrvU26F5oIaJP51q055et2N6b74.2JVjiKL8KwrhOhdrtumOrZ2tZWNqaK4iKZ_iosqgehJOiPbWJwxvrfvA5V.dAUTLNqjtEf5tfWh0UXu.vahDy_S5SSgNLFXK.VB82i5MZtOcw--
job-00/test/tests3ascale/ITestS3AHugeMagicCommits/commit/commit.bin
lsYNpdn_oiWLwEVvvM621hCvIwDVaL4y_bbwVpQouW1OBThA.P9cR8fZtxvBjGdMY41UH0dTjxGHtF3BXEY8WXqmcnO9QHs_Jy.os781pE3MGzqgzFyxmd0yN6LFcTbq
test/restricted/testCommitEmptyFile/empty-commit.txt
T3W9V56Bv_FMhKpgcBgJ1H2wOBkPKk23T0JomesBzZyqiIAu3NiROibAgoZUhWSdoTKSJoOgcn3UWYGOvGBbsHteS_N_c1QoTEp0GE7PNlzDfs1GheJ5SOpUgaEY6MaYdNe0mn0gY48FDXpVB2nqiA--
test/restricted/testCommitEmptyFile/empty-commit.txt
.cr4b3xkfze4N24Bj3PAm_ACIyIVuTU4DueDktU1abNu2LJWXH2HKnUu1oOjfnnQwnUXp4VmXBVbZ5aq8E8gVCxN.Oyb7hmGVtESmRjpqIXSW80JrB_0_dqXe.uAT.JH7kEWywAlb4NIqJ5Xz99tvA--
Total 10 uploads found.
{code}
Most interesting here is `testIfNoneMatchTwoConcurrentMultipartUploads`,
because this initiates then completes an MPU, so as to create a zero-byte file.
It doesn't upload any parts, yet the attempt to complete still failed:
{code}
[ERROR]
org.apache.hadoop.fs.s3a.impl.ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchTwoConcurrentMultipartUploads
-- Time elapsed: 2.783 s <<< ERROR!
org.apache.hadoop.fs.s3a.AWSBadRequestException: Completing multipart upload on
job-00-fork-0005/test/testIfNoneMatchTwoConcurrentMultipartUploads:
software.amazon.awssdk.services.s3.model.S3Exception: One or more of the
specified parts could not be found. The part may not have been uploaded, or
the specified entity tag may not match the part's entity tag. (Service: S3,
Status Code: 400, Request ID: 9JCJ6M5QRDGJNYYS, Extended Request ID:
Z7Q7+LA0o/5B4xoIGhgo+tVppawZ0UBj7X4RNb+0m9RbOAOwD/Apv1o+KmnW0aypjwmfFlarxjo=)
(SDK Attempt Count: 1):InvalidPart: One or more of the specified parts could
not be found. The part may not have been uploaded, or the specified entity tag
may not match the part's entity tag. (Service: S3, Status Code: 400, Request
ID: 9JCJ6M5QRDGJNYYS, Extended Request ID:
Z7Q7+LA0o/5B4xoIGhgo+tVppawZ0UBj7X4RNb+0m9RbOAOwD/Apv1o+KmnW0aypjwmfFlarxjo=)
(SDK Attempt Count: 1)
at
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:265)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
at
org.apache.hadoop.fs.s3a.WriteOperationHelper.finalizeMultipartUpload(WriteOperationHelper.java:318)
at
org.apache.hadoop.fs.s3a.WriteOperationHelper.completeMPUwithRetries(WriteOperationHelper.java:370)
at
org.apache.hadoop.fs.s3a.S3ABlockOutputStream$MultiPartUpload.lambda$complete$3(S3ABlockOutputStream.java:1227)
at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.measureDurationOfInvocation(IOStatisticsBinding.java:493)
at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfInvocation(IOStatisticsBinding.java:464)
at
org.apache.hadoop.fs.s3a.S3ABlockOutputStream$MultiPartUpload.complete(S3ABlockOutputStream.java:1225)
at
org.apache.hadoop.fs.s3a.S3ABlockOutputStream$MultiPartUpload.access$1500(S3ABlockOutputStream.java:876)
at
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.close(S3ABlockOutputStream.java:545)
at
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:77)
at
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at
org.apache.hadoop.fs.s3a.impl.ITestS3APutIfMatchAndIfNoneMatch.createFileWithFlags(ITestS3APutIfMatchAndIfNoneMatch.java:190)
at
org.apache.hadoop.fs.s3a.impl.ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchTwoConcurrentMultipartUploads(ITestS3APutIfMatchAndIfNoneMatch.java:380)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.util.ArrayList.forEach(ArrayList.java:1259)
at java.util.ArrayList.forEach(ArrayList.java:1259)
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: One or more of
the specified parts could not be found. The part may not have been uploaded,
or the specified entity tag may not match the part's entity tag. (Service: S3,
Status Code: 400, Request ID: 9JCJ6M5QRDGJNYYS, Extended Request ID:
Z7Q7+LA0o/5B4xoIGhgo+tVppawZ0UBj7X4RNb+0m9RbOAOwD/Apv1o+KmnW0aypjwmfFlarxjo=)
(SDK Attempt Count: 1)
at
software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:113)
at
software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:61)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.retryPolicyDisallowedRetryException(RetryableStageHelper.java:168)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:73)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at
software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:53)
at
software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:35)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:82)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:43)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
at
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:210)
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
at
software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
at
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
at
software.amazon.awssdk.services.s3.DefaultS3Client.completeMultipartUpload(DefaultS3Client.java:801)
at
software.amazon.awssdk.services.s3.DelegatingS3Client.lambda$completeMultipartUpload$1(DelegatingS3Client.java:611)
at
software.amazon.awssdk.services.s3.internal.crossregion.S3CrossRegionSyncClient.invokeOperation(S3CrossRegionSyncClient.java:67)
at
software.amazon.awssdk.services.s3.DelegatingS3Client.completeMultipartUpload(DelegatingS3Client.java:611)
at
org.apache.hadoop.fs.s3a.impl.S3AStoreImpl.completeMultipartUpload(S3AStoreImpl.java:906)
at
org.apache.hadoop.fs.s3a.S3AFileSystem$WriteOperationHelperCallbacksImpl.completeMultipartUpload(S3AFileSystem.java:1953)
at
org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$finalizeMultipartUpload$1(WriteOperationHelper.java:324)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
... 18 more
{code}
Yet the uploads listing afterwards still finds it:
{code}
job-00-fork-0005/test/testIfNoneMatchTwoConcurrentMultipartUploads
AnspJPHUoPJqg61t28OvLfAogi6G9ocyx1Dm6XY2C.a_H_onklM0Nr0LIXaPiYlQjZIiH0fTsQ1e2KhEjS9pGxvSKOXq_4YibiGZmFC6rBolmfACMqIRpoeaqYDgzYW4
{code}
> S3A: retry on MPU completion failure "One or more of the specified parts
> could not be found"
> --------------------------------------------------------------------------------------------
>
> Key: HADOOP-19734
> URL: https://issues.apache.org/jira/browse/HADOOP-19734
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.2
> Environment: aws s3 london
> Reporter: Steve Loughran
> Priority: Minor
>
> Experienced transient failure in test run of
> https://github.com/apache/hadoop/pull/7882 : all MPU complete posts failed
> because the request or parts were not found. The tests started succeeding
> 60-90s later *and* a "hadoop s3guard uploads" call listed the outstanding
> uploads of the failing tests.
> Hypothesis: a transient failure meant the server receiving the POST calls to
> complete the uploads was mistakenly reporting no upload IDs.
> Outcome: all active write operations failed, without any retry attempts. This
> can lose data and fail jobs, even though the store may recover.
> Proposed: multipart uploads, especially in the block output stream, should
> retry on this error, treating it as a connectivity issue.
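The proposed behaviour could be sketched roughly as below. This is only an illustration of the idea, not S3A code: `MpuCompletionRetry`, `StoreException`, `isRetryableCompletionFailure` and `completeWithRetries` are hypothetical names, and the attempt count and delays are placeholders. It assumes the failure can be recognised by the 400 status and the `InvalidPart` error code seen in the logs, and retries only that case with a doubling delay. Given the 60-90s recovery time reported, a real policy would presumably need a retry window of that order.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch of retrying MPU completion; not the S3A implementation. */
final class MpuCompletionRetry {

  /** Stand-in for a store exception carrying the HTTP status and S3 error code. */
  static final class StoreException extends Exception {
    final int statusCode;
    final String errorCode;

    StoreException(int statusCode, String errorCode, String message) {
      super(message);
      this.statusCode = statusCode;
      this.errorCode = errorCode;
    }
  }

  /** True only for the 400 "InvalidPart" response reported in the test logs. */
  static boolean isRetryableCompletionFailure(StoreException e) {
    return e.statusCode == 400 && "InvalidPart".equals(e.errorCode);
  }

  /**
   * Invoke the completion call, retrying InvalidPart failures with a delay
   * that doubles after each attempt. Any other failure, or exhaustion of the
   * attempts, rethrows the last exception.
   */
  static <T> T completeWithRetries(Callable<T> completion, int maxAttempts,
      long initialDelayMillis) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long delay = initialDelayMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return completion.call();
      } catch (StoreException e) {
        if (!isRetryableCompletionFailure(e) || attempt >= maxAttempts) {
          throw e;
        }
        Thread.sleep(delay);
        delay *= 2;
      }
    }
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated store: the first two completion attempts fail as in the logs.
    String result = completeWithRetries(() -> {
      if (++calls[0] < 3) {
        throw new StoreException(400, "InvalidPart",
            "One or more of the specified parts could not be found");
      }
      return "completed";
    }, 5, 10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

Restricting the retry to this one error code keeps genuine bad requests failing fast. Whether `InvalidPart` is always safe to retry (for example when an ETag really is wrong) is an open question this sketch does not answer.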