[ 
https://issues.apache.org/jira/browse/HADOOP-18298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555499#comment-17555499
 ] 

Steve Loughran commented on HADOOP-18298:
-----------------------------------------

I'm sorry, but I believe you have completely failed to realise the fundamental 
aspect of the S3A committers:

h2. We do not complete multipart uploads in task commit, because that is what 
we do in job commit.


The delayed commit is the core, critical part of the entire algorithm. So your 
statement that "uploadFileToPendingCommit" doesn't finish the upload is correct. That 
is why it is called uploadFileToPendingCommit and not uploadFile. Delaying the 
manifestation of the upload is how we ensure that no intermediate data is 
visible until job commit. And this allows for speculation and for task failure 
during both task execution and task commit. It also ensures that if the entire 
job fails at any point prior to job commit, none of the work is visible.
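
To illustrate the idea, here is a minimal in-memory sketch of a delayed commit. It is not the S3A committer code: ToyStore, ToyCommitter and all of their methods are hypothetical names invented for this example, and the "store" is just a pair of maps. It only shows the shape of the protocol: tasks upload parts and record the upload ID, and nothing becomes visible until job commit completes the recorded uploads.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for an S3-style store; not the S3A API. */
class ToyStore {
    private final Map<String, List<String>> pendingParts = new HashMap<>(); // uploadId -> parts
    private final Map<String, String> uploadKeys = new HashMap<>();         // uploadId -> destination key
    private final Map<String, String> visible = new TreeMap<>();            // key -> object data
    private int nextId = 0;

    /** Step 1: start a multipart upload. Nothing is visible yet. */
    String initiateUpload(String key) {
        String id = "upload-" + (nextId++);
        pendingParts.put(id, new ArrayList<>());
        uploadKeys.put(id, key);
        return id;
    }

    /** Step 2: upload one part. Still nothing is visible. */
    void uploadPart(String uploadId, String data) {
        pendingParts.get(uploadId).add(data);
    }

    /** Step 3: complete the upload. Only now does the object appear. */
    void completeUpload(String uploadId) {
        List<String> parts = pendingParts.remove(uploadId);
        visible.put(uploadKeys.remove(uploadId), String.join("", parts));
    }

    /** Discard an upload that will never be completed. */
    void abortUpload(String uploadId) {
        pendingParts.remove(uploadId);
        uploadKeys.remove(uploadId);
    }

    boolean exists(String key) { return visible.containsKey(key); }
    String read(String key) { return visible.get(key); }
    int pendingCount() { return pendingParts.size(); }
}

/** Hypothetical committer that delays step 3 until job commit. */
class ToyCommitter {
    private final ToyStore store;
    private final List<String> committedTaskUploads = new ArrayList<>();

    ToyCommitter(ToyStore store) { this.store = store; }

    /** Task work: steps 1 and 2 only; returns the pending upload ID. */
    String uploadToPendingCommit(String key, String... parts) {
        String id = store.initiateUpload(key);
        for (String part : parts) {
            store.uploadPart(id, part);
        }
        return id;
    }

    /** Task commit: record the pending upload, but do not complete it. */
    void commitTask(String uploadId) { committedTaskUploads.add(uploadId); }

    /** Task abort (failed or losing speculative attempt): drop the upload. */
    void abortTask(String uploadId) { store.abortUpload(uploadId); }

    /** Job commit: complete every upload recorded by a committed task. */
    void commitJob() {
        for (String id : committedTaskUploads) {
            store.completeUpload(id);
        }
        committedTaskUploads.clear();
    }
}

class DelayedCommitDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        ToyCommitter committer = new ToyCommitter(store);

        // Two speculative attempts of the same task write the same key.
        String first = committer.uploadToPendingCommit("out/part-0000", "ab", "cd");
        String second = committer.uploadToPendingCommit("out/part-0000", "ab", "cd");
        committer.commitTask(first);
        committer.abortTask(second);

        System.out.println("visible before job commit: " + store.exists("out/part-0000"));
        committer.commitJob();
        System.out.println("visible after job commit: " + store.exists("out/part-0000"));
    }
}
```

In this toy model, a job that fails before commitJob() leaves only pending uploads behind, which is the property described above. How pending uploads are persisted between task commit and job commit is deliberately left out.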
 
For more information, please read the description of the algorithm:
https://github.com/steveloughran/zero-rename-committer/releases/tag/tag_release_2021-05-17

If you have found errors in the algorithm, especially in the correctness of the 
protocol, you are welcome to submit changes, ideally including proofs of correctness.

And if you find that there is a problem with things working on Minio, well, Minio has 
quirks.

My suggestion to you is to run the entire hadoop-aws integration test suite 
against your S3 server. It includes running MapReduce jobs against the store 
and verifying that the output is present and correct. Precisely because we do 
this against AWS S3, I am confident it works. That and the little *detail* that 
we have been using this in production for 3+ years.

I am going to close this issue as invalid. I have however changed the title to 
make clear that minio may be a factor.

> Hadoop AWS | Staging committer Multipartupload not completing on minio
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-18298
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18298
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.3.1
>         Environment: minio
>            Reporter: Ayush Goyal
>            Priority: Major
>
> In the Hadoop AWS staging 
> committer (org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter), 
> the committer uploads files from local storage to S3 (method commitTaskInternal), which 
> calls uploadFileToPendingCommit of CommitOperation to upload each file using 
> multipart upload.
>  
> Multipart upload consists of three steps:
> 1) Initialise the multipart upload.
> 2) Break the file into parts and upload the parts.
> 3) Merge all the parts of the file and finalize the multipart upload.
>  
> In the implementation of uploadFileToPendingCommit, the first 2 steps are 
> implemented. However, the 3rd step is missing, so the parts are uploaded 
> but never merged, and at the end of the job no files are in the 
> destination directory.
>  
> S3 logs before implementing the 3rd step:
>  
> {code:java}
> 2022-05-30T13:49:31:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/part-00000-ce0a965f-622a-4950-bb4b-550470883134-c000-b552fb34-6156-4aa8-9085-679ad14fab6e.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               8.677ms      ↑ 137 B ↓ 724 B
> 2022-05-30T13:49:31:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/part-00000-ce0a965f-622a-4950-bb4b-550470883134-c000-b552fb34-6156-4aa8-9085-679ad14fab6e.snappy.parquet?uploadId=f3beae8e-3001-48be-9bc4-306b71940e50&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::                443.156ms    ↑ 51 KiB ↓ 325 B
> 2022-05-30T13:49:32:000 [200 OK] s3.ListObjectsV2 
> localhost:9000/minio-feature-testing/?list-type=2&delimiter=%2F&max-keys=2&prefix=spark-job%2Fprocessed%2Foutput-parquet-staging-7%2F_SUCCESS%2F&fetch-owner=false
>   240b:c1d1:123:664f:c5d2:2::                3.414ms      ↑ 137 B ↓ 646 B
> 2022-05-30T13:49:32:000 [200 OK] s3.PutObject 
> localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/_SUCCESS
>  240b:c1d1:123:664f:c5d2:2::                52.734ms     ↑ 8.7 KiB ↓ 380 B
> 2022-05-30T13:49:32:000 [200 OK] s3.DeleteMultipleObjects 
> localhost:9000/minio-feature-testing/?delete  240b:c1d1:123:664f:c5d2:2::     
>            73.954ms     ↑ 350 B ↓ 432 B
> 2022-05-30T13:49:32:000 [404 Not Found] s3.HeadObject 
> localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/_temporary
>  240b:c1d1:123:664f:c5d2:2::                2.658ms      ↑ 137 B ↓ 291 B
> 2022-05-30T13:49:32:000 [200 OK] s3.ListObjectsV2 
> localhost:9000/minio-feature-testing/?list-type=2&delimiter=%2F&max-keys=2&prefix=spark-job%2Fprocessed%2Foutput-parquet-staging-7%2F_temporary%2F&fetch-owner=false
>   240b:c1d1:123:664f:c5d2:2::                 4.807ms      ↑ 137 B ↓ 648 B
> 2022-05-30T13:49:32:000 [200 OK] s3.ListMultipartUploads 
> localhost:9000/minio-feature-testing/?uploads&prefix=spark-job%2Fprocessed%2Foutput-parquet-staging-7%2F
>   240b:c0e0:102:553e:b4c2:2::               1.081ms      ↑ 137 B ↓ 776 B
> 2022-05-30T13:49:32:000 [404 Not Found] s3.HeadObject 
> localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/.spark-staging-ce0a965f-622a-4950-bb4b-550470883134
>  240b:c1d1:123:664f:c5d2:2::                 5.68ms       ↑ 137 B ↓ 291 B
> 2022-05-30T13:49:32:000 [200 OK] s3.ListObjectsV2 
> localhost:9000/minio-feature-testing/?list-type=2&delimiter=%2F&max-keys=2&prefix=spark-job%2Fprocessed%2Foutput-parquet-staging-7%2F.spark-staging-ce0a965f-622a-4950-bb4b-550470883134%2F&fetch-owner=false
>   240b:c1d1:123:664f:c5d2:2::              2.452ms      ↑ 137 B ↓ 689 B
>   {code}
> Here, after s3.PutObjectPart there is no CompleteMultipartUpload call for 
> the 3rd step.
>  
> S3 logs after implementing the 3rd step:
>  
> {code:java}
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               9.116ms      ↑ 137 B ↓ 750 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               9.416ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               8.506ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               9.815ms      ↑ 137 B ↓ 750 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D30/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               10.09ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               9.851ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D17/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               9.006ms      ↑ 137 B ↓ 750 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               9.217ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=7da87f0a-f8ff-4f9c-b877-b2fdd18d3c5f&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::               817.474ms    ↑ 52 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=782769d0-43f1-43b8-aae0-54ac4c8c6603&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::               818.363ms    ↑ 85 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D17/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=2c509073-e2b6-4d0a-a65a-bb4f154a432c&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::               819.765ms    ↑ 54 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=c7e09609-6193-4d41-bc05-4020291725e4&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::               818.782ms    ↑ 55 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=3bb4278e-455a-4dc4-af01-ed3227430590&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::               817.97ms     ↑ 51 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=8fe799e3-c712-43b7-a074-a2359232de07&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::               819.183ms    ↑ 80 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=c2e1477b-5457-4cbe-8fdb-4e80eaca63fe&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::               818.126ms    ↑ 53 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D30/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=992167c8-fbde-4a0d-bd4d-5ce7ddd51a87&partNumber=1
>   240b:c1d1:123:664f:c5d2:2::               818.176ms    ↑ 56 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.CompleteMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=7da87f0a-f8ff-4f9c-b877-b2fdd18d3c5f
>   240b:c1d1:123:664f:c5d2:2::               632.761ms    ↑ 272 B ↓ 1.1 KiB
> 2022-06-17T10:56:13:000 [200 OK] s3.NewMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D17/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads
>   240b:c1d1:123:664f:c5d2:2::               6.231ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.CompleteMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=3bb4278e-455a-4dc4-af01-ed3227430590
>   240b:c1d1:123:664f:c5d2:2::               697.946ms    ↑ 272 B ↓ 1.1 KiB
> 2022-06-17T10:56:12:000 [200 OK] s3.CompleteMultipartUpload 
> localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D17/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=2c509073-e2b6-4d0a-a65a-bb4f154a432c
>   240b:c1d1:123:664f:c5d2:2::               714.377ms    ↑ 272 B ↓ 1.1 KiB
>  {code}
>  
>  
> Needs to be implemented:
>  
> After the uploadPart call, once all upload IDs are added to commitData, 
> innerCommit should be called.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)