mehakmeet commented on pull request #2706: URL: https://github.com/apache/hadoop/pull/2706#issuecomment-841880071
Hey @bogthe,

> The last part is always an exception with regular multi part uploads too! You can do parallel uploads and even upload the last part first and it would still work (for regular multi-part).

Ah, I see. I was also thinking that with CSE it should still be able to work out which part was last, since we provide part numbers during the part upload step and complete the upload in ascending order. But I ran some tests with CSE enabled and hit these issues:

AbstractContractMultipartUploaderTest#testMultipartUpload()

T1: partSize: 5242880bytes(5MB) + 1byte = 5242881 bytes

```
2021-05-17 02:43:43,998 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:<init>(77)) - Starting: Put part 1 (size 5242881) s3a://mehakmeet-singh-data/test/testMultipartUpload
2021-05-17 02:43:44,002 [s3a-transfer-shared-pool1-t2] INFO s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(146)) - upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: Retried 0: org.apache.hadoop.fs.s3a.AWSClientIOException: upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: com.amazonaws.SdkClientException: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.
2021-05-17 02:43:44,824 [s3a-transfer-shared-pool1-t2] INFO s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(146)) - upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: Retried 1: org.apache.hadoop.fs.s3a.AWSClientIOException: upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: com.amazonaws.SdkClientException: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.
2021-05-17 02:43:46,184 [s3a-transfer-shared-pool1-t2] INFO s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(146)) - upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: Retried 2: org.apache.hadoop.fs.s3a.AWSClientIOException: upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: com.amazonaws.SdkClientException: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.
2021-05-17 02:43:50,537 [s3a-transfer-shared-pool1-t2] INFO s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(146)) - upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: Retried 3: org.apache.hadoop.fs.s3a.AWSClientIOException: upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: com.amazonaws.SdkClientException: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.
2021-05-17 02:44:00,768 [s3a-transfer-shared-pool1-t2] INFO s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(146)) - upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: Retried 4: org.apache.hadoop.fs.s3a.AWSClientIOException: upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: com.amazonaws.SdkClientException: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.
```

This retries several times and then fails with the exception:

```
org.apache.hadoop.fs.s3a.AWSClientIOException: upload part #1 upload ID cFOhefvaRWyUGkB_U6zV2Mhs8RMC3u55_WOASIRCRuv1hVIeGciyQkvs5lA7gvZrdb8W5mCGwSQLsGmg9K9QbsPP1lcBF30vEVaUwbyfq0PjBxehxEeHyMklZE8hhYo_ on test/testMultipartUpload: com.amazonaws.SdkClientException: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.: Invalid part size: part sizes for encrypted multipart uploads must be multiples of the cipher block size (16) with the exception of the last part.
```

T2: partSize: 5242880bytes(5MB)

```
2021-05-17 02:46:22,270 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:<init>(77)) - Starting: Put part 1 (size 5242880) s3a://mehakmeet-singh-data/test/testMultipartUpload
2021-05-17 02:46:22,907 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:close(98)) - Put part 1 (size 5242880) s3a://mehakmeet-singh-data/test/testMultipartUpload: duration 0:00.637s
2021-05-17 02:46:22,910 [JUnit-testMultipartUpload] INFO contract.ContractTestUtils (ContractTestUtils.java:end(1924)) - Duration of Uploaded part 1: 637,220,364 nS
2021-05-17 02:46:22,911 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (AbstractContractMultipartUploaderTest.java:putPart(352)) - Upload bandwidth 7.846579 MB/s
2021-05-17 02:46:22,934 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:<init>(77)) - Starting: Put part 2 (size 5242880) s3a://mehakmeet-singh-data/test/testMultipartUpload
2021-05-17 02:46:23,254 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:close(98)) - Put part 2 (size 5242880) s3a://mehakmeet-singh-data/test/testMultipartUpload: duration 0:00.320s
2021-05-17 02:46:23,254 [JUnit-testMultipartUpload] INFO contract.ContractTestUtils (ContractTestUtils.java:end(1924)) - Duration of Uploaded part 2: 319,980,951 nS
2021-05-17 02:46:23,255 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (AbstractContractMultipartUploaderTest.java:putPart(352)) - Upload bandwidth 15.625930 MB/s
2021-05-17 02:46:23,275 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:<init>(77)) - Starting: Put part 3 (size 5242880) s3a://mehakmeet-singh-data/test/testMultipartUpload
2021-05-17 02:46:23,990 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:close(98)) - Put part 3 (size 5242880) s3a://mehakmeet-singh-data/test/testMultipartUpload: duration 0:00.715s
2021-05-17 02:46:23,990 [JUnit-testMultipartUpload] INFO contract.ContractTestUtils (ContractTestUtils.java:end(1924)) - Duration of Uploaded part 3: 715,353,661 nS
2021-05-17 02:46:23,990 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (AbstractContractMultipartUploaderTest.java:putPart(352)) - Upload bandwidth 6.989550 MB/s
2021-05-17 02:46:23,991 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:<init>(77)) - Starting: Complete upload to s3a://mehakmeet-singh-data/test/testMultipartUpload
2021-05-17 02:47:49,055 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:close(98)) - Complete upload to s3a://mehakmeet-singh-data/test/testMultipartUpload: duration 1:25.064s
2021-05-17 02:47:49,056 [JUnit-testMultipartUpload] INFO contract.AbstractContractMultipartUploaderTest (DurationInfo.java:<init>(77)) - Starting: Abort upload to s3a://mehakmeet-singh-data/test/testMultipartUpload
2021-05-17 02:47:49,058 [s3a-transfer-shared-pool1-t6] INFO s3a.S3AFileSystem (S3AFileSystem.java:abortMultipartUpload(4703)) - Aborting multipart upload l0UfFfsZXE8ogO8ojviT6D8iJo3oEM052apJu.txB1b5j1KPD4F8LQWWYHmOru4G1mu.uPPtGZhIYoT0P2S3g.k10ROOP7uXOiX7czPpmXzlA.67xB7YoN2_IczirQDL to test/testMultipartUpload
```

Eventually fails with the exception:

```
org.apache.hadoop.fs.s3a.AWSClientIOException: Completing multipart upload on test/testMultipartUpload: com.amazonaws.SdkClientException: Unable to complete an encrypted multipart upload without being told which part was the last. Without knowing which part was the last, the encrypted data in Amazon S3 is incomplete and corrupt.: Unable to complete an encrypted multipart upload without being told which part was the last. Without knowing which part was the last, the encrypted data in Amazon S3 is incomplete and corrupt.
```

Both of these pass without CSE.

So, basically, with CSE we are restricted to part sizes that are multiples of 16. The minimum part size is 5MB, and any whole number of MB is already a multiple of 16, but we can't set a custom byte count that isn't a multiple of 16 as the partSize. And even after setting it to a multiple of 16, I am seeing the exception about the last part. So, is the part-number logic not applicable in CSE? Maybe I am missing something here?

CC: @steveloughran
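
For reference, here is a minimal standalone sketch of the two constraints the error messages describe, applied to the T1 and T2 part sizes. It is only a model of the messages, not the SDK's actual code: the class name, the method names, and the `isLastPart` / `lastPartFlags` parameters are hypothetical stand-ins for however the caller tells the encryption client that a part is the final one. The only value taken from the logs is the cipher block size of 16.

```java
import java.util.Arrays;
import java.util.List;

public class CsePartSizeCheck {
    // Cipher block size in bytes, taken from the "(16)" in the error message.
    static final int CIPHER_BLOCK_SIZE = 16;

    /** A part passes if it is block-aligned, or if it is flagged as the last part. */
    static boolean isValidPartSize(long partSize, boolean isLastPart) {
        return isLastPart || partSize % CIPHER_BLOCK_SIZE == 0;
    }

    /** Largest block-aligned size that does not exceed the requested size. */
    static long roundDownToBlock(long partSize) {
        return partSize - (partSize % CIPHER_BLOCK_SIZE);
    }

    /** Models the second failure: completion needs at least one part flagged as last. */
    static boolean canComplete(List<Boolean> lastPartFlags) {
        return lastPartFlags.contains(Boolean.TRUE);
    }

    public static void main(String[] args) {
        long t1 = 5242881L; // T1: 5MB + 1 byte
        long t2 = 5242880L; // T2: exactly 5MB

        System.out.println("T1 remainder mod 16: " + (t1 % CIPHER_BLOCK_SIZE));
        System.out.println("T2 remainder mod 16: " + (t2 % CIPHER_BLOCK_SIZE));
        System.out.println("T1 valid as non-last part: " + isValidPartSize(t1, false));
        System.out.println("T2 valid as non-last part: " + isValidPartSize(t2, false));
        System.out.println("T1 rounded down: " + roundDownToBlock(t1));

        // Three aligned parts, none flagged as last (the T2 scenario in this model).
        System.out.println("T2 can complete: "
            + canComplete(Arrays.asList(false, false, false)));
    }
}
```

In this model T1 fails on alignment alone, while T2 passes alignment but still cannot complete because no part was ever marked as last, which matches the two different exceptions above.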
