Hi, we are trying to use org.apache.parquet.avro.AvroParquetWriter
to write a Parquet file to an S3 bucket. The file is written to the bucket
successfully, but we get an exception:

com.amazonaws.SdkClientException: Unable to verify integrity of data upload.

The goal is to resolve this exception. Note that the S3 bucket is encrypted
with SSE-KMS, not SSE-S3.

It appears that the exception is thrown by the code block at the link
below:

https://github.com/aws/aws-sdk-java/blob/fd409dee8ae23fb8953e0bb4dbde65536a7e0514/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1876

From the Amazon docs, the ETag is not the same as the MD5 digest of the
object when the S3 bucket is encrypted with SSE-KMS:

https://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html
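To illustrate why the check fails, here is a minimal sketch of the kind of
comparison involved: the client hashes the uploaded bytes with MD5 and
compares the result with the returned ETag. This is not the SDK's code; the
class name, method names and the "KMS-style" ETag value are made up for
illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class EtagCheckSketch {

    // Hex-encoded MD5 digest of the payload, zero-padded to 32 characters.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // True when the ETag (surrounding quotes stripped) equals the payload's MD5.
    static boolean etagMatchesMd5(byte[] data, String etag) {
        return md5Hex(data).equalsIgnoreCase(etag.replace("\"", ""));
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);

        // Case 1: the ETag is the MD5 of the object data.
        String md5StyleEtag = "\"" + md5Hex(body) + "\"";
        // Case 2: hypothetical ETag that is NOT an MD5 of the data (made-up value).
        String kmsStyleEtag = "\"0123456789abcdef0123456789abcdef\"";

        System.out.println("MD5-style ETag matches: " + etagMatchesMd5(body, md5StyleEtag));
        System.out.println("KMS-style ETag matches: " + etagMatchesMd5(body, kmsStyleEtag));
    }
}
```

In the second case a strict client-side comparison would report a mismatch
even though the upload itself succeeded, which matches what we observe.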

 *One possible fix is to pass the MD5 in the request header, or to set a
system property that disables the validation in
SkipMd5CheckStrategy.skipClientSideValidationPerPutResponse, as indicated in
the link below*

https://github.com/aws/aws-sdk-java/blob/99fe75a823d4b02f4e90fa0dda06a1558d5617a1/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/internal/SkipMd5CheckStrategy.java#L42
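For reference, a minimal sketch of the system-property route. It assumes the
property name read by the linked class is
com.amazonaws.services.s3.disablePutObjectMD5Validation (please verify this
against the SDK version in use), and that the writer ends up using the AWS
SDK in the same JVM. The class and method names are hypothetical.

```java
public class PutMd5ValidationSwitch {

    // Assumed property name; check SkipMd5CheckStrategy in your SDK version.
    static final String DISABLE_PUT_MD5 =
            "com.amazonaws.services.s3.disablePutObjectMD5Validation";

    // Sets the JVM-wide property; call this before any S3 client is used.
    static void disablePutObjectMd5Validation() {
        System.setProperty(DISABLE_PUT_MD5, "true");
    }

    public static void main(String[] args) {
        disablePutObjectMd5Validation();

        // The AvroParquetWriter would be built and used after this point,
        // so that the S3 client underneath sees the property.
        System.out.println(DISABLE_PUT_MD5 + "=" + System.getProperty(DISABLE_PUT_MD5));
    }
}
```

The same effect should be achievable without code changes by starting the
JVM with -Dcom.amazonaws.services.s3.disablePutObjectMD5Validation=true. On
a distributed engine this would presumably have to be set on every worker
JVM. Note that this turns off the client-side integrity check for all PUTs
in that JVM, not only for the Parquet writer.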

 The issue is that I cannot find a proper way to inject such configuration
into AvroParquetWriter. Is this possible? If so, could you show how to do
it?

 Thanks

Regin
