Hi Regin,

Parquet only handles the file format; it does not control how requests are
sent to S3. If you want to inject something into the request header, the S3
client library is probably the right place to do it.
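
As a rough sketch of the system-property route you mention, and assuming
your writer ends up going through the AWS SDK for Java v1 client (for
example via a Hadoop S3 filesystem), the property can be set at the JVM
level before any writer is created, so nothing has to be passed through
AvroParquetWriter. The property name below is the one I believe is defined
in the SkipMd5CheckStrategy class you linked; please check it against your
SDK version. The class and method names are only illustrative.

```java
public final class DisableS3PutMd5Check {

    // Assumption: this is the property name read by SkipMd5CheckStrategy in
    // AWS SDK for Java v1. Verify it against the SDK version you are using.
    static final String DISABLE_PUT_MD5_PROPERTY =
            "com.amazonaws.services.s3.disablePutObjectMD5Validation";

    private DisableS3PutMd5Check() {
    }

    // Call this once at startup, before the S3 client or any writer is created.
    static void disablePutMd5Validation() {
        System.setProperty(DISABLE_PUT_MD5_PROPERTY, "true");
    }

    public static void main(String[] args) {
        disablePutMd5Validation();
        // Print the property so it is easy to confirm it is set in this JVM.
        System.out.println(DISABLE_PUT_MD5_PROPERTY + "="
                + System.getProperty(DISABLE_PUT_MD5_PROPERTY));
    }
}
```

The same thing can be done without code changes by passing
-Dcom.amazonaws.services.s3.disablePutObjectMD5Validation=true on the JVM
command line. Keep in mind that this turns off the client-side integrity
check for puts in that JVM, not only for the KMS-encrypted bucket.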

Xinli

On Fri, May 13, 2022 at 3:21 PM Regin Quinoa <[email protected]> wrote:

> Hi, we are trying to use org.apache.parquet.avro.AvroParquetWriter
> <https://www.tabnine.com/code/java/packages/org.apache.parquet.avro>
> to write a Parquet file to an S3 bucket. The file is written to the bucket
> successfully, but we get an exception:
>
> com.amazonaws.SdkClientException: Unable to verify integrity of data
> upload.
>
> The goal is to resolve this exception when the S3 bucket is encrypted
> with SSE-KMS rather than SSE-S3.
>
> The exception appears to be thrown by the code block at the link below:
>
>
> https://github.com/aws/aws-sdk-java/blob/fd409dee8ae23fb8953e0bb4dbde65536a7e0514/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1876
>
> According to the Amazon documentation, the ETag is not the same as the MD5
> digest when the S3 bucket is encrypted with SSE-KMS:
>
>
> https://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html
>
> *A possible fix is to pass the MD5 in the request header, or to set a
> system property that disables validation in
> skipMd5CheckStrategy.skipClientSideValidationPerPutResponse, as indicated
> in the link:*
>
>
> https://github.com/aws/aws-sdk-java/blob/99fe75a823d4b02f4e90fa0dda06a1558d5617a1/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/internal/SkipMd5CheckStrategy.java#L42
>
> The issue is that I cannot find a proper way to inject such configuration
> into AvroParquetWriter. Is this possible? If so, could you show how to do
> it?
>
>  Thanks
>
> Regin
>


-- 
Xinli Shang
