Leonardo Contreras created HADOOP-13022:
-------------------------------------------
Summary: S3 MD5 check fails on Server Side Encryption with AWS when the default key is used
Key: HADOOP-13022
URL: https://issues.apache.org/jira/browse/HADOOP-13022
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.6.4
Reporter: Leonardo Contreras
When server-side encryption is enabled with the "aws:kms" value and no custom key is configured in the S3A filesystem, the AWS client fails while verifying the MD5 checksum (a minimal reproduction sketch follows the trace):
{noformat}
Exception in thread "main" com.amazonaws.AmazonClientException: Unable to verify integrity of data upload. Client calculated content hash (contentMD5: 1B2M2Y8AsgTpgAmY7PhCfg== in base 64) didn't match hash (etag: c29fcc646e17c348bce9cca8f9d205f5 in hex) calculated by Amazon S3. You may need to delete the data stored in Amazon S3. (metadata.contentMD5: null, md5DigestStream: com.amazonaws.services.s3.internal.MD5DigestCalculatingInputStream@65d9e72a, bucketName: abuse-messages-nonprod, key: venus/raw_events/checkpoint/825eb6aa-543d-46b1-801f-42de9dbc1610/)
	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1492)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.createEmptyObject(S3AFileSystem.java:1295)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.createFakeDirectory(S3AFileSystem.java:1272)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:969)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1888)
	at org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2077)
	at org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2074)
	at scala.Option.map(Option.scala:145)
	at org.apache.spark.SparkContext.setCheckpointDir(SparkContext.scala:2074)
	at org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:237)
{noformat}
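Reproduction sketch: per the stack trace, a plain mkdirs() is enough to hit the failure, since it creates a zero-byte "fake directory" object via putObject(). This is a minimal sketch, assuming the S3A encryption property fs.s3a.server-side-encryption-algorithm is used to carry the "aws:kms" value; the bucket name and path are placeholders, and credentials are assumed to be configured elsewhere:
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3ASseKmsDefaultKeyRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Request SSE with the default AWS-managed KMS key; no custom key is set.
    // (Property name assumed; bucket/path are placeholders.)
    conf.set("fs.s3a.server-side-encryption-algorithm", "aws:kms");

    FileSystem fs = FileSystem.get(URI.create("s3a://my-bucket/"), conf);
    // mkdirs() triggers createFakeDirectory -> createEmptyObject -> putObject,
    // where the client-side MD5 check fails against the KMS-encrypted ETag.
    fs.mkdirs(new Path("s3a://my-bucket/checkpoint/"));
  }
}
{code}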