[ https://issues.apache.org/jira/browse/SPARK-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169013#comment-15169013 ]
Michel Lemay commented on SPARK-13044:
--------------------------------------

I get something similar. When trying to read a KMS-encrypted file from S3, I get an error about AWS Signature Version 4. Running a locally compiled tip of the master branch (2.0.0-SNAPSHOT), I get a verbose error message about the signature version:

{noformat}
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: Service Error Message. -- ResponseCode: 400, ResponseStatus: Bad Request, XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidArgument</Code><Message>Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4.</Message><ArgumentName>Authorization</ArgumentName><ArgumentValue>null</ArgumentValue><RequestId>...</RequestId><HostId>...</HostId></Error>
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:464)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:210)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at org.apache.hadoop.fs.s3native.$Proxy17.retrieve(Unknown Source)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:627)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:767)
	at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
	at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
	at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:248)
	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:209)
	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:69)
	at org.apache.spark.scheduler.Task.run(Task.scala:81)
{noformat}

Under Spark 1.6.0, the same failure surfaces as an NPE:

{noformat}
java.lang.NullPointerException
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:152)
	at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:89)
	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:63)
	at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:126)
	at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
	at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
{noformat}

> saveAsTextFile() doesn't support s3 Signature Version 4
> -------------------------------------------------------
>
>                 Key: SPARK-13044
>                 URL: https://issues.apache.org/jira/browse/SPARK-13044
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 1.4.0
>         Environment: CentOS
>            Reporter: Xin Ren
>              Labels: aws-s3
>
> I have two clusters, US and EU-Frankfurt, deployed with the same configs on AWS.
> The application in EU-Frankfurt cannot save data to EU-Frankfurt S3, but the US one can save to US S3.
> I checked and found that EU-Frankfurt supports Signature Version 4 only:
> http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
> Code I'm using:
> {code:java}
> val s3WriteEndpoint = "s3n://access_key:secret_key@bucket_name/data/12345"
> rdd.saveAsTextFile(s3WriteEndpoint)
> {code}
> So from my issue I guess saveAsTextFile() is using Signature Version 2? How can Version 4 be supported?
> I tried to dig into the code:
> https://github.com/apache/spark/blob/f14922cff84b1e0984ba4597d764615184126bdc/core/src/main/scala/org/apache/spark/rdd/RDD.scala

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
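Note on the s3n:// URI in the snippet above: that scheme is backed by the jets3t library, which signs requests with Signature Version 2 only, so any region or feature that requires V4 (Frankfurt buckets, SSE-KMS objects) will fail this way. Two workaround sketches, neither verified here: (1) force jets3t (0.9.4+) to V4 signing via a jets3t.properties file on the driver and executor classpath, with an explicit regional endpoint; (2) switch to the s3a:// connector from hadoop-aws, which uses the AWS SDK and supports V4 when pointed at a regional endpoint. The endpoint value below is Frankfurt's and the credentials are placeholders; the property names come from jets3t's and hadoop-aws's configuration references:

{code}
# jets3t.properties (on driver and executor classpath) -- workaround (1)
s3service.s3-endpoint=s3.eu-central-1.amazonaws.com
storage-service.request-signature-version=AWS4-HMAC-SHA256

# spark-defaults.conf -- workaround (2), needs hadoop-aws + aws-java-sdk jars
spark.hadoop.fs.s3a.endpoint    s3.eu-central-1.amazonaws.com
spark.hadoop.fs.s3a.access.key  YOUR_ACCESS_KEY
spark.hadoop.fs.s3a.secret.key  YOUR_SECRET_KEY
{code}

With (2), the write becomes {{rdd.saveAsTextFile("s3a://bucket_name/data/12345")}} instead of the s3n:// URI, and credentials move out of the URI into configuration (the same properties can also be set programmatically via {{sc.hadoopConfiguration.set(...)}}).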