Thanks for the input, Steven and Hariharan. I think this ended up being a
combination of a misconfiguration of the credential providers I was using
*and* using the wrong set of credentials for the test data I was trying to
access.
I was able to get this working with both Hadoop 2.8 and 3.1 by pulling
together the right configuration and credentials.
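For anyone who hits the same thing: on Hadoop 2.8+ you can pin S3A to a single credentials provider so it can't silently fall through a provider chain to the wrong credentials. A minimal sketch (the app name is made up, the provider shown is just one reasonable choice, and I'm assuming the keys live in environment variables; this isn't necessarily what my exact setup looked like):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: pin S3A to one credentials provider so it can't fall through
// a chain of providers and pick up the wrong credentials.
// SimpleAWSCredentialsProvider reads fs.s3a.access.key / fs.s3a.secret.key.
val spark = SparkSession.builder()
  .appName("s3a-credentials-check") // hypothetical app name
  .config("spark.hadoop.fs.s3a.aws.credentials.provider",
    "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider")
  .config("spark.hadoop.fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
  .config("spark.hadoop.fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
  .getOrCreate()
```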
If you're using Hadoop 2.7 or below, you may also need the following
Hadoop settings:
```
fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
fs.AbstractFileSystem.s3.impl=org.apache.hadoop.fs.s3a.S3A
fs.AbstractFileSystem.s3a.impl=org.apache.hadoop.fs.s3a.S3A
```
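In case it's useful, the same settings can also be applied programmatically on the SparkContext's Hadoop configuration rather than via config files (a sketch; it just mirrors the properties above):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
val hadoopConf = spark.sparkContext.hadoopConfiguration

// Mirror the settings above: map both s3:// and s3a:// URIs onto the
// S3A implementations for the FileSystem and AbstractFileSystem APIs.
hadoopConf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
hadoopConf.set("fs.AbstractFileSystem.s3.impl", "org.apache.hadoop.fs.s3a.S3A")
hadoopConf.set("fs.AbstractFileSystem.s3a.impl", "org.apache.hadoop.fs.s3a.S3A")
```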
To successfully read from S3 using s3a, I've also had to set
```
spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
```
in addition to `spark.hadoop.fs.s3a.access.key` and
`spark.hadoop.fs.s3a.secret.key`. I've also needed to ensure Spark has
access to the AWS SDK jar; I downloaded it and made sure it was on
Spark's classpath.
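If you build with sbt, an alternative to downloading the jar by hand is to declare `hadoop-aws` as a dependency, which pulls in a matching `aws-java-sdk` transitively. A build.sbt sketch (the version below is illustrative and must match your Hadoop build):

```scala
// build.sbt sketch: hadoop-aws brings in the matching aws-java-sdk
// transitively, so Spark jobs can use the s3a:// scheme.
// The version is illustrative; it must match your Hadoop version.
libraryDependencies += "org.apache.hadoop" % "hadoop-aws" % "2.8.5"
```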
Hello,
I'm attempting to run Spark within a Docker container with the hope of
eventually running Spark on Kubernetes. Nearly all the data we currently
process with Spark is stored in S3, so I need to be able to interface with
it using the S3A filesystem.
I feel like I've gotten close to getting this to work.
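For reference, the kind of access I'm ultimately after looks like this (the bucket and path are placeholders, not our real data):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("s3a-read-test") // hypothetical app name
  .getOrCreate()

// Hypothetical bucket and key, just to show the s3a:// scheme in use.
val df = spark.read.parquet("s3a://my-bucket/path/to/data")
df.show(5)
```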