Re: Hadoop AWS module (Spark) is inventing a secret-key each time
I have sent the message there as well; I thought I would send it here too, because I'm actually setting up the hadoopConf.

On Wed, Mar 8, 2017 at 6:49 PM, Ravi Prakash wrote:
> Sorry to hear about your travails.
>
> I think you might be better off asking the Spark community:
> http://spark.apache.org/community.html
>
> On Wed, Mar 8, 2017 at 3:22 AM, Jonhy Stack wrote:
>> [...]
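A minimal sketch of an alternative way to pass the same settings, assuming stock PySpark and the placeholder names (ACCESSKEY, SECRETKEY, mybucket) from the original message at the bottom of this thread: any SparkConf property prefixed with spark.hadoop. is copied into the Hadoop configuration when the SparkContext is created, so the s3a keys are in place before the first request, instead of mutating hadoopConfiguration() afterwards.

    from pyspark import SparkConf, SparkContext

    # Sketch: pass the s3a credentials as "spark.hadoop.*" properties, which
    # Spark copies into the Hadoop configuration at context creation time.
    conf = (SparkConf()
            .setAppName("s3a-read")
            .set("spark.hadoop.fs.s3a.access.key", "ACCESSKEY")
            .set("spark.hadoop.fs.s3a.secret.key", "SECRETKEY")
            .set("spark.hadoop.fs.s3a.impl",
                 "org.apache.hadoop.fs.s3a.S3AFileSystem"))
    spark_context = SparkContext(conf=conf)

    # Same read as in the original message.
    logs = spark_context.textFile("s3a://mybucket/logs/*")

The same properties can also be given on the command line, e.g. spark-submit --conf spark.hadoop.fs.s3a.access.key=... test.py.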
Re: Hadoop AWS module (Spark) is inventing a secret-key each time
Sorry to hear about your travails.

I think you might be better off asking the Spark community:
http://spark.apache.org/community.html

On Wed, Mar 8, 2017 at 3:22 AM, Jonhy Stack wrote:
> [...]
Hadoop AWS module (Spark) is inventing a secret-key each time
Hi,

I'm trying to read an S3 bucket from Spark, and up until today Spark has always complained that the request returns 403:

    hadoopConf = spark_context._jsc.hadoopConfiguration()
    hadoopConf.set("fs.s3a.access.key", "ACCESSKEY")
    hadoopConf.set("fs.s3a.secret.key", "SECRETKEY")
    hadoopConf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
    logs = spark_context.textFile("s3a://mybucket/logs/*")

Spark was saying: Invalid Access key [ACCESSKEY]

However, with the same ACCESSKEY and SECRETKEY this was working with aws-cli:

    aws s3 ls mybucket/logs/

and in Python boto3 this was working:

    resource = boto3.resource("s3", region_name="us-east-1")
    resource.Object("mybucket", "logs/text.py") \
        .put(Body=open("text.py", "rb"), ContentType="text/x-py")

So my credentials ARE valid, and the problem is definitely something with Spark.

Today I decided to turn on DEBUG logging for all of Spark, and to my surprise... Spark is NOT using the [SECRETKEY] I have provided but instead... adds a random one???

    17/03/08 10:40:04 DEBUG request: Sending Request: HEAD https://mybucket.s3.amazonaws.com / Headers: (Authorization: AWS ACCESSKEY:**[RANDOM-SECRET-KEY]**, User-Agent: aws-sdk-java/1.7.4 Mac_OS_X/10.11.6 Java_HotSpot(TM)_64-Bit_Server_VM/25.65-b01/1.8.0_65, Date: Wed, 08 Mar 2017 10:40:04 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, )

This is why it still returns 403! Spark is not using the key I provide with fs.s3a.secret.key but instead invents a random one EACH time (every time I submit the job the random secret key is different).

For the record, I'm running this locally on my machine (OSX) with this command:

    spark-submit --packages com.amazonaws:aws-java-sdk-pom:1.11.98,org.apache.hadoop:hadoop-aws:2.7.3 test.py

Could someone enlighten me on this?
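One hedged aside on the DEBUG line above, offered as an illustration rather than a diagnosis: with the signature V2 scheme used by aws-sdk-java 1.7.4, an Authorization header has the form AWS ACCESSKEY:signature, where the part after the colon is a per-request, base64-encoded HMAC-SHA1 over the request details rather than the secret key itself, which would explain why it changes on every submission. A minimal sketch of that computation, with the Date and Content-Type copied from the log line and "SECRETKEY" standing in for the real secret:

    import base64
    import hashlib
    import hmac

    # Sketch of AWS signature V2: the Authorization header is
    #   "AWS " + access_key + ":" + base64(HMAC-SHA1(secret_key, string_to_sign))
    # String-to-sign layout: verb, Content-MD5, Content-Type, Date,
    # then the canonicalized resource. Values taken from the log above.
    string_to_sign = ("HEAD\n"                                            # HTTP verb
                      "\n"                                                # Content-MD5 (empty)
                      "application/x-www-form-urlencoded; charset=utf-8\n"  # Content-Type
                      "Wed, 08 Mar 2017 10:40:04 GMT\n"                   # Date
                      "/mybucket/")                                       # canonicalized resource
    signature = base64.b64encode(
        hmac.new(b"SECRETKEY", string_to_sign.encode("utf-8"),
                 hashlib.sha1).digest()).decode("ascii")
    print("Authorization: AWS ACCESSKEY:" + signature)

Run with the real secret, this should reproduce the value the SDK puts after the colon, so the header alone does not show whether the configured secret was picked up.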