[
https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun resolved SPARK-35974.
-----------------------------------
Resolution: Cannot Reproduce
Could you try Apache Spark 3.1.2, please, [~toopt4]? Apache Spark 2.4 is EOL.
The log shows `spark-2.3.4-bin-hadoop2.7`, while the affected version is listed
as 2.4.6; both are too old.
> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -------------------------------------------------------------------------
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.6
> Reporter: t oo
> Priority: Major
>
> {code:java}
> /var/lib/spark-2.3.4-bin-hadoop2.7/bin/spark-submit --master
> spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf
> spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf
> spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf
> spark.hadoop.fs.s3a.secret.key='redact2' --conf
> spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf
> spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf
> spark.hadoop.fs.s3a.session.token='redact3' --conf
> spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf
> spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf
> spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
> --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf
> spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3'
> --total-executor-cores 4 --executor-cores 2 --executor-memory 2g
> --driver-memory 1g --name lin1 --deploy-mode cluster --conf
> spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku
> s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
> {code}
> Running the above command gives the stack trace below:
>
> {code:java}
> Exception from the cluster:
> java.nio.file.AccessDeniedException: s3a://mybuc/metorikku_2.11.jar:
> getFileStatus on s3a://mybuc/metorikku_2.11.jar:
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon
> S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended
> Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
> org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
> org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
> org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
> org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
> org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
> org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
> org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
> org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code}
> All the EC2s in the Spark cluster have access to S3 only via STS tokens. The
> jar itself reads CSVs from S3 using the tokens, and everything works if
> either (1) I change the command line to point to local jars on the EC2, or
> (2) I use port 7077/client mode instead of cluster mode. But it seems the jar
> itself can't be launched from S3, as if the tokens are not being picked up
> properly.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]