[
https://issues.apache.org/jira/browse/SPARK-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-17307.
----------------------------------
Resolution: Incomplete
> Document what all access is needed on S3 bucket when trying to save a model
> ---------------------------------------------------------------------------
>
> Key: SPARK-17307
> URL: https://issues.apache.org/jira/browse/SPARK-17307
> Project: Spark
> Issue Type: Documentation
> Reporter: Aseem Bansal
> Priority: Minor
> Labels: bulk-closed
>
> I ran into this lack of documentation when trying to save a model to S3.
> Initially I thought only write access should be needed. Then I found that
> delete access is also required, to remove temporary files. After requesting
> delete access and trying again, I now get the error
> {code}
> Exception in thread "main" org.apache.hadoop.fs.s3.S3Exception:
> org.jets3t.service.S3ServiceException: S3 PUT failed for
> '/dev-qa_%24folder%24' XML Error Message
> {code}
> The error can be reproduced with the code below:
> {code}
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.ml.PipelineModel;
> import org.apache.spark.sql.SparkSession;
>
> SparkSession sparkSession = SparkSession
>         .builder()
>         .appName("my app")
>         .master("local")
>         .getOrCreate();
> JavaSparkContext jsc = new JavaSparkContext(sparkSession.sparkContext());
> jsc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", <ACCESS_KEY>);
> jsc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", <SECRET_ACCESS_KEY>);
>
> // Create a PipelineModel
> PipelineModel pipelineModel = ...;
> pipelineModel.write().overwrite().save("s3n://<BUCKET>/dev-qa/modelTest");
> {code}
> This back and forth could be avoided if the documentation clearly stated
> which S3 permissions Spark needs in order to write to S3. It would also be
> great to explain why each of those permissions is needed.
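> Based on the behavior above (write plus delete of temporary files, and
> bucket listing for the rename step), a sketch of an IAM bucket policy that
> appears sufficient; the exact action set is an assumption from observed
> behavior, not confirmed by Spark documentation, and <BUCKET> is a
> placeholder:
> {code}
> {
>   "Version": "2012-10-17",
>   "Statement": [
>     {
>       "Effect": "Allow",
>       "Action": ["s3:ListBucket"],
>       "Resource": "arn:aws:s3:::<BUCKET>"
>     },
>     {
>       "Effect": "Allow",
>       "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
>       "Resource": "arn:aws:s3:::<BUCKET>/*"
>     }
>   ]
> }
> {code}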
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]