[
https://issues.apache.org/jira/browse/SPARK-15965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-15965.
-------------------------------
Resolution: Duplicate
> No FileSystem for scheme: s3n or s3a spark-2.0.0 and spark-1.6.1
> -----------------------------------------------------------------
>
> Key: SPARK-15965
> URL: https://issues.apache.org/jira/browse/SPARK-15965
> Project: Spark
> Issue Type: Bug
> Components: Build
> Affects Versions: 1.6.1
> Environment: Debian GNU/Linux 8
> java version "1.7.0_79"
> Reporter: thauvin damien
> Original Estimate: 8h
> Remaining Estimate: 8h
>
> The Spark programming guide explains that Spark can create distributed
> datasets on Amazon S3.
> But since the "Pre-built for Hadoop 2.6" packages, S3 access no longer works
> with either s3n or s3a:
> sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", "XXXZZZHHH")
> sc.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", "xxxxxxxxxxxxxxxxxxxxxxxxxxx")
> val lines = sc.textFile("s3a://poc-XXX/access/2016/02/20160201202001_xxx.log.gz")
>
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
> org.apache.hadoop.fs.s3a.S3AFileSystem not found
> This happens with every version of Spark: spark-1.3.1, spark-1.6.1, and even
> spark-2.0.0 with Hadoop 2.7.2.
> I understand this is a Hadoop issue (SPARK-7442), but could you add some
> documentation explaining which jars we need to add, and where (for a
> standalone installation)?
> Are hadoop-aws-x.x.x.jar and aws-java-sdk-x.x.x.jar enough?
> Which environment variables do we need to set, and which files do we need to
> modify? Is it "$CLASSPATH", or the "spark.driver.extraClassPath" and
> "spark.executor.extraClassPath" properties in "spark-defaults.conf"?
> Note that it still works with spark-1.6.1 pre-built with Hadoop 2.4.
> Thanks
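For readers hitting the same ClassNotFoundException: one common workaround is to put the two jars the reporter mentions on both the driver and executor classpaths via spark-defaults.conf. This is only a sketch, not an official recommendation from this issue; the paths are hypothetical, and the jar versions must match the Hadoop version your Spark build was compiled against (aws-java-sdk 1.7.4 is the SDK version that hadoop-aws 2.6/2.7 were built with).

```
# spark-defaults.conf (sketch; /opt/jars paths are hypothetical)
spark.driver.extraClassPath    /opt/jars/hadoop-aws-2.7.2.jar:/opt/jars/aws-java-sdk-1.7.4.jar
spark.executor.extraClassPath  /opt/jars/hadoop-aws-2.7.2.jar:/opt/jars/aws-java-sdk-1.7.4.jar
```

Alternatively, `spark-shell --packages org.apache.hadoop:hadoop-aws:2.7.2` pulls in hadoop-aws and its SDK dependency without editing any config file.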
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]