[
https://issues.apache.org/jira/browse/SPARK-15965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432432#comment-15432432
]
Steve Loughran commented on SPARK-15965:
----------------------------------------
This is being fixed, with tests, as part of my work in SPARK-7481. The manual
workaround is:
Spark 2:
# Get the same Hadoop version that your Spark version is built against.
# Add hadoop-aws and everything matching amazon-*.jar into the JARs subdir.
Spark 1.6+:
This needs my patch and a rebuild of the Spark assembly. However, once that
patch is in, trying to use the assembly without the AWS JARs will stop Spark
from starting, unless you move up to Hadoop 2.7.3.
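To check whether the JARs were picked up, something like the following can be pasted into spark-shell. This is only a minimal sketch: `classAvailable` is a hypothetical helper name, not part of Spark or Hadoop, and the class name is the one from the ClassNotFoundException quoted in the issue. It only confirms that the class can be loaded, not that S3 access is configured correctly.

```scala
// Hypothetical helper: returns true if the named class can be loaded
// from the classpath of the current JVM (e.g. the spark-shell driver).
def classAvailable(name: String): Boolean =
  try {
    Class.forName(name)
    true
  } catch {
    // ClassNotFoundException: the class itself is missing.
    // LinkageError: the class was found but one of its dependencies was not.
    case _: ClassNotFoundException | _: LinkageError => false
  }

// Class named in the exception reported in this issue.
val s3aClass = "org.apache.hadoop.fs.s3a.S3AFileSystem"

if (classAvailable(s3aClass))
  println(s"$s3aClass can be loaded")
else
  println(s"$s3aClass cannot be loaded; the hadoop-aws JAR or its dependencies may be missing")
```

This checks the driver's classpath only; executors would need the same JARs.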
> No FileSystem for scheme: s3n or s3a spark-2.0.0 and spark-1.6.1
> -----------------------------------------------------------------
>
> Key: SPARK-15965
> URL: https://issues.apache.org/jira/browse/SPARK-15965
> Project: Spark
> Issue Type: Bug
> Components: Build
> Affects Versions: 1.6.1
> Environment: Debian GNU/Linux 8
> java version "1.7.0_79"
> Reporter: thauvin damien
> Original Estimate: 8h
> Remaining Estimate: 8h
>
> The Spark programming guide explains that Spark can create distributed
> datasets on Amazon S3.
> But since the pre-built "Hadoop 2.6" package, S3 access doesn't work with
> s3n or s3a.
> sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", "XXXZZZHHH")
> sc.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", "xxxxxxxxxxxxxxxxxxxxxxxxxxx")
> val lines = sc.textFile("s3a://poc-XXX/access/2016/02/20160201202001_xxx.log.gz")
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
> org.apache.hadoop.fs.s3a.S3AFileSystem not found
> This happens with any version of Spark: spark-1.3.1, spark-1.6.1, even
> spark-2.0.0 with hadoop.7.2.
> I understand this is a Hadoop issue (SPARK-7442), but can you add some
> documentation explaining which JARs we need to add and where (for a
> standalone installation)?
> Are "hadoop-aws-x.x.x.jar" and "aws-java-sdk-x.x.x.jar" enough?
> Which environment variables do we need to set, and which files do we need
> to modify?
> Is it "$CLASSPATH", or the "spark.driver.extraClassPath" and
> "spark.executor.extraClassPath" variables in "spark-defaults.conf"?
> It still works with spark-1.6.1 pre-built with hadoop2.4, though.
> Thanks