Hi 

I'm trying to run a small piece of code on Spark Streaming. It sets the S3
keys on a SparkContext object, which is then passed into a StreamingContext.
However, I get the error below -- it seems the StreamingContext does not use
the Hadoop configuration on the worker threads. The same code works fine if I
run it in Spark core (batch mode) without streaming.

java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key
must be specified as the username or password (respectively) of a s3n URL,
or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey
properties (respectively). 
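
For reference, the batch-mode equivalent does pick the keys up. A minimal
sketch (same standalone master, jars, bucket path, and env var names as in my
code below):

import org.apache.spark.SparkContext

val sc = new SparkContext(spark_master, "batch-test",
  System.getenv("SPARK_HOME"), external_jars)
// Keys set on the context's Hadoop configuration, exactly as below.
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId",
  System.getenv("ds_awsAccessKeyId"))
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey",
  System.getenv("ds_awsSecretAccessKey"))
// A plain RDD read succeeds here -- no IllegalArgumentException:
println(sc.textFile("s3n://my-bucket/syslog-ng/2014-01-24/").count())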


// my code:

import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Keep old metadata from piling up in a long-running streaming job.
System.setProperty("spark.cleaner.ttl", "3600")

// Standalone master URL built from the environment.
val spark_master = "spark://" + System.getenv("SPARK_MASTER_IP") +
  ":" + System.getenv("SPARK_MASTER_PORT")

// Jars shipped to the executors.
val external_jars = Seq(
  "target/scala-2.9.3/test_2.9.3-1.0.jar",
  "/opt/json4s-core_2.9.3-3.2.2.jar",
  "/opt/json4s-native_2.9.3-3.2.2.jar",
  "/opt/json4s-ast_2.9.3-3.2.2.jar")

val sc = new SparkContext(spark_master, "test",
  System.getenv("SPARK_HOME"), external_jars)

// S3 credentials set on this context's Hadoop configuration.
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId",
  System.getenv("ds_awsAccessKeyId"))
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey",
  System.getenv("ds_awsSecretAccessKey"))

// Streaming context built on the same SparkContext, 5-second batches.
val ssc = new StreamingContext(sc, Seconds(5))

// This is the call that triggers the IllegalArgumentException.
val file = ssc.textFileStream("s3n://my-bucket/syslog-ng/2014-01-24/")
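
The error message itself suggests embedding the credentials in the s3n URL
instead. I have not tried this yet -- a sketch (the keys would need
URL-encoding in case the secret contains characters like '/' or '+'):

import java.net.URLEncoder

// Credentials as the username/password part of the s3n URL,
// URL-encoded so a '/' or '+' in the secret does not break the URI.
val ak = URLEncoder.encode(System.getenv("ds_awsAccessKeyId"), "UTF-8")
val sk = URLEncoder.encode(System.getenv("ds_awsSecretAccessKey"), "UTF-8")
val fileViaUrl = ssc.textFileStream(
  "s3n://" + ak + ":" + sk + "@my-bucket/syslog-ng/2014-01-24/")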



-----
-- Robin Li