Reading non-UTF-8 files via Spark Streaming

2015-11-16 Thread tarek_abouzeid
Hi, I am trying to read files that are ISO-8859-6 encoded via Spark Streaming, but the default encoding for "ssc.textFileStream" is UTF-8, so I don't get the data properly. Is there a way to change the default encoding for textFileStream, or a way to read the file's raw bytes so that I can
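
One possible workaround (a minimal sketch, not tested against this exact setup): textFileStream is essentially fileStream with a hard-coded UTF-8 decode, so you can call fileStream directly and decode the raw bytes with the charset you need. The input directory and batch interval below are placeholders:

    import java.nio.charset.Charset
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(new SparkConf().setAppName("NonUtf8Stream"), Seconds(10))
    val charset = Charset.forName("ISO-8859-6")

    // fileStream hands back each line as a hadoop Text, i.e. undecoded bytes,
    // so we can decode with any charset instead of the default UTF-8
    val lines = ssc.fileStream[LongWritable, Text, TextInputFormat]("hdfs:///placeholder/input")
      .map { case (_, text) => new String(text.getBytes, 0, text.getLength, charset) }

    lines.print()
    ssc.start()
    ssc.awaitTermination()

Note that Text.getBytes returns the backing array, which is only valid up to getLength, hence the explicit offset and length in the String constructor.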

Re: Spark Streaming on YARN Input from Flume

2015-03-15 Thread tarek_abouzeid
Have you fixed this issue?

Saving a DStream into a single file

2015-03-15 Thread tarek_abouzeid
I am doing the word count example on a Flume stream and trying to save the output as text files in HDFS, but in the save directory I get multiple subdirectories, each containing small files. I wonder if there is a way to append to one large file instead of saving in multiple files, as I intend to
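
One possible approach (a minimal sketch, assuming HDFS append support is enabled and each batch is small enough to collect to the driver; the output path and the wordCounts name are placeholders): append each batch to a single HDFS file from foreachRDD:

    import java.nio.charset.StandardCharsets
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // wordCounts: DStream[(String, Int)] produced by the word count step
    wordCounts.foreachRDD { rdd =>
      // collect runs on the driver; only safe for small per-batch output
      val lines = rdd.map { case (word, count) => s"$word\t$count" }.collect()
      if (lines.nonEmpty) {
        val fs = FileSystem.get(new Configuration())
        val path = new Path("/placeholder/wordcounts.txt")
        // append to the file if it already exists, otherwise create it
        val out = if (fs.exists(path)) fs.append(path) else fs.create(path)
        try lines.foreach(line => out.write((line + "\n").getBytes(StandardCharsets.UTF_8)))
        finally out.close()
      }
    }

Alternatively, rdd.coalesce(1) before saving reduces the file count per batch, but still produces one directory per batch interval.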

Re: deploying Spark on a standalone cluster

2015-03-15 Thread tarek_abouzeid
I was having a similar issue, but it was with Spark and Flume integration: I was getting a "failed to bind" error, but got it fixed by shutting down the firewall on both machines (make sure "service iptables status" reports the firewall is stopped).

Re: Store DStreams into Hive using Hive Streaming

2015-03-02 Thread tarek_abouzeid
If you have found a solution for this, could you please post it?

Store Spark data into a Hive table

2015-03-01 Thread tarek_abouzeid
I am trying to store my word count output into the Hive data warehouse. My pipeline is: Flume streaming => Spark does the word count => store the result in a Hive table for later visualization. My code is: import org.apache.spark.SparkContext import org.apache.spark.SparkContext._ import
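
For the Hive step, a minimal sketch assuming Spark 1.3+ with a HiveContext; the wordCounts DStream, table name, and column names are placeholders for illustration:

    import org.apache.spark.sql.hive.HiveContext

    // wordCounts: DStream[(String, Int)] produced by the word count step
    wordCounts.foreachRDD { rdd =>
      val hiveContext = new HiveContext(rdd.sparkContext)
      import hiveContext.implicits._
      // expose this batch as a temporary table, then insert it into the Hive table
      rdd.toDF("word", "count").registerTempTable("wc_batch")
      hiveContext.sql("CREATE TABLE IF NOT EXISTS wordcounts (word STRING, count INT)")
      hiveContext.sql("INSERT INTO TABLE wordcounts SELECT word, count FROM wc_batch")
    }

In practice you would reuse a single HiveContext rather than constructing one per batch.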