Spark Streaming problem with Yarn

2017-02-28 Thread Amjad ALSHABANI
Hi everyone, I'm experiencing a problem with my Spark Streaming job when running it on YARN. The problem appears only when running this application in a YARN queue along with other Tez/MR applications. The problem is in processing time, which exceeds 1 minute for batches of 1 second. Normally whe

Re: [Spark Streaming][Problem with DataFrame UDFs]

2016-01-21 Thread Jean-Pierre OCALAN
called 10 times, although it gets consistently called 50 times, but the resulting DF is correct and executing a count() properly returns 10, as expected. I have changed my code to work directly with RDDs using mapPartitions

Re: [Spark Streaming][Problem with DataFrame UDFs]

2016-01-21 Thread Jean-Pierre OCALAN
>> As additional information, I have set spark.speculation to false and no tasks failed. I am working on a smaller example that would isolate this potential issue, but in the meantime I

Re: [Spark Streaming][Problem with DataFrame UDFs]

2016-01-21 Thread Cody Koeninger
alse and no tasks failed. I am working on a smaller example that would isolate this potential issue, but in the meantime I would like to know if somebody encountered this issue. Thank you. -- View this message in context: http://apache-sp

[Spark Streaming][Problem with DataFrame UDFs]

2016-01-20 Thread jpocalan
.1001560.n3.nabble.com/Spark-Streaming-Problem-with-DataFrame-UDFs-tp26024.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands

Re: spark streaming problem saveAsTextFiles() does not write valid JSON to HDFS

2015-11-19 Thread Andy Davidson
Turns out the data is in Python format. The ETL pipeline was overwriting the original data. Andy From: Andrew Davidson Date: Thursday, November 19, 2015 at 6:58 PM To: "user @spark" Subject: spark streaming problem saveAsTextFiles() does not write valid JSON to HDFS > I am working on a

spark streaming problem saveAsTextFiles() does not write valid JSON to HDFS

2015-11-19 Thread Andy Davidson
I am working on a simple POS. I am running into a really strange problem. I wrote a Java streaming app that collects tweets using the Spark Twitter package and stores them to disk in JSON format. I noticed that when I run the code on my Mac, the files are written to the local file system as I expect

Re: SPARK STREAMING PROBLEM

2015-05-28 Thread Sourav Chandra
om logs nothing of g gets printed. -- Forwarded message -- From: Animesh Baranawal Date: Thu, May 28, 2015 at 6:57 PM Subject: SPARK STREAMING PROBLEM To: user@spark.apache.org Hi, I am trying to extract the filenames from which a Dst

Fwd: SPARK STREAMING PROBLEM

2015-05-28 Thread Animesh Baranawal
I also started the streaming context by running ssc.start(), but still, apart from the logs, nothing of g gets printed. -- Forwarded message -- From: Animesh Baranawal Date: Thu, May 28, 2015 at 6:57 PM Subject: SPARK STREAMING PROBLEM To: user@spark.apache.org Hi, I am trying to

Re: SPARK STREAMING PROBLEM

2015-05-28 Thread Sourav Chandra
You must start the StreamingContext by calling ssc.start(). On Thu, May 28, 2015 at 6:57 PM, Animesh Baranawal <animeshbarana...@gmail.com> wrote: > Hi, I am trying to extract the filenames from which a DStream is generated by parsing the toDebugString method on RDD. I am implementing the

SPARK STREAMING PROBLEM

2015-05-28 Thread Animesh Baranawal
Hi, I am trying to extract the filenames from which a DStream is generated by parsing the output of the toDebugString method on RDD. I am implementing the following code in spark-shell: import org.apache.spark.streaming.{StreamingContext, Seconds} val ssc = new StreamingContext(sc,Seconds(10)) val lines = ssc.t