Re: Spark streaming app that processes Kafka DStreams produces no output and no error

2017-01-14 Thread shyla deshpande
Hello, I want to add that I don't even see the streaming tab in the application UI on port 4040 when I run it on the cluster. The cluster on EC2 has 1 master node and 1 worker node. The cores used on the worker node are 2 of 2 and the memory used is 6GB of 6.3GB. Can I run a spark streaming job

Spark streaming app that processes Kafka DStreams produces no output and no error

2017-01-13 Thread shyla deshpande
Hello, My spark streaming app that reads kafka topics and prints the DStream works fine on my laptop, but on AWS cluster it produces no output and no errors. Please help me debug. I am using Spark 2.0.2 and kafka-0-10 Thanks The following is the output of the spark streaming app... 17/01/14

Re: [Spark Streaming] NoClassDefFoundError : StateSpec

2017-01-12 Thread Shixiong(Ryan) Zhu
llowing > error: > > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/spark/streaming/StateSpec$ > Caused by: java.lang.ClassNotFoundException: > org.apache.spark.streaming.StateSpec$ > > Build.sbt > > scalaVersion := "2.

[Spark Streaming] NoClassDefFoundError : StateSpec

2017-01-12 Thread Ramkumar Venkataraman
Spark: 1.6.1 I am trying to use the new mapWithState API and I am getting the following error: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/StateSpec$ Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.StateSpec$
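
The usual cause: StateSpec lives in the spark-streaming artifact, so a runtime NoClassDefFoundError generally means that jar is missing from the runtime classpath or its version does not match the cluster. A minimal build.sbt sketch (an assumption, not the poster's actual build file):

    // build.sbt -- hedged sketch; versions must match the cluster that runs the job.
    scalaVersion := "2.10.6"
    libraryDependencies ++= Seq(
      // "provided" assumes spark-submit supplies these jars at runtime;
      // drop it if the app is launched outside spark-submit.
      "org.apache.spark" %% "spark-core"      % "1.6.1" % "provided",
      "org.apache.spark" %% "spark-streaming" % "1.6.1" % "provided"
    )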

Re: Docker image for Spark streaming app

2017-01-08 Thread shyla deshpande
started. Thanks On Sun, Jan 8, 2017 at 1:52 PM, shyla deshpande <deshpandesh...@gmail.com> wrote: > Thanks really appreciate. > > On Sun, Jan 8, 2017 at 1:02 PM, vvshvv <vvs...@gmail.com> wrote: > >> Hi, >> >> I am running spark streaming job using s

Re: Docker image for Spark streaming app

2017-01-08 Thread vvshvv
Hi, I am running a spark streaming job using spark jobserver via this image: https://hub.docker.com/r/depend/spark-jobserver/. It works well in standalone mode (using mesos, the job does not make progress). The spark jobserver that supports Spark 2.0 has a new API that is only suitable for non-streaming jobs

Docker image for Spark streaming app

2017-01-08 Thread shyla deshpande
I am looking for a docker image that I can use from docker hub for running a spark streaming app with scala and spark 2.0+. I am new to docker and unable to find an image on docker hub that suits my needs. Please let me know if anyone is using docker for a spark streaming app and share your

How do I read data in dockerized kafka from a spark streaming application

2017-01-06 Thread shyla deshpande
My kafka is in a docker container. How do I read this Kafka data in my Spark streaming app? Also, I need to write data from Spark Streaming to a Cassandra database which is in a docker container. I appreciate any help. Thanks.

Re: Re: Re: Spark Streaming prediction

2017-01-03 Thread Marco Mistroni
hours (one value > per minute) should be predicted. > > Thank you in advance. > > Regards, > Daniela > > *Sent:* Monday, 02 January 2017 at 22:30 > *From:* "Marco Mistroni" <mmistr...@gmail.com> > *To:* "Daniela S" <daniela_4...@gmx

Aw: Re: Re: Spark Streaming prediction

2017-01-02 Thread Daniela S
t; <mmistr...@gmail.com> To: "Daniela S" <daniela_4...@gmx.at> Cc: User <user@spark.apache.org> Subject: Re: Re: Spark Streaming prediction Apologies, perhaps I misunderstood your use case. My assumption was that you have 2-3 hours' worth of data and you want to

Re: Re: Spark Streaming prediction

2017-01-02 Thread Marco Mistroni
want to accumulate 24 hrs worth of data and display it in the dashboard? or is it something else? for dashboard updates, I guess you either - poll 'a database' (where you store the computation of your spark logic) periodically - propagate events from your spark streaming application to your

Aw: Re: Spark Streaming prediction

2017-01-02 Thread Daniela S
, 02 January 2017 at 21:07 From: "Marco Mistroni" <mmistr...@gmail.com> To: "Daniela S" <daniela_4...@gmx.at> Cc: User <user@spark.apache.org> Subject: Re: Spark Streaming prediction Hi, you might want to have a look at the Regression ML algorithm a

Re: Spark Streaming prediction

2017-01-02 Thread Marco Mistroni
somewhere and have your dashboard periodically poll your data store to read the predictions. I have seen ppl on the list doing ML over a Spark streaming app, I'm sure someone can reply back. Hopefully I gave you a starting point. hth marco On 2 Jan 2017 4:03 pm, "Daniela S" <daniel

Spark Streaming prediction

2017-01-02 Thread Daniela S
Hi, I am trying to solve the following problem with Spark Streaming. I receive timestamped events from Kafka. Each event refers to a device and contains values for every minute of the next 2 to 3 hours. What I would like to do is to predict the minute values for the next 24 hours. So I would

24/7 Spark Streaming on YARN in Production

2017-01-01 Thread Bernhard Schäfer
Two weeks ago I have published a blogpost about our experiences running 24/7 Spark Streaming applications on YARN in production: https://www.inovex.de/blog/247-spark-streaming-on-yarn-in-production/ <https://www.inovex.de/blog/247-spark-streaming-on-yarn-in-production/> Amongst

[Spark streaming 1.6.0] Spark streaming with Yarn: executors not fully utilized

2016-12-29 Thread Nishant Kumar
I am running spark streaming with Yarn - *spark-submit --master yarn --deploy-mode cluster --num-executors 2 > --executor-memory 8g --driver-memory 2g --executor-cores 8 ..* > I am consuming Kafka through the DirectStream approach (no receiver). I have 2 topics (each with 3 partition

Re: Spark streaming with Yarn: executors not fully utilized

2016-12-28 Thread Nishant Kumar
Any update on this guys ? On Wed, Dec 28, 2016 at 10:19 AM, Nishant Kumar <nishant.ku...@applift.com> wrote: > I have updated my question: > > http://stackoverflow.com/questions/41345552/spark- > streaming-with-yarn-executors-not-fully-utilized > > On Wed, Dec 28, 2016 a

Re: Spark streaming with Yarn: executors not fully utilized

2016-12-27 Thread Nishant Kumar
I have updated my question: http://stackoverflow.com/questions/41345552/spark-streaming-with-yarn-executors-not-fully-utilized On Wed, Dec 28, 2016 at 9:49 AM, Nishant Kumar <nishant.ku...@applift.com> wrote: > Hi, > > I am running spark streaming with Yarn with - > > *

Spark streaming with Yarn: executors not fully utilized

2016-12-27 Thread Nishant Kumar
Hi, I am running spark streaming with Yarn with - *spark-submit --master yarn --deploy-mode cluster --num-executors 2 --executor-memory 8g --driver-memory 2g --executor-cores 8 ..* I am consuming Kafka through the DirectStream approach (no receiver). I have 2 topics (each with 3 partitions). I
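
With the direct stream, read parallelism equals the number of Kafka partitions, so 2 topics x 3 partitions gives at most 6 concurrent tasks no matter how many of the 16 requested cores (2 executors x 8 cores) sit idle. One common workaround (a sketch, not the poster's code) is to repartition after the read:

    // `stream` is assumed to be the direct stream from KafkaUtils.createDirectStream.
    // repartition() costs a shuffle but lets the downstream processing stage
    // use all 16 cores instead of only the 6 Kafka partitions.
    val widened = stream.repartition(16)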

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-24 Thread Dirceu Semighini Filho
atabricks.com/docs/latest/databricks_guide/index.html#07%20Spark%20Streaming/15%20Streaming%20FAQs.html > There can be only one streaming context in a cluster which implies only > one streaming job. > > So, I am still confused. Anyone having more than 1 spark streaming app in

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-24 Thread shyla deshpande
having more than 1 spark streaming app in a cluster running at the same time, please share your experience. Thanks On Wed, Dec 14, 2016 at 6:54 PM, Akhilesh Pathodia < pathodia.akhil...@gmail.com> wrote: > If you have enough cores/resources, run them separately depending on your &

Re: Can't access the data in Kafka Spark Streaming globally

2016-12-23 Thread Cody Koeninger
This doesn't sound like a question regarding Kafka streaming, it sounds like confusion about the scope of variables in spark generally. Is that right? If so, I'd suggest reading the documentation, starting with a simple rdd (e.g. using sparkContext.parallelize), and experimenting to confirm your
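
The classic experiment from the programming guide's section on closures, which shows why a variable mutated inside an RDD operation is not visible back on the driver:

    // Each executor mutates its own deserialized copy of `counter`;
    // the driver's copy is left untouched (use an Accumulator for this instead).
    var counter = 0
    val rdd = sc.parallelize(1 to 100)
    rdd.foreach(x => counter += x)
    println(counter)  // typically prints 0, not 5050, on a cluster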

Can't access the data in Kafka Spark Streaming globally

2016-12-22 Thread Sree Eedupuganti
I am trying to stream the data from Kafka to Spark. JavaPairInputDStream<String, String> directKafkaStream = KafkaUtils.createDirectStream(ssc, String.class, String.class, StringDecoder.class, StringDecoder.class,

Re: What is the deployment model for Spark Streaming? A specific example.

2016-12-19 Thread Eike von Seggern
't be how apps are deployed in the wild because it >> will never be very reliable, right? But I don't see anything about this in >> the docs, so I am confused. >> >> Note that I use this to run the app, maybe that is the problem? >> >> ssc.start() >> ss

Re: What is the deployment model for Spark Streaming? A specific example.

2016-12-17 Thread Divya Gehlot
be how apps are deployed in the wild because it > will never be very reliable, right? But I don't see anything about this in > the docs, so I am confused. > > Note that I use this to run the app, maybe that is the problem? > > ssc.start() > ssc.awaitTermination() > > >

Re: What is the deployment model for Spark Streaming? A specific example.

2016-12-17 Thread Russell Jurney
n > PID going down. This can't be how apps are deployed in the wild because it > will never be very reliable, right? But I don't see anything about this in > the docs, so I am confused. > > Note that I use this to run the app, maybe that is the problem? > > ssc.start() > ssc.awai

What is the deployment model for Spark Streaming? A specific example.

2016-12-16 Thread Russell Jurney
, maybe that is the problem? ssc.start() ssc.awaitTermination() What is the actual deployment model for Spark Streaming? All I know to do right now is to restart the PID. I'm new to Spark, and the docs don't really explain this (that I can see). Thanks! -- Russell Jurney twitter.com/rjurney
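
The two standard pieces for unattended operation are driver supervision from the cluster manager (e.g. spark-submit --deploy-mode cluster --supervise on standalone, or YARN's application retries) and checkpoint-based recovery of the StreamingContext, so nobody has to babysit a PID. A sketch, with the checkpoint path hypothetical:

    // `conf` is an existing SparkConf (assumption).
    // Rebuild the context from the checkpoint after a restart, or create it fresh.
    def createContext(): StreamingContext = {
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint("hdfs:///tmp/app-checkpoint")  // hypothetical path
      // ... define the DStream graph here ...
      ssc
    }
    val ssc = StreamingContext.getOrCreate("hdfs:///tmp/app-checkpoint", createContext _)
    ssc.start()
    ssc.awaitTermination()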

"remember" vs "window" in Spark Streaming

2016-12-15 Thread Mattz
Hello, Can someone please help me understand the different scenarios when I could use "remember" vs "window" in Spark streaming? Thanks!
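
Roughly: window() defines a sliding computation over the last N seconds of a DStream, while remember() only controls how long the context keeps generated RDDs alive; it computes nothing itself. A sketch of both (`lines` is an assumed existing DStream):

    // window: recompute a count over the last 60s of data, every 10s
    val windowedCounts = lines.window(Seconds(60), Seconds(10)).count()

    // remember: keep the RDDs of the last 5 minutes around, e.g. so
    // out-of-band jobs can still query them after the batch completes
    ssc.remember(Minutes(5))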

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread Akhilesh Pathodia
accommodate, can run as many spark/spark > streaming applications. > > > Thanks, > Divya > > On 15 December 2016 at 08:42, shyla deshpande <deshpandesh...@gmail.com> wrote: > >> How many

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread Divya Gehlot
It depends on the use case ... Spark always depends on resource availability. As long as you have the resources to accommodate them, you can run as many spark/spark streaming applications as you like. Thanks, Divya On 15 December 2016 at 08:42, shyla deshpande <deshpandesh...@gmail.com> wrote: > How m

How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread shyla deshpande
How many Spark streaming applications can be run at a time on a Spark cluster? Is it better to have 1 spark streaming application to consume all the Kafka topics or have multiple streaming applications when possible to keep it simple? Thanks

Re: sbt "org.apache.spark#spark-streaming-kafka_2.11;2.0.0: not found"

2016-12-12 Thread Mattz
I use this in my SBT and it works on 2.0.1: "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.1" On Tue, Dec 13, 2016 at 1:00 PM, Luke Adolph <kenan3...@gmail.com> wrote: > Hi all, > My project uses spark-streaming-kafka module.When I migra

sbt "org.apache.spark#spark-streaming-kafka_2.11;2.0.0: not found"

2016-12-12 Thread Luke Adolph
Hi all, My project uses the spark-streaming-kafka module. When I migrate spark from 1.6.0 to 2.0.0 and rebuild the project, I run into the error below: [warn] module not found: org.apache.spark#spark-streaming-kafka_2.11;2.0.0 [warn] local: tried [warn] /home/linker/.ivy2/local
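
In Spark 2.0 the Kafka integration was split per Kafka version, so the old spark-streaming-kafka artifact no longer exists; depending on the renamed artifact resolves the error:

    // build.sbt -- the artifact was renamed in Spark 2.0.x
    libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.0"
    // or, for the new Kafka consumer API:
    // libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.0.0"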

Re: Spark Streaming with Kafka

2016-12-12 Thread Anton Okolnychyi
7:11 GMT+01:00 Timur Shenkao <t...@timshenkao.su>: > >>> > >>> Hi, > >>> Usual general questions are: > >>> -- what is your Spark version? > >>> -- what is your Kafka version? > >>> -- do you use "standard" Kafka con

Re: Spark Streaming with Kafka

2016-12-12 Thread Cody Koeninger
sion? >>> -- do you use "standard" Kafka consumer or try to implement something >>> custom (your own multi-threaded consumer)? >>> >>> The freshest docs >>> https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html >>> &g

RE: [Spark Streaming] How to join two messages in spark streaming (probably the messages are in different RDDs)?

2016-12-11 Thread Sanchuan Cheng (sancheng)
smime.p7m Description: S/MIME encrypted message

Re: Spark Streaming with Kafka

2016-12-11 Thread Oleksii Dukhno
>> >> The freshest docs https://spark.apache.org/docs/ >> latest/streaming-kafka-0-10-integration.html >> >> AFAIK, yes, you should use unique group id for each stream (KAFKA 0.10 >> !!!) >> >>> kafkaParams.put("group.id", "use_a_separa

Re: Spark Streaming with Kafka

2016-12-11 Thread Anton Okolnychyi
1, 2016 at 5:51 PM, Anton Okolnychyi < > anton.okolnyc...@gmail.com> wrote: > >> Hi, >> >> I am experimenting with Spark Streaming and Kafka. I will appreciate if >> someone can say whether the following assumption is correct. >> >&

Re: Spark Streaming with Kafka

2016-12-11 Thread Timur Shenkao
kafka-0-10-integration.html AFAIK, yes, you should use unique group id for each stream (KAFKA 0.10 !!!) > kafkaParams.put("group.id", "use_a_separate_group_id_for_each_stream"); > > On Sun, Dec 11, 2016 at 5:51 PM, Anton Okolnychyi < anton.okolnyc...@gmail.com> wrote: >

Spark Streaming with Kafka

2016-12-11 Thread Anton Okolnychyi
Hi, I am experimenting with Spark Streaming and Kafka. I will appreciate if someone can say whether the following assumption is correct. If I have multiple computations (each with its own output) on one stream (created as KafkaUtils.createDirectStream), then there is a chance to have
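
The assumption is correct: each output operation launches its own job, and without caching each job re-reads the same offset ranges from Kafka. Persisting the stream avoids the duplicate reads. A sketch; `stream` is assumed to be the DStream[(String, String)] from createDirectStream:

    stream.cache()                                 // each batch is read from Kafka once
    stream.filter(_._2.contains("error")).print()  // output #1: its own job
    stream.map(_._2.length).print()                // output #2: reuses the cached blocks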

Re: Writing data from Spark streaming to AWS Redshift?

2016-12-11 Thread kant kodali
ns as stated in the blog a shot, but changing mode to append. > > On Sat, Dec 10, 2016 at 8:25 AM, shyla deshpande <deshpandesh...@gmail.com > > wrote: > >> Hello all, >> >> Is it possible to Write data from Spark streaming to AWS Redshift? >> >> I came acros

Re: Writing data from Spark streaming to AWS Redshift?

2016-12-09 Thread ayan guha
Ideally, saving data to external sources should not be any different. Give the write options as stated in the blog a shot, but change the mode to append. On Sat, Dec 10, 2016 at 8:25 AM, shyla deshpande <deshpandesh...@gmail.com> wrote: > Hello all, > > Is it possible to Write

Writing data from Spark streaming to AWS Redshift?

2016-12-09 Thread shyla deshpande
Hello all, Is it possible to write data from Spark streaming to AWS Redshift? I came across the following article, so it looks like it works from a Spark batch program. https://databricks.com/blog/2015/10/19/introducing-redshift-data-source-for-spark.html I want to write to AWS Redshift from
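
Following the suggestion downthread (the batch writer from the linked blog, applied per micro-batch with the mode switched to append), a hedged sketch in which the JDBC URL, table, and S3 tempdir are placeholders:

    // `events` is assumed to be a DStream of some case class; toDF() needs
    // `import spark.implicits._` in scope.
    events.foreachRDD { rdd =>
      rdd.toDF().write
        .format("com.databricks.spark.redshift")
        .option("url", "jdbc:redshift://host:5439/db?user=u&password=p") // placeholder
        .option("dbtable", "events")                                     // placeholder
        .option("tempdir", "s3n://bucket/tmp")                           // placeholder
        .mode("append")  // append rather than overwrite for a stream
        .save()
    }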

Re: Not per-key state in spark streaming

2016-12-08 Thread Anty Rao
te: > >> >> >> On Wed, Dec 7, 2016 at 7:42 PM, Anty Rao <ant@gmail.com> wrote: >> >>> Hi >>> I'm new to Spark. I'm doing some research to see if spark streaming can >>> solve my problem. I don't want to keep per-key state,b/c my data

Re: Not per-key state in spark streaming

2016-12-08 Thread Daniel Haviv
There's no need to extend Spark's API, look at mapWithState for examples. On Thu, Dec 8, 2016 at 4:49 AM, Anty Rao <ant@gmail.com> wrote: > > > On Wed, Dec 7, 2016 at 7:42 PM, Anty Rao <ant@gmail.com> wrote: > >> Hi >> I'm new to Spark. I'm doing som
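
A minimal mapWithState sketch of the kind referred to above (a running count per key; a bloom-filter-backed variant would swap the Long state for the filter):

    import org.apache.spark.streaming.{State, StateSpec}

    // `pairs` is assumed to be a DStream[(String, Int)];
    // ssc.checkpoint(...) must be set for stateful operations.
    val spec = StateSpec.function(
      (key: String, value: Option[Int], state: State[Long]) => {
        val sum = state.getOption.getOrElse(0L) + value.getOrElse(0)
        state.update(sum)
        (key, sum)  // emitted downstream
      })
    val counts = pairs.mapWithState(spec)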

Re: Not per-key state in spark streaming

2016-12-07 Thread Anty Rao
On Wed, Dec 7, 2016 at 7:42 PM, Anty Rao <ant@gmail.com> wrote: > Hi > I'm new to Spark. I'm doing some research to see if spark streaming can > solve my problem. I don't want to keep per-key state,b/c my data set is > very huge and keep a little longer time, it not viable t

Re: Not per-key state in spark streaming

2016-12-07 Thread Anty Rao
e HDFS or HBASE. > > Daniel > > On Wed, Dec 7, 2016 at 1:42 PM, Anty Rao <ant@gmail.com> wrote: > >> Hi >> I'm new to Spark. I'm doing some research to see if spark streaming can >> solve my problem. I don't want to keep per-key state,b/c my data set is >

Re: Spark streaming completed batches statistics

2016-12-07 Thread map reduced
urce. Thanks. > > > https://richardstartin.com/ > > > -- > *From:* map reduced <k3t.gi...@gmail.com> > *Sent:* 07 December 2016 19:49 > *To:* Richard Startin > *Cc:* user@spark.apache.org > *Subject:* Re: Spark streaming com

Re: Spark streaming completed batches statistics

2016-12-07 Thread Richard Startin
; Sent: 05 December 2016 15:55 To: user@spark.apache.org Subject: Spark streaming completed batches statistics Is there any way to get a more computer friendly version of the completed batches section of the streaming page of the application master? I am very

Re: Spark streaming completed batches statistics

2016-12-07 Thread map reduced
mpletedBatches.png > > > https://richardstartin.com/ > > > -- > *From:* Richard Startin <richardstar...@outlook.com> > *Sent:* 05 December 2016 15:55 > *To:* user@spark.apache.org > *Subject:* Spark streaming completed batches statistics > &

Re: Spark streaming completed batches statistics

2016-12-07 Thread Richard Startin
nt: 05 December 2016 15:55 To: user@spark.apache.org Subject: Spark streaming completed batches statistics Is there any way to get a more computer friendly version of the completed batches section of the streaming page of the application master? I am very interested in the statistics and am cur

Re: Not per-key state in spark streaming

2016-12-07 Thread Daniel Haviv
see if spark streaming can > solve my problem. I don't want to keep per-key state,b/c my data set is > very huge and keep a little longer time, it not viable to keep all per key > state in memory.Instead, i want to have a bloom filter based state. Does it > possible to achieve this in Spark

Not per-key state in spark streaming

2016-12-07 Thread Anty Rao
Hi I'm new to Spark. I'm doing some research to see if spark streaming can solve my problem. I don't want to keep per-key state, b/c my data set is very huge and is kept a little longer; it is not viable to keep all per-key state in memory. Instead, I want to have a bloom filter based state. Does

Re: [Spark Streaming] How to join two messages in spark streaming (probably the messages are in different RDDs)?

2016-12-06 Thread Tathagata Das
with timestamp, etc. Then when the second message of the same key arrives, Spark Streaming will ensure that it calls your state update function with old state (i.e. first message filled up) and you can take the time difference. Check out my blog - https://databricks.com/blog/2016/02/01/faster-stateful

Re: [Spark Streaming] How to join two messages in spark streaming (probably the messages are in different RDDs)?

2016-12-06 Thread sancheng
any valuable feedback is appreciated! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-How-to-do-join-two-messages-in-spark-streaming-Probabaly-messasges-are-in-differnet--tp28161p28163.html Sent from the Apache Spark User List mailing list

Re: Spark Streaming - join streaming and static data

2016-12-06 Thread Cody Koeninger
rak > > On Tue, Dec 6, 2016 at 2:16 AM, Daniela S <daniela_4...@gmx.at> wrote: >> >> Hi >> >> I have some questions regarding Spark Streaming. >> >> I receive a stream of JSON messages from Kafka. >> The messages consist of a ti

Re: Spark Streaming - join streaming and static data

2016-12-06 Thread Burak Yavuz
called "from_json", which should also help you easily parse your messages incoming from Kafka. Best, Burak On Tue, Dec 6, 2016 at 2:16 AM, Daniela S <daniela_4...@gmx.at> wrote: > Hi > > I have some questions regarding Spark Streaming. > > I receive a str

Spark Streaming - join streaming and static data

2016-12-06 Thread Daniela S
Hi,

I have some questions regarding Spark Streaming.

I receive a stream of JSON messages from Kafka. The messages consist of a timestamp and an ID.

    timestamp           ID
    2016-12-06 13:00    1
    2016-12-06 13:40    5
    ...

In a database I have values for each ID:

    ID
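
For a small, slowly-changing lookup table, the simplest join is to broadcast the per-ID values and map over the stream. A sketch; loading from the actual database is elided and the sample values are hypothetical:

    // idValues: the static per-ID values, loaded once from the database (assumption)
    val idValues: Map[Int, Int] = Map(1 -> 300, 5 -> 240)  // hypothetical sample
    val bcast = ssc.sparkContext.broadcast(idValues)       // ssc: the StreamingContext

    // events: DStream[(Int, String)] of (ID, timestamp) parsed from the Kafka JSON
    val joined = events.map { case (id, ts) =>
      (id, ts, bcast.value.getOrElse(id, 0))
    }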

RE: [Spark Streaming] How to join two messages in spark streaming (probably the messages are in different RDDs)?

2016-12-05 Thread Sanchuan Cheng (sancheng)
smime.p7m Description: S/MIME encrypted message

[Spark Streaming] How to join two messages in spark streaming (probably the messages are in different RDDs)?

2016-12-05 Thread sancheng
Hello, we are trying to use Spark streaming for a billing related application. Our case is that we need to correlate two different messages and calculate the time interval between them. The two messages should be in the same partition but probably not in the same RDD, it seems

Spark streaming completed batches statistics

2016-12-05 Thread Richard Startin
Is there any way to get a more computer friendly version of the completed batches section of the streaming page of the application master? I am very interested in the statistics and am currently screen-scraping... https://richardstartin.com
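
Short of screen-scraping, the same numbers are exposed programmatically through a StreamingListener. A sketch that emits one CSV row per completed batch:

    import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

    // ssc: the running StreamingContext (assumption)
    ssc.addStreamingListener(new StreamingListener {
      override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit = {
        val i = batch.batchInfo
        // batch time, record count, scheduling delay, processing delay
        println(s"${i.batchTime},${i.numRecords},${i.schedulingDelay.getOrElse(-1L)},${i.processingDelay.getOrElse(-1L)}")
      }
    })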

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Jacek Laskowski
> > From: Mich Talebzadeh <mich.talebza...@gmail.com> > Date: Friday, December 2, 2016 at 12:26 PM > To: Gabriel Perez <gabr...@adtheorent.com> > Cc: Jacek Laskowski <ja...@japila.pl>, user <user@spark.apache.org> > > > Subject: Re: Kafka 0.10 & Spark

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Gabriel Perez
:26 PM To: Gabriel Perez <gabr...@adtheorent.com> Cc: Jacek Laskowski <ja...@japila.pl>, user <user@spark.apache.org> Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2 in this POC of yours are you running this app with spark in Local mode by any chance? Dr Mich Ta

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Mich Talebzadeh
rez <gabr...@adtheorent.com> > *Cc: *user <user@spark.apache.org> > *Subject: *Re: Kafka 0.10 & Spark Streaming 2.0.2 > > > > Hi, > > > > How many partitions does the topic have? How do you check how many > executors read from the topic? > > &

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Gabriel Perez
Laskowski <ja...@japila.pl> Date: Friday, December 2, 2016 at 12:21 PM To: Gabriel Perez <gabr...@adtheorent.com> Cc: user <user@spark.apache.org> Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2 Hi, Can you post the screenshot of the Executors and Streaming tabs? Jacek

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Jacek Laskowski
> *To: *Gabriel Perez <gabr...@adtheorent.com> > *Cc: *user <user@spark.apache.org> > *Subject: *Re: Kafka 0.10 & Spark Streaming 2.0.2 > > > > Hi, > > > > How many partitions does the topic have? How do you check how many > executors read from

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Gabriel Perez
Friday, December 2, 2016 at 11:47 AM To: Gabriel Perez <gabr...@adtheorent.com> Cc: user <user@spark.apache.org> Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2 Hi, How many partitions does the topic have? How do you check how many executors read from the topic? Jacek

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Jacek Laskowski
@Override public void call( JavaRDD<ConsumerRecord<String, String>> rdd ) { OffsetRange[] offsetRanges = ( (HasOffsetRanges) rdd.rdd() ).offsetRanges(); // some time later, after outputs have compl

Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread gabrielperez2484
OffsetRanges) rdd.rdd() ).offsetRanges(); // some time later, after outputs have completed ( (CanCommitOffsets) stream.inputDStream() ).commitAsync( offsetRanges ); } } ); -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kafka-0-10-Spark-Streaming-2-0-2-tp28153.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
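
The Scala equivalent of the Java snippet above, following the pattern in the kafka-0-10 integration guide (commit only after the batch's outputs have completed):

    import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

    // `stream` is the direct stream returned by KafkaUtils.createDirectStream
    stream.foreachRDD { rdd =>
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      // ... process the batch and write results out here ...
      stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
    }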

Re: Spark 2.0.2, using DStreams in Spark Streaming. How do I create SQLContext? Please help

2016-12-01 Thread shyla deshpande
Used SparkSession, Works now. Thanks. On Wed, Nov 30, 2016 at 11:02 PM, Deepak Sharma wrote: > In Spark > 2.0 , spark session was introduced that you can use to query > hive as well. > Just make sure you create spark session with enableHiveSupport() option. > > Thanks >

Re: Spark 2.0.2, using DStreams in Spark Streaming. How do I create SQLContext? Please help

2016-11-30 Thread Deepak Sharma
In Spark > 2.0, SparkSession was introduced, which you can use to query hive as well. Just make sure you create the spark session with the enableHiveSupport() option. Thanks Deepak On Thu, Dec 1, 2016 at 12:27 PM, shyla deshpande wrote: > I am on Spark 2.0.2, using DStreams
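
A minimal sketch of that (the app name is hypothetical):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("dstreams-with-cassandra-sink")  // hypothetical name
      .enableHiveSupport()                      // only if Hive access is needed
      .getOrCreate()
    // spark.sqlContext is still available wherever an SQLContext is required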

Spark 2.0.2, using DStreams in Spark Streaming. How do I create SQLContext? Please help

2016-11-30 Thread shyla deshpande
I am on Spark 2.0.2, using DStreams because I need the Cassandra Sink. How do I create SQLContext? I get the error that SQLContext is deprecated. Thanks

Re: Do I have to wrap akka around spark streaming app?

2016-11-29 Thread shyla deshpande
tions > http://allegro.tech/2015/08/spark-kafka-integration.html > > > 2016-11-29 2:18 GMT+01:00 shyla deshpande <deshpandesh...@gmail.com>: > >> Hello All, >> >> I just want to make sure this is a right use case for Kafka --> Spark >> Streaming >

Re: Do I have to wrap akka around spark streaming app?

2016-11-29 Thread vincent gromakowski
nt to make sure this is a right use case for Kafka --> Spark > Streaming > > Few words about my use case : > > When the user watches a video, I get the position events from the user > that indicates how much they have completed viewing and at a certain point, > I mark that Video as co

Re: Spark Streaming + Kinesis : Receiver MaxRate is violated

2016-11-29 Thread dav009
Possibly a bug, please check: https://issues.apache.org/jira/browse/SPARK-18620 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Kinesis-Receiver-MaxRate-is-violated-tp28141p28144.html Sent from the Apache Spark User List mailing list archive

Spark Streaming + Kinesis : Receiver MaxRate is violated

2016-11-28 Thread dav009
this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Kinesis-Receiver-MaxRate-is-violated-tp28141.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Do I have to wrap akka around spark streaming app?

2016-11-28 Thread shyla deshpande
Hello All, I just want to make sure this is a right use case for Kafka --> Spark Streaming Few words about my use case : When the user watches a video, I get the position events from the user that indicates how much they have completed viewing and at a certain point, I mark that Vi

Re: Do I have to wrap akka around spark streaming app?

2016-11-28 Thread shyla deshpande
u suggesting I write the completed events to kafka(different topic) >> and the akka consumer could read from this? There could be many completed >> events from different users in this topic. So the akka consumer should >> pretty much do what a spark streaming does to pr

Re: Do I have to wrap akka around spark streaming app?

2016-11-28 Thread vincent gromakowski
ifferent topic) > and the akka consumer could read from this? There could be many completed > events from different users in this topic. So the akka consumer should > pretty much do what a spark streaming does to process this without the > knowledge of the kafka offset. > >

Re: Do I have to wrap akka around spark streaming app?

2016-11-28 Thread shyla deshpande
should pretty much do what a spark streaming does to process this without the knowledge of the kafka offset. So not sure what you mean by kafka offsets will do the job, how will the akka consumer know the kafka offset? On Mon, Nov 28, 2016 at 12:52 PM, vincent gromakowski < vincent.groma

Re: Do I have to wrap akka around spark streaming app?

2016-11-28 Thread vincent gromakowski
hpande <deshpandesh...@gmail.com>: > Thanks Daniel for the response. > > I am planning to use Spark streaming to do Event Processing. I will have > akka actors sending messages to kafka. I process them using Spark streaming > and as a result a new events will be generated. How d

Re: Do I have to wrap akka around spark streaming app?

2016-11-28 Thread shyla deshpande
Thanks Daniel for the response. I am planning to use Spark streaming to do event processing. I will have akka actors sending messages to kafka. I process them using Spark streaming and as a result new events will be generated. How do I notify the akka actor (message producer) that a new event
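
The suggestion downthread amounts to publishing the derived events to a second topic that the akka side consumes; inside Spark that is a foreachPartition with a plain Kafka producer. A sketch in which the topic and broker are placeholders and `results` is assumed to be a DStream[String]:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    results.foreachRDD { rdd =>
      rdd.foreachPartition { events =>
        // one producer per partition per batch; pool it in a real job
        val props = new Properties()
        props.put("bootstrap.servers", "broker:9092")  // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        val producer = new KafkaProducer[String, String](props)
        events.foreach(e => producer.send(new ProducerRecord[String, String]("completed-events", e)))
        producer.close()
      }
    }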

Re: Do I have to wrap akka around spark streaming app?

2016-11-28 Thread Daniel van der Ende
AM, shyla deshpande <deshpandesh...@gmail.com> wrote: > My data pipeline is Kafka --> Spark Streaming --> Cassandra. > > Can someone please explain me when would I need to wrap akka around the > spark streaming app. My knowledge of akka and the actor system is poor. &

Re: Do I have to wrap akka around spark streaming app?

2016-11-28 Thread shyla deshpande
Anyone with experience of spark streaming in production, appreciate your input. Thanks -shyla On Mon, Nov 28, 2016 at 12:11 AM, shyla deshpande <deshpandesh...@gmail.com> wrote: > My data pipeline is Kafka --> Spark Streaming --> Cassandra. > > Can someone please explain

Do I have to wrap akka around spark streaming app?

2016-11-28 Thread shyla deshpande
My data pipeline is Kafka --> Spark Streaming --> Cassandra. Can someone please explain to me when I would need to wrap akka around the spark streaming app. My knowledge of akka and the actor system is poor. Please help! Thanks

Re: getting error on spark streaming : java.lang.OutOfMemoryError: unable to create new native thread

2016-11-22 Thread Shixiong(Ryan) Zhu
Possibly https://issues.apache.org/jira/browse/SPARK-17396 On Tue, Nov 22, 2016 at 1:42 PM, Mohit Durgapal <durgapalmo...@gmail.com> wrote: > Hi Everyone, > > > I am getting the following error while running a spark streaming example > on my local machine, the being i

getting error on spark streaming : java.lang.OutOfMemoryError: unable to create new native thread

2016-11-22 Thread Mohit Durgapal
Hi Everyone, I am getting the following error while running a spark streaming example on my local machine; the data being ingested is only 506kb. 16/11/23 03:05:54 INFO MappedDStream: Slicing from 1479850537180 ms to 1479850537235 ms (aligned to 1479850537180 ms and 1479850537235 ms) Exception

[Spark Streaming] map and window operation on DStream only process one batch

2016-11-22 Thread Hao Ren
Spark Streaming v1.6.2, Kafka v0.10.1. I am reading msgs from Kafka. What surprised me is that the following DStream processes only the first batch. KafkaUtils.createDirectStream[ String, String, StringDecoder, StringDecoder](streamingContext, kafkaParams, Set(topic)) .map(_._2) .window

Re: spark streaming with kinesis

2016-11-20 Thread Takeshi Yamamuro
16 at 1:59 PM, Shushant Arora <shushantaror...@gmail.com> wrote: > Hi > > Thanks. > Have a doubt on spark streaming kinesis consumer. Say I have a batch time > of 500 ms and kinesis stream is partitioned on userid(uniformly > distributed).But since IdleTimeBetweenReadsIn

Re: spark streaming with kinesis

2016-11-20 Thread Shushant Arora
Hi, Thanks. I have a doubt about the spark streaming kinesis consumer. Say I have a batch time of 500 ms and the kinesis stream is partitioned on userid (uniformly distributed). But since IdleTimeBetweenReadsInMillis is set to 1000ms, Spark receiver nodes will fetch the data at an interval of 1 second and store

Re: How do I access the nested field in a dataframe, spark Streaming app... Please help.

2016-11-20 Thread shyla deshpande
>> root >>> |-- name: string (nullable = true) >>> |-- addresses: array (nullable = true) >>> ||-- element: struct (containsNull = true) >>> |||-- street: string (nullable = true) >>> |||-- city: string (nullable = t

Re: How do I access the nested field in a dataframe, spark Streaming app... Please help.

2016-11-20 Thread Jon Gregg
>> |||-- street: string (nullable = true) >> |||-- city: string (nullable = true) >> >> I want to output name and city. The following is my spark streaming app >> which outputs name and addresses, but I want name and cities in the out

Re: How do I access the nested field in a dataframe, spark Streaming app... Please help.

2016-11-20 Thread pandees waran
ng (nullable = true) > |-- addresses: array (nullable = true) > ||-- element: struct (containsNull = true) > |||-- street: string (nullable = true) > |||-- city: string (nullable = true) > > I want to output name and city. The following is my

How do I access the nested field in a dataframe, spark Streaming app... Please help.

2016-11-20 Thread shyla deshpande
to output name and city. The following is my spark streaming app, which outputs name and addresses, but I want name and cities in the output.

    object PersonConsumer {
      import org.apache.spark.sql.{SQLContext, SparkSession}
      import com.example.protos.demo._

      def main(args : Array[String
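
With the schema above, city sits inside an array of structs, so the usual approach is explode plus a dotted path (a sketch; `personDF` stands in for the poster's dataframe):

    import org.apache.spark.sql.functions.{col, explode}

    // one output row per (name, address) pair, keeping only the city field
    val cities = personDF
      .select(col("name"), explode(col("addresses")).as("addr"))
      .select(col("name"), col("addr.city"))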

Successful streaming with ibm/mq to flume then to kafka and finally spark streaming

2016-11-18 Thread Mich Talebzadeh
hi, can someone share their experience of feeding data from ibm/mq messages into flume, then from flume to kafka and using spark streaming on it? any issues and things to be aware of? thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id

analysing ibm mq messages using spark streaming

2016-11-17 Thread Mich Talebzadeh
hi, I guess the only way to do this is to read ibm mq messages into flume, ingest it into hdfs and read it from there. alternatively use flume to ingest data into hbase and then use spark on hbase. I don't think there is an api like spark streaming with kafka for ibm mq? thanks Dr Mich

Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-17 Thread Dirceu Semighini Filho
il.com> > *Sent:* Thursday, November 17, 2016 6:50:28 AM > *To:* Arijit > *Cc:* Tathagata Das; user@spark.apache.org > > *Subject:* Re: Spark Streaming Data loss on failure to write > BlockAdditionEvent failure to WAL > > Hi Arijit, > Have you find a solution for t

Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-17 Thread Arijit
_ From: Dirceu Semighini Filho <dirceu.semigh...@gmail.com> Sent: Thursday, November 17, 2016 6:50:28 AM To: Arijit Cc: Tathagata Das; user@spark.apache.org Subject: Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL Hi Arijit, Ha

Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-17 Thread Dirceu Semighini Filho
> > Thanks again, Arijit > -- > *From:* Tathagata Das <tathagata.das1...@gmail.com> > *Sent:* Monday, November 7, 2016 7:59:06 PM > *To:* Arijit > *Cc:* user@spark.apache.org > *Subject:* Re: Spark Streaming Data loss on failure to write > BlockAdd

Re: Need guidelines in Spark Streaming and Kafka integration

2016-11-16 Thread Karim, Md. Rezaul
Hi Tariq and Jon, At first, thanks for the quick response. I really appreciate that. Well, I would like to start from the very beginning of using Kafka with Spark. For example, in the Spark distribution, I found an example using Kafka with Spark streaming that demonstrates a Direct Kafka Word Count
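
The bundled example referred to is DirectKafkaWordCount; its core is just the following (a sketch of the 0.8 direct API; broker and topic are placeholders):

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    // ssc: an existing StreamingContext (assumption)
    val kafkaParams = Map("metadata.broker.list" -> "broker:9092")  // placeholder
    val lines = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("topic1"))
    lines.map(_._2).flatMap(_.split(" "))
      .map((_, 1)).reduceByKey(_ + _)
      .print()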
