Hello,
I want to add that I don't even see the Streaming tab in the application UI
on port 4040 when I run it on the cluster.
The cluster on EC2 has 1 master node and 1 worker node.
The cores used on the worker node are 2 of 2 and the memory used is 6GB of 6.3GB.
Can I run a spark streaming job on this cluster?
Hello,
My spark streaming app, which reads kafka topics and prints the DStream, works
fine on my laptop, but on the AWS cluster it produces no output and no errors.
Please help me debug.
I am using Spark 2.0.2 and kafka-0-10
Thanks
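For comparison, a minimal direct-stream skeleton for Spark 2.0.2 with the kafka-0-10 integration (the broker address, topic and group id below are placeholders). One thing worth knowing: print() writes to the driver's stdout, so with --deploy-mode cluster the output lands in the driver's log on the cluster, not in your console, which can look exactly like "no output and no errors":

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010._

    val conf = new SparkConf().setAppName("kafka-debug")
    val ssc = new StreamingContext(conf, Seconds(5))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker:9092",          // placeholder; must be reachable from executors
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "debug-group",
      "auto.offset.reset" -> "earliest")             // see old data while debugging

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("mytopic"), kafkaParams))

    stream.map(_.value).print()                      // goes to the driver's stdout
    ssc.start()
    ssc.awaitTermination()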
The following is the output of the spark streaming app... 17/01/14
Spark: 1.6.1
I am trying to use the new mapWithState API and I am getting the following
error:
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/spark/streaming/StateSpec$
Caused by: java.lang.ClassNotFoundException:
org.apache.spark.streaming.StateSpec$
Thanks
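This error usually means spark-streaming is missing from the runtime classpath or its version does not match the installed Spark. A minimal build.sbt sketch (the Scala and Spark versions here are assumptions; match them to your cluster):

    scalaVersion := "2.10.5"

    libraryDependencies ++= Seq(
      // versions must match the Spark installed on the cluster
      "org.apache.spark" %% "spark-core"      % "1.6.1" % "provided",
      // mapWithState/StateSpec live in spark-streaming; mark it "provided"
      // only if spark-submit supplies it at runtime
      "org.apache.spark" %% "spark-streaming" % "1.6.1" % "provided"
    )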
Hi,
I am running a spark streaming job using spark jobserver via this image:
https://hub.docker.com/r/depend/spark-jobserver/.
It works well in standalone mode (on mesos the job does not make progress). The spark jobserver that supports Spark 2.0 has a new API that is only suitable for non-streaming jobs.
I am looking for a docker image that I can use from docker hub for running a
spark streaming app with scala and spark 2.0+.
I am new to docker and unable to find an image on docker hub that suits
my needs. Please let me know if anyone is using docker for a spark
streaming app, and share your experience.
My kafka is in a docker container.
How do I read this Kafka data in my Spark streaming app?
Also, I need to write data from Spark Streaming to a Cassandra database which
is in a docker container.
I appreciate any help.
Thanks.
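A hedged sketch of the whole path, reading the dockerized Kafka with the 0-10 direct stream and saving to Cassandra with the spark-cassandra-connector; all hostnames, keyspace and table names are placeholders. The main trap with containers is addressing: bootstrap.servers and spark.cassandra.connection.host must be reachable from the Spark executors, not just from the docker host:

    import com.datastax.spark.connector._            // SomeColumns
    import com.datastax.spark.connector.streaming._  // saveToCassandra on DStreams
    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010._

    val conf = new SparkConf()
      .setAppName("kafka-to-cassandra")
      // address of the Cassandra container as seen from the executors
      .set("spark.cassandra.connection.host", "cassandra-host")
    val ssc = new StreamingContext(conf, Seconds(5))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "kafka-host:9092",      // the port published by the Kafka container
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "demo")

    KafkaUtils.createDirectStream[String, String](
        ssc, LocationStrategies.PreferConsistent,
        ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams))
      .map(r => (r.key, r.value))
      .saveToCassandra("demo_ks", "events", SomeColumns("key", "value"))

    ssc.start()
    ssc.awaitTermination()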
From: "Marco Mistroni" <mmistr...@gmail.com>
To: "Daniela S" <daniela_4...@gmx.at>
Cc: User <user@spark.apache.org>
Subject: Re: Re: Spark Streaming prediction
Apologies, perhaps i misunderstood your usecase.
My assumption was that you have 2-3 hours' worth of data and you want to
accumulate data worth 24 hrs and display it in the
dashboard?
or is it something else?
for the dashboard update, i guess you either
- poll 'a database' (where you store the computation of your spark logic)
periodically
- propagate events from your spark streaming application to your dashboard
Sent: Monday, 02 January 2017 at 21:07
From: "Marco Mistroni" <mmistr...@gmail.com>
To: "Daniela S" <daniela_4...@gmx.at>
Cc: User <user@spark.apache.org>
Subject: Re: Spark Streaming prediction
Hi
you might want to have a look at the Regression ML algorithms, store the
predictions somewhere, and have your dashboard periodically poll your
data store to read the predictions
I have seen people on the list doing ML over a Spark streaming app, I'm sure
someone can reply back
Hopefully i gave u a starting point
hth
marco
On 2 Jan 2017 4:03 pm, "Daniela S" <daniela_4...@gmx.at> wrote:
Hi
I am trying to solve the following problem with Spark Streaming.
I receive timestamped events from Kafka. Each event refers to a device and contains values for every minute of the next 2 to 3 hours. What I would like to do is to predict the values for the next 24 hours (one value per minute).
Two weeks ago I published a blog post about our experiences running 24/7
Spark Streaming applications on YARN in production:
https://www.inovex.de/blog/247-spark-streaming-on-yarn-in-production/
Any update on this, guys?
On Wed, Dec 28, 2016 at 10:19 AM, Nishant Kumar <nishant.ku...@applift.com>
wrote:
I have updated my question:
http://stackoverflow.com/questions/41345552/spark-streaming-with-yarn-executors-not-fully-utilized
On Wed, Dec 28, 2016 at 9:49 AM, Nishant Kumar <nishant.ku...@applift.com>
wrote:
Hi,
I am running spark streaming with Yarn with -
spark-submit --master yarn --deploy-mode cluster --num-executors 2
--executor-memory 8g --driver-memory 2g --executor-cores 8 ..
I am consuming Kafka through the DirectStream approach (no receiver). I have 2
topics (each with 3 partitions).
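One hedged note on the utilization question: with the direct stream there is a one-to-one mapping between Kafka partitions and Spark partitions, so 2 topics x 3 partitions yield only 6 tasks per batch, while the submit command above asks for 2 executors x 8 cores. If the per-record work is heavy, repartitioning trades a shuffle for more parallelism, for example:

    // fan the 6 Kafka-aligned partitions out to all 16 cores (illustrative)
    val widened = stream.repartition(16)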
> https://databricks.com/docs/latest/databricks_guide/index.html#07%20Spark%20Streaming/15%20Streaming%20FAQs.html
>
> There can be only one streaming context in a cluster which implies only
> one streaming job.
So, I am still confused. Anyone having more than 1 spark streaming app in a
cluster running at the same time, please share your experience.
Thanks
On Wed, Dec 14, 2016 at 6:54 PM, Akhilesh Pathodia <
pathodia.akhil...@gmail.com> wrote:
> If you have enough cores/resources, run them separately depending on your use case.
This doesn't sound like a question regarding Kafka streaming; it
sounds like confusion about the scope of variables in spark generally.
Is that right? If so, I'd suggest reading the documentation, starting
with a simple rdd (e.g. using sparkContext.parallelize), and
experimenting to confirm your understanding.
I am trying to stream the data from Kafka to Spark.
JavaPairInputDStream<String, String> directKafkaStream =
    KafkaUtils.createDirectStream(ssc,
        String.class,
        String.class,
        StringDecoder.class,
        StringDecoder.class,
        kafkaParams,   // Map<String, String> with the broker list etc.
        topics);       // Set<String> of topic names
Note that I use this to run the app, maybe that is the problem?
ssc.start()
ssc.awaitTermination()
What is the actual deployment model for Spark Streaming? All I know to do
right now is to restart the PID. I'm new to Spark, and the docs don't
really explain this (that I can see).
Thanks!
--
Russell Jurney twitter.com/rjurney
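Not from the thread, but the usual answer: a streaming driver is a long-running process, so blocking in awaitTermination() is expected, and restarts are delegated to the cluster manager. A sketch for the standalone manager (the host, class and jar names are placeholders); on YARN, cluster mode plus the application-attempt retry setting plays the same role, and checkpointing is what makes such restarts safe:

    spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode cluster \
      --supervise \
      --class com.example.StreamingApp \
      streaming-app.jar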
Hello,
Can someone please help me understand the different scenarios when I could
use "remember" vs "window" in Spark streaming?
Thanks!
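A hedged sketch of the difference (interval values are illustrative): window() changes what each batch computes over, while remember() only controls how long Spark keeps generated RDDs around, e.g. for jobs or interactive queries that need RDDs older than the current batch:

    import org.apache.spark.streaming.Minutes

    // window: every 2 minutes, compute over the last 10 minutes of data
    val windowed = stream.window(Minutes(10), Minutes(2))

    // remember: retain generated RDDs for at least 15 minutes
    ssc.remember(Minutes(15))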
It depends on the use case ...
Spark always depends on resource availability.
As long as you have resources to accommodate them, you can run as many spark/spark
streaming applications.
Thanks,
Divya
On 15 December 2016 at 08:42, shyla deshpande <deshpandesh...@gmail.com>
wrote:
How many Spark streaming applications can be run at a time on a Spark
cluster?
Is it better to have 1 spark streaming application to consume all the Kafka
topics or have multiple streaming applications when possible to keep it
simple?
Thanks
I use this in my SBT and it works on 2.0.1:
"org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.1"
On Tue, Dec 13, 2016 at 1:00 PM, Luke Adolph <kenan3...@gmail.com> wrote:
Hi all,
My project uses the spark-streaming-kafka module. When I migrated spark from
1.6.0 to 2.0.0 and rebuilt the project, I ran into the below error:
[warn] module not found: org.apache.spark#spark-streaming-kafka_2.11;2.0.0
[warn] local: tried
[warn]  /home/linker/.ivy2/local
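The reply above works because the artifact was split in Spark 2.0, so the old spark-streaming-kafka coordinate no longer resolves. A build.sbt sketch of the two options:

    // for the 0.8 consumer API (works with 0.8.2+ brokers):
    libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.0"
    // or for the new 0.10 consumer API:
    // libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.0.0"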
7:11 GMT+01:00 Timur Shenkao <t...@timshenkao.su>:
> Hi,
> Usual general questions are:
> -- what is your Spark version?
> -- what is your Kafka version?
> -- do you use "standard" Kafka consumer or try to implement something
> custom (your own multi-threaded consumer)?
>
> The freshest docs
> https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html
AFAIK, yes, you should use unique group id for each stream (KAFKA 0.10 !!!)
kafkaParams.put("group.id", "use_a_separate_group_id_for_each_stream");
On Sun, Dec 11, 2016 at 5:51 PM, Anton Okolnychyi <
anton.okolnyc...@gmail.com> wrote:
Hi,
I am experimenting with Spark Streaming and Kafka. I will appreciate if
someone can say whether the following assumption is correct.
If I have multiple computations (each with its own output) on one stream
(created as KafkaUtils.createDirectStream), then there is a chance to have
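The message is cut off here, but for the scenario it describes, several computations each with its own output on one direct stream, a common pattern is to create the stream once and cache the shared transformation so each batch is fetched from Kafka only once. A sketch, where stream stands for the DStream returned by KafkaUtils.createDirectStream:

    val values = stream.map(_.value)
    values.cache()   // without this, each output action re-reads the batch from Kafka

    values.filter(_.contains("error")).print()   // computation 1
    values.count().print()                       // computation 2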
Ideally, saving data to external sources should not be any different. Give
the write options as stated in the blog a shot, but change the mode to append.
On Sat, Dec 10, 2016 at 8:25 AM, shyla deshpande <deshpandesh...@gmail.com>
wrote:
Hello all,
Is it possible to write data from Spark streaming to AWS Redshift?
I came across the following article, so it looks like it works from a Spark
batch program.
https://databricks.com/blog/2015/10/19/introducing-redshift-data-source-for-spark.html
I want to write to AWS Redshift from my Spark Streaming app.
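A hedged sketch of the suggestion above: the spark-redshift data source called per batch from foreachRDD with mode("append"); the JDBC URL, table and S3 tempdir are placeholders:

    stream.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        // rdd is assumed to contain Rows matching `schema`
        val df = spark.createDataFrame(rdd, schema)
        df.write
          .format("com.databricks.spark.redshift")
          .option("url", "jdbc:redshift://host:5439/db?user=u&password=p")
          .option("dbtable", "events")
          .option("tempdir", "s3n://my-bucket/tmp")   // the source stages data via S3
          .mode("append")
          .save()
      }
    }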
There's no need to extend Spark's API; look at mapWithState for examples.
… HDFS or HBase.
Daniel
On Wed, Dec 7, 2016 at 1:42 PM, Anty Rao <ant@gmail.com> wrote:
Hi
I'm new to Spark. I'm doing some research to see if spark streaming can
solve my problem. I don't want to keep per-key state, because my data set is
very huge and is kept a little longer, so it is not viable to keep all per-key
state in memory. Instead, I want to have a bloom filter based state. Is it
possible to achieve this in Spark Streaming?
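For the mapWithState route suggested earlier in the thread, a minimal sketch (Spark 1.6+; pairs stands for a DStream[(String, Int)] of keyed events). The built-in store keeps state in memory with checkpointing; a bloom-filter representation would have to live inside the state value itself:

    import org.apache.spark.streaming.{State, StateSpec}

    // old state in, updated state out; emits (key, runningSum) per record
    def update(key: String, value: Option[Int], state: State[Long]): (String, Long) = {
      val sum = state.getOption.getOrElse(0L) + value.getOrElse(0)
      state.update(sum)
      (key, sum)
    }

    val stateful = pairs.mapWithState(StateSpec.function(update _))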
with
timestamp, etc. Then when the second message of the same key arrives, Spark
Streaming will ensure that it calls your state update function with old
state (i.e. first message filled up) and you can take the time difference.
Check out my blog -
https://databricks.com/blog/2016/02/01/faster-stateful
any valuable feedback is appreciated!
called
"from_json", which should also help you easily parse your messages incoming
from Kafka.
Best,
Burak
On Tue, Dec 6, 2016 at 2:16 AM, Daniela S <daniela_4...@gmx.at> wrote:
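A hedged sketch of the from_json suggestion (available from Spark 2.1; the column and schema names are assumptions):

    import org.apache.spark.sql.functions.from_json
    import org.apache.spark.sql.types._

    val schema = new StructType()
      .add("timestamp", TimestampType)
      .add("id", StringType)

    // value: the raw JSON string column read from Kafka
    // ($-syntax needs: import spark.implicits._)
    val parsed = df.select(from_json($"value", schema).as("msg"))
                   .select("msg.timestamp", "msg.id")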
Hi
I have some questions regarding Spark Streaming.
I receive a stream of JSON messages from Kafka.
The messages consist of a timestamp and an ID.
timestamp           ID
2016-12-06 13:00    1
2016-12-06 13:40    5
...
In a database I have values for each ID:
ID
Hello,
we are trying to use Spark streaming to do some billing related application.
Our case is that we need to correlate two different messages and calculate
the time interval between the two messages. The two messages should be in the
same partition but probably not in the same RDD, it seems
Is there any way to get a more computer-friendly version of the completed
batches section of the streaming page of the application master? I am very
interested in the statistics and am currently screen-scraping...
https://richardstartin.com
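One alternative to screen-scraping, assuming you can modify the application itself: register a StreamingListener and export the same numbers the completed-batches table shows (ssc is the StreamingContext):

    import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

    // logs the per-batch statistics behind the "Completed Batches" table
    class BatchStats extends StreamingListener {
      override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit = {
        val info = batch.batchInfo
        println(s"batch=${info.batchTime} records=${info.numRecords} " +
          s"schedulingDelayMs=${info.schedulingDelay.getOrElse(-1L)} " +
          s"processingTimeMs=${info.processingDelay.getOrElse(-1L)}")
      }
    }

    ssc.addStreamingListener(new BatchStats)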
From: Mich Talebzadeh <mich.talebza...@gmail.com>
Date: Friday, December 2, 2016 at 12:26 PM
To: Gabriel Perez <gabr...@adtheorent.com>
Cc: Jacek Laskowski <ja...@japila.pl>, user <user@spark.apache.org>
Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2
In this POC of yours, are you running this app with Spark in local mode by any
chance?
Dr Mich Talebzadeh
From: Jacek Laskowski <ja...@japila.pl>
Date: Friday, December 2, 2016 at 12:21 PM
To: Gabriel Perez <gabr...@adtheorent.com>
Cc: user <user@spark.apache.org>
Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2
Hi,
Can you post the screenshot of the Executors and Streaming tabs?
Jacek
From: Jacek Laskowski <ja...@japila.pl>
Date: Friday, December 2, 2016 at 11:47 AM
To: Gabriel Perez <gabr...@adtheorent.com>
Cc: user <user@spark.apache.org>
Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2
Hi,
How many partitions does the topic have? How do you check how many executors
read from the topic?
Jacek
stream.foreachRDD( new VoidFunction<JavaRDD<ConsumerRecord<String, String>>>() {
    @Override
    public void call( JavaRDD<ConsumerRecord<String, String>> rdd ) {
        OffsetRange[] offsetRanges = ( (HasOffsetRanges) rdd.rdd() ).offsetRanges();
        // some time later, after outputs have completed
        ( (CanCommitOffsets) stream.inputDStream() ).commitAsync( offsetRanges );
    }
} );
Used SparkSession, works now. Thanks.
On Wed, Nov 30, 2016 at 11:02 PM, Deepak Sharma
wrote:
In Spark 2.0+, SparkSession was introduced, which you can use to query
hive as well.
Just make sure you create the spark session with the enableHiveSupport() option.
Thanks
Deepak
On Thu, Dec 1, 2016 at 12:27 PM, shyla deshpande
wrote:
I am on Spark 2.0.2, using DStreams because I need the Cassandra sink.
How do I create a SQLContext? I get the error that SQLContext is deprecated.
Thanks
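A minimal sketch of the fix that worked (Spark 2.x; enable Hive support only if you query Hive, per the reply above):

    import org.apache.spark.sql.SparkSession

    // SparkSession replaces the deprecated SQLContext as the entry point
    val spark = SparkSession.builder()
      .appName("dstream-with-sql")
      .enableHiveSupport()   // omit if Hive is not needed
      .getOrCreate()
    import spark.implicits._   // toDF() on RDDs inside foreachRDD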
> http://allegro.tech/2015/08/spark-kafka-integration.html
2016-11-29 2:18 GMT+01:00 shyla deshpande <deshpandesh...@gmail.com>:
Possibly a bug, please check:
https://issues.apache.org/jira/browse/SPARK-18620
Hello All,
I just want to make sure this is a right use case for Kafka --> Spark
Streaming.
A few words about my use case:
When the user watches a video, I get position events from the user that
indicate how much they have completed viewing, and at a certain point I
mark that video as completed.
Are you suggesting I write the completed events to kafka (different topic)
and the akka consumer could read from this? There could be many completed
events from different users in this topic. So the akka consumer should
pretty much do what a spark streaming does to process this, without the
knowledge of the kafka offset.
So not sure what you mean by kafka offsets will do the job; how will the
akka consumer know the kafka offset?
On Mon, Nov 28, 2016 at 12:52 PM, vincent gromakowski <
vincent.groma...@gmail.com> wrote:
Thanks Daniel for the response.
I am planning to use Spark streaming to do event processing. I will have
akka actors sending messages to kafka. I process them using Spark streaming,
and as a result new events will be generated. How do I notify the akka
actor (message producer) that a new event
Anyone with experience of spark streaming in production, appreciate your
input.
Thanks
-shyla
On Mon, Nov 28, 2016 at 12:11 AM, shyla deshpande <deshpandesh...@gmail.com>
wrote:
My data pipeline is Kafka --> Spark Streaming --> Cassandra.
Can someone please explain to me when I would need to wrap akka around the
spark streaming app. My knowledge of akka and the actor system is poor.
Please help!
Thanks
Possibly https://issues.apache.org/jira/browse/SPARK-17396
On Tue, Nov 22, 2016 at 1:42 PM, Mohit Durgapal <durgapalmo...@gmail.com>
wrote:
Hi Everyone,
I am getting the following error while running a spark streaming example on
my local machine; the data being ingested is only 506kb.
16/11/23 03:05:54 INFO MappedDStream: Slicing from 1479850537180 ms to
1479850537235 ms (aligned to 1479850537180 ms and 1479850537235 ms)
Exception
Spark Streaming v 1.6.2
Kafka v0.10.1
I am reading msgs from Kafka.
What surprised me is that the following DStream only processes the first batch.
KafkaUtils.createDirectStream[
    String,
    String,
    StringDecoder,
    StringDecoder](streamingContext, kafkaParams, Set(topic))
  .map(_._2)
  .window
Hi
Thanks.
I have a doubt about the spark streaming kinesis consumer. Say I have a batch
time of 500 ms and the kinesis stream is partitioned on userid (uniformly
distributed). But since IdleTimeBetweenReadsInMillis is set to 1000ms,
Spark receiver nodes will fetch the data at an interval of 1 second and store
root
 |-- name: string (nullable = true)
 |-- addresses: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- street: string (nullable = true)
 |    |    |-- city: string (nullable = true)

I want to output name and city. The following is my spark streaming app
which outputs name and addresses, but I want name and cities in the output.
object PersonConsumer {
  import org.apache.spark.sql.{SQLContext, SparkSession}
  import com.example.protos.demo._

  def main(args : Array[String
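The app itself is truncated above, but to get name and city out of that schema, one option with the DataFrame API (df is assumed to carry the schema shown; this produces one output row per address):

    import org.apache.spark.sql.functions.explode

    val cities = df.select(df("name"), explode(df("addresses")).as("addr"))
                   .select("name", "addr.city")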
hi,
can someone share their experience of feeding data from ibm/mq messages
into flume, then from flume to kafka, and using spark streaming on it?
any issues and things to be aware of?
thanks
Dr Mich Talebzadeh
hi,
I guess the only way to do this is to read ibm mq messages into flume,
ingest them into hdfs and read them from there. Alternatively, use flume to
ingest data into hbase and then use spark on hbase.
I don't think there is an api like spark streaming with kafka for ibm mq?
thanks
Dr Mich Talebzadeh
From: Dirceu Semighini Filho <dirceu.semigh...@gmail.com>
Sent: Thursday, November 17, 2016 6:50:28 AM
To: Arijit
Cc: Tathagata Das; user@spark.apache.org
Subject: Re: Spark Streaming Data loss on failure to write BlockAdditionEvent
failure to WAL
Hi Arijit,
Have you found a solution for this?
> Thanks again, Arijit
> From: Tathagata Das <tathagata.das1...@gmail.com>
> Sent: Monday, November 7, 2016 7:59:06 PM
> To: Arijit
> Cc: user@spark.apache.org
> Subject: Re: Spark Streaming Data loss on failure to write BlockAdditionEvent
> failure to WAL
Hi Tariq and Jon,
First of all, thanks for the quick response. I really appreciate it.
Well, I would like to start from the very beginning of using Kafka with
Spark. For example, in the Spark distribution I found an example using
Kafka with Spark streaming that demonstrates a Direct Kafka Word Count