Re: Low Level Kafka Consumer for Spark

2015-01-16 Thread Dibyendu Bhattacharya
…Stream(new KafkaReceiver(_props, i)). I have found, in your code, that all the messages are retrieved correctly, but _receiver.store(_dataBuffer.iterator()), which is a method of Spark Streaming's abstract class, does not seem to work correctly. …

Re: Low Level Kafka Consumer for Spark

2015-01-16 Thread Debasish Das
…retrieved correctly, but _receiver.store(_dataBuffer.iterator()), which is a method of Spark Streaming's abstract class, does not seem to work correctly. Have you tried running your Spark Streaming …

Re: Low Level Kafka Consumer for Spark

2015-01-15 Thread Akhil Das
…KafkaReceiver(_props, i)). I have found, in your code, that all the messages are retrieved correctly, but _receiver.store(_dataBuffer.iterator()), which is a method of Spark Streaming's abstract class, does not seem to work correctly. …

Re: Low Level Kafka Consumer for Spark

2015-01-15 Thread Dibyendu Bhattacharya
…class's method does not seem to work correctly. Have you tried running your Spark Streaming Kafka consumer with Kafka 0.8.1.1 and Spark 1.2.0? - Kidong …

Re: Low Level Kafka Consumer for Spark

2015-01-15 Thread Dibyendu Bhattacharya
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p21180.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Low Level Kafka Consumer for Spark

2015-01-15 Thread mykidong
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p21180.html

Re: Low Level Kafka Consumer for Spark

2014-12-03 Thread Luis Ángel Vicente Sánchez
…Kafka Consumer for Spark. Dibyendu, just to make sure I will not be misunderstood: my concerns refer to the upcoming Spark solution, not yours. I would like to gather the perspective of someone who implemented recovery with Kafka a different way. …

Re: Low Level Kafka Consumer for Spark

2014-12-03 Thread Dibyendu Bhattacharya
…improving the reliable Kafka receiver as you mentioned is on our schedule. Thanks, Jerry. -----Original Message----- From: RodrigoB [mailto:rodrigo.boav...@aspect.com] Sent: Wednesday, December 3, 2014 5:44 AM To: u...@spark.incubator.apache.org

RE: Low Level Kafka Consumer for Spark

2014-12-02 Thread Shao, Saisai
…receiver as you mentioned is on our schedule. Thanks, Jerry. -----Original Message----- From: RodrigoB [mailto:rodrigo.boav...@aspect.com] Sent: Wednesday, December 3, 2014 5:44 AM To: u...@spark.incubator.apache.org Subject: Re: Low Level Kafka Consumer for Spark. Dibyendu, just to make sure I …

Re: Low Level Kafka Consumer for Spark

2014-12-02 Thread RodrigoB
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p20196.html

Re: Low Level Kafka Consumer for Spark

2014-12-02 Thread RodrigoB
…ments will be greatly appreciated. Tnks, Rod. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p20181.html

Re: Low Level Kafka Consumer for Spark

2014-09-15 Thread Dibyendu Bhattacharya
…repartitioning the messages across different RDDs. Does your Receiver guarantee this behavior until the problem is fixed in Spark 1.2? Regards, Alon …

Re: Low Level Kafka Consumer for Spark

2014-09-15 Thread Tim Smith
…messages are assigned to a single RDD, instead of arbitrarily repartitioning the messages across different RDDs. Does your Receiver guarantee this behavior until the problem is fixed in Spark 1.2? Regards, Alon …
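The behavior Alon asks about (all messages from one Kafka partition landing in a single RDD, rather than being repartitioned arbitrarily) can be illustrated with a minimal, stdlib-only Java sketch. The class and method names here are illustrative, not from the receiver discussed in this thread: the idea is simply that a receiver which buckets fetched messages by their source partition before storing each bucket as one block preserves per-partition grouping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionGrouper {
    /** A fetched message: which Kafka partition it came from, plus its payload. */
    static final class Msg {
        final int partition;
        final String payload;
        Msg(int partition, String payload) { this.partition = partition; this.payload = payload; }
    }

    /** Bucket messages by partition so each bucket can be stored as one block/RDD. */
    static Map<Integer, List<String>> groupByPartition(List<Msg> fetched) {
        Map<Integer, List<String>> blocks = new TreeMap<>();
        for (Msg m : fetched) {
            blocks.computeIfAbsent(m.partition, k -> new ArrayList<>()).add(m.payload);
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<Msg> fetched = Arrays.asList(
            new Msg(0, "a"), new Msg(1, "b"), new Msg(0, "c"), new Msg(2, "d"));
        // Each partition's messages stay together: {0=[a, c], 1=[b], 2=[d]}
        System.out.println(groupByPartition(fetched));
    }
}
```

Whether the receiver under discussion actually guaranteed this was exactly the open question in the thread; the sketch only shows what the guarantee would mean.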

Re: Low Level Kafka Consumer for Spark

2014-09-15 Thread Dibyendu Bhattacharya
…fixed in Spark 1.2? Regards, Alon. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p14233.html

Re: Low Level Kafka Consumer for Spark

2014-09-15 Thread Alon Pe'er
…problem is fixed in Spark 1.2? Regards, Alon. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p14233.html

Re: Low Level Kafka Consumer for Spark

2014-09-10 Thread Dibyendu Bhattacharya
…-checkpoint-recovery-causes-IO-re-execution-td12568.html#a13205. Re-computations do occur, but the only RDDs that are recovered are the ones from the data checkpoint. This is what we've seen. It is not enough by …

Re: Low Level Kafka Consumer for Spark

2014-09-08 Thread Tim Smith
…chunks of data being consumed on the Receiver node on, let's say, a per-second basis, then having it persisted to HDFS every second could be a big challenge for JVM performance; maybe that could be the reason why it …

Re: Low Level Kafka Consumer for Spark

2014-09-07 Thread Dibyendu Bhattacharya
…lineage is checkpointed, but if we have big chunks of data being consumed on the Receiver node on, let's say, a per-second basis, then having it persisted to HDFS every second could be a big challenge …

Re: Low Level Kafka Consumer for Spark

2014-09-05 Thread Tathagata Das
…state-consistent recovery feels to me like another big issue to address. I plan to dive into the Streaming code and try to at least contribute some ideas. More insight from anyone on the dev team will be very appreciated. …

Re: Low Level Kafka Consumer for Spark

2014-09-03 Thread Dibyendu Bhattacharya
…consistent recovery feels to me like another big issue to address. I plan to dive into the Streaming code and try to at least contribute some ideas. More insight from anyone on the dev team will be very appreciated. tnks, Rod …

Re: Low Level Kafka Consumer for Spark

2014-08-31 Thread RodrigoB
…like another big issue to address. I plan to dive into the Streaming code and try to at least contribute some ideas. More insight from anyone on the dev team will be very appreciated. tnks, Rod …

Re: Low Level Kafka Consumer for Spark

2014-08-30 Thread Tim Smith
…Dib. On Aug 28, 2014 6:45 AM, "RodrigoB" wrote: Dibyendu, Tnks for getting back. I believe you are absolutely right. We were under the assumption …

Re: Low Level Kafka Consumer for Spark

2014-08-30 Thread Roger Hoover
…tests. This applies to Kafka as well. Fortunately, the issue is of major priority. Regarding your suggestion, I would maybe prefer to have the problem resolved within Spark's inter…

Re: Low Level Kafka Consumer for Spark

2014-08-30 Thread Sean Owen
I'm no expert, but as I understand it, yes: you create multiple streams to consume multiple partitions in parallel. If they're all in the same Kafka consumer group, you'll get exactly one copy of each message, so if you have 10 consumers and 3 Kafka partitions I believe only 3 will be getting messages …
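Sean's point can be sketched with a small, stdlib-only simulation. This assumes a simple round-robin assignment of partitions to group members; Kafka's actual rebalance algorithm differs in detail, and the names here are illustrative. With 10 consumers in one group and only 3 partitions, only 3 consumers end up owning a partition and the rest sit idle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupAssignment {
    /** Round-robin assignment of partitions to consumers in one group (sketch). */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        for (String c : consumers) out.put(c, new ArrayList<>());
        for (int p = 0; p < numPartitions; p++) {
            // Partition p goes to one member; no partition is delivered twice.
            out.get(consumers.get(p % consumers.size())).add(p);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> consumers = new ArrayList<>();
        for (int i = 1; i <= 10; i++) consumers.add("consumer-" + i);
        Map<String, List<Integer>> assignment = assign(consumers, 3);
        long active = assignment.values().stream().filter(l -> !l.isEmpty()).count();
        System.out.println("active consumers: " + active); // prints "active consumers: 3"
    }
}
```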

Re: Low Level Kafka Consumer for Spark

2014-08-29 Thread Tim Smith
partitionSize=30; … JavaDStream<String> lines = newMessages.map(new Function<Tuple2<String, String>, String>() { … public String call(Tuple2<String, String> tuple2) { ret…

Re: Low Level Kafka Consumer for Spark

2014-08-29 Thread Tim Smith
…JavaDStream<String> words = lines.flatMap(new MetricsComputeFunction()); JavaPairDStream<String, Integer> wordCounts = words.mapToPair(new PairFunction<String, String, Integer>() { …

Re: Low Level Kafka Consumer for Spark

2014-08-29 Thread Tim Smith
…Function<…, Void>() {...}); Thanks, Bharat. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p13131.html

Re: Low Level Kafka Consumer for Spark

2014-08-29 Thread Jonathan Hodges
…affected by this issue. If, for example, there is a big number of batches to be recomputed, I would rather have them done distributed than overload the batch interval with a huge amount of Kafka messages. I do not have yet …

Re: Low Level Kafka Consumer for Spark

2014-08-29 Thread bharatvenkat
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p13131.html

Re: Low Level Kafka Consumer for Spark

2014-08-28 Thread Chris Fregly
…If, for example, there is a big number of batches to be recomputed, I would rather have them done distributed than overload the batch interval with a huge amount of Kafka messages. I do not yet have enough know-how on where the issue is and ab…

Re: Low Level Kafka Consumer for Spark

2014-08-27 Thread Dibyendu Bhattacharya
…distributed rather than overloading the batch interval with a huge amount of Kafka messages. I do not yet have enough know-how on where the issue is, or about the internal Spark code, so I can't really tell how difficult the implementation will be. tnks, Rod …

Re: Low Level Kafka Consumer for Spark

2014-08-27 Thread RodrigoB
…tnks, Rod. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p12966.html

Re: Low Level Kafka Consumer for Spark

2014-08-27 Thread Bharat Venkat
…way to specify multiple worker processes (on different machines) to read from Kafka? Maybe one worker process for each partition? If there is no such option, what happens when the single machine hosting the …

Re: Low Level Kafka Consumer for Spark

2014-08-26 Thread Dibyendu Bhattacharya
…If there is no such option, what happens when the single machine hosting the "Kafka Reader" worker process dies and is replaced by a different machine (like in the cloud)? Thanks, Bharat …

Re: Low Level Kafka Consumer for Spark

2014-08-26 Thread Chris Fregly
…replaced by a different machine (like in the cloud)? Thanks, Bharat. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p12788.html

Re: Low Level Kafka Consumer for Spark

2014-08-26 Thread Dibyendu Bhattacharya
…dies and is replaced by a different machine (like in the cloud)? Thanks, Bharat. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p12788.html

Re: Low Level Kafka Consumer for Spark

2014-08-26 Thread Dibyendu Bhattacharya
…checkpoints are done every batch interval. Was it intentional to depend solely on the Kafka commit to recover data and recomputations between data checkpoints? If so, how can this be made to work? tnks, Rod …

Re: Low Level Kafka Consumer for Spark

2014-08-25 Thread bharatvenkat
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p12788.html

Re: Low Level Kafka Consumer for Spark

2014-08-25 Thread RodrigoB
…data and recomputations between data checkpoints? If so, how can this be made to work? tnks, Rod. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p12757.html

Re: Low Level Kafka Consumer for Spark

2014-08-05 Thread Dibyendu Bhattacharya
…streams/receivers; adding a Java API for receivers was something we did specifically to allow this :) - Patrick. On Sat, Aug 2, 2014 at 10:09 AM, Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com> wrote: …

Re: Low Level Kafka Consumer for Spark

2014-08-04 Thread Jonathan Hodges
…On Sat, Aug 2, 2014 at 10:09 AM, Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com> wrote: Hi, I have implemented a Low Level Kafka Consumer for Spark Streaming using the Kafka Simple Consumer API. This API gives better control …

Re: Low Level Kafka Consumer for Spark

2014-08-04 Thread Yan Fang
…Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com> wrote: Hi, I have implemented a Low Level Kafka Consumer for Spark Streaming using the Kafka Simple Consumer API. This API gives better control over Kafka offset management and recovery …

Re: Low Level Kafka Consumer for Spark

2014-08-03 Thread Patrick Wendell
…at 10:09 AM, Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com> wrote: Hi, I have implemented a Low Level Kafka Consumer for Spark Streaming using the Kafka Simple Consumer API. This API gives better control over Kafka offset management and recovery from failures. As …

Re: Low Level Kafka Consumer for Spark

2014-08-03 Thread hodgesz
…simple consumer and managing the offsets explicitly. I think this will be a great addition to Spark Streaming, but I am curious what others think. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p11281.html

Low Level Kafka Consumer for Spark

2014-08-02 Thread Dibyendu Bhattacharya
Hi, I have implemented a Low Level Kafka Consumer for Spark Streaming using the Kafka Simple Consumer API. This API gives better control over Kafka offset management and recovery from failures. As the present Spark KafkaUtils uses the High Level Kafka Consumer API, I wanted to have better …
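The core idea described here (the Simple Consumer API leaves offset tracking entirely to the application, so an offset can be committed only after its data has been safely processed, enabling recovery without replaying committed data) can be sketched with a minimal, stdlib-only Java example. OffsetTracker is an illustrative name, not part of the actual consumer from this thread, and a real implementation would persist the committed offsets (e.g. to ZooKeeper) rather than keep them in memory.

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetTracker {
    // In a real consumer this map would live in durable storage, not memory.
    private final Map<Integer, Long> committed = new HashMap<>();

    /** Offset to start fetching from for a partition (0 if never committed). */
    long nextOffset(int partition) {
        return committed.getOrDefault(partition, 0L);
    }

    /** Commit only after the batch has been safely processed/stored. */
    void commit(int partition, long lastProcessedOffset) {
        committed.put(partition, lastProcessedOffset + 1);
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker();
        // Simulate processing offsets 0..4 of partition 0, committing as we go:
        for (long off = tracker.nextOffset(0); off < 5; off++) {
            // ... process the message at `off`, then record progress
            tracker.commit(0, off);
        }
        System.out.println(tracker.nextOffset(0)); // prints 5: a restart resumes here
    }
}
```

Because the commit happens after processing, a crash mid-batch at worst replays the uncommitted batch (at-least-once delivery), which is the recovery behavior the thread contrasts with the High Level Consumer's automatic commits.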