Spark Streaming: process only last events

2016-01-06 Thread Julien Naour
and process the same DStream at different speeds (slow processing vs fast)? Is it easy to share values (a map, for example) between pipelines without using an external database? I think an accumulator or broadcast variable could work, but I'm not sure it does between two pipelines. Regards, Julien Naour

Re: Spark Streaming: process only last events

2016-01-06 Thread Julien Naour
keys corresponding to some kind of user id. I want to process only the last event for each user id, i.e. skip the intermediate events for that user id. I have only one Kafka topic with all these events. Regards, Julien Naour On Wed, Jan 6, 2016 at 16:13, Cody Koeninger <c...@koeninger.org> wrote: >

Re: Spark Streaming: process only last events

2016-01-06 Thread Julien Naour
s, you can't just magically ignore some time > range of rdds, because they may contain events you care about. > > On Wed, Jan 6, 2016 at 10:55 AM, Julien Naour <julna...@gmail.com> wrote: > >> The following lines are my understanding of Spark Streaming AFAIK, I >>

Re: Spark Streaming: process only last events

2016-01-06 Thread Julien Naour
> Then you can do foreachPartition with a local map to store just a single
> event per user, e.g.
>
> foreachPartition { p =>
>   val m = new HashMap
>   p.foreach { event =>
>     m.put(event.user, event)
>   }
>   m.foreach {
>     ... do your computation
>   }
>
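The per-partition local map suggested above can be sketched in plain Scala as follows. This is a minimal illustration, not the code from the thread: the `Event` case class, its `timestamp` and `payload` fields, and the `lastPerUser` helper are all hypothetical names, and comparing timestamps (rather than relying on arrival order alone) is an added assumption.

```scala
import scala.collection.mutable

// Hypothetical event type; the thread does not show the real schema.
final case class Event(user: String, timestamp: Long, payload: String)

object LastEventPerUser {
  // Keeps a single event per user for one partition: the one with the
  // highest timestamp (assumed field). On equal timestamps, the event
  // seen later in the iterator wins.
  def lastPerUser(partition: Iterator[Event]): Map[String, Event] = {
    val latest = mutable.HashMap.empty[String, Event]
    partition.foreach { event =>
      latest.get(event.user) match {
        case Some(seen) if seen.timestamp > event.timestamp =>
          () // the stored event is newer, so skip this one
        case _ =>
          latest.put(event.user, event)
      }
    }
    latest.toMap
  }
}
```

In a streaming job, a function like this would typically be called from inside `foreachRDD` and `foreachPartition`, with the per-user computation applied to the values of the returned map. It only removes intermediate events within one partition of one batch, so it presumably relies on all events for a given user landing in the same partition, for example when the Kafka messages are keyed by user id.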

Re: Broadcast vs simple variable

2014-08-21 Thread Julien Naour
/1427 And the current k-means implementation in MLlib benefits from sparse vector computation. http://spark-summit.org/2014/talk/sparse-data-support-in-mllib-2 2014-08-21 15:40 GMT+08:00 Julien Naour julna...@gmail.com: My Arrays are in fact Array[Array[Long]] and like 17x15 (17

Broadcast vs simple variable

2014-08-20 Thread Julien Naour
instead of a simple variable? Cheers, Julien Naour

Re: about spark and using machine learning model

2014-08-05 Thread Julien Naour
You can find in the following presentation a simple example of a clustering model used to classify new incoming tweets: https://www.youtube.com/watch?v=sPhyePwo7FA Regards, Julien 2014-08-05 7:08 GMT+02:00 Xiangrui Meng men...@gmail.com: Some extra work is needed to close the loop. One related