Re: using StreamingKMeans

2016-11-21 Thread Julian Keppel
ining? >> 2. Can you run multiple streams at the same time with different values >> for k and compare their performance? >> 3. foreachRDD is fine in general, can't speak to the specifics. >> 4. If you haven't done any transformations yet on a direct stream, >> foreachRDD will

Re: using StreamingKMeans

2016-11-19 Thread Debasish Ghosh
hould be able to skip empty > batches. > > > > On Sat, Nov 19, 2016 at 10:46 AM, debasishg <ghosh.debas...@gmail.com> > wrote: > > Hello - > > > > I am trying to implement an outlier detection application on streaming > data. > > I am a newbie t

Re: using StreamingKMeans

2016-11-19 Thread ayan guha
u a KafkaRDD. Checking if a KafkaRDD is empty >> is very cheap, it's done on the driver only because the beginning and >> ending offsets are known. So you should be able to skip empty >> batches. >> >> >> >> On Sat, Nov 19, 2016 at 10:46 AM, debasishg <ghosh.deba
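
The point in the snippet above, that an emptiness check can be answered on the driver because each batch's beginning and ending offsets are already known, can be illustrated with a small sketch. This is plain Python, not Spark code: `OffsetRange`, `batch_is_empty` and `process_batch` are hypothetical names introduced only for illustration, and the half-open offset convention is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class OffsetRange:
    """Hypothetical stand-in for one partition's offset range in a batch."""
    topic: str
    partition: int
    from_offset: int   # first offset in the batch (inclusive, assumed)
    until_offset: int  # end offset of the batch (exclusive, assumed)

    def count(self) -> int:
        # Number of records in the range, computed from metadata alone.
        return self.until_offset - self.from_offset


def batch_is_empty(ranges: List[OffsetRange]) -> bool:
    # Only offset metadata is inspected; no records are read.
    return all(r.count() == 0 for r in ranges)


def process_batch(ranges: List[OffsetRange],
                  handler: Callable[[List[OffsetRange]], None]) -> bool:
    """Run handler only for non-empty batches; return True if it ran."""
    if batch_is_empty(ranges):
        return False
    handler(ranges)
    return True
```

With this shape, a batch whose ranges all start and end at the same offset is skipped without touching any data, which is the behaviour the reply describes.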

Re: using StreamingKMeans

2016-11-19 Thread Debasish Ghosh
known. So you should be able to skip empty > batches. > > > > On Sat, Nov 19, 2016 at 10:46 AM, debasishg <ghosh.debas...@gmail.com> > wrote: > > Hello - > > > > I am trying to implement an outlier detection application on streaming > data. > > I am

Re: using StreamingKMeans

2016-11-19 Thread ayan guha
ov 19, 2016 at 10:46 AM, debasishg <ghosh.debas...@gmail.com> >> wrote: >> > Hello - >> > >> > I am trying to implement an outlier detection application on streaming >> data. >> > I am a newbie to Spark and hence would like some advice on

Re: using StreamingKMeans

2016-11-19 Thread Debasish Ghosh
:46 AM, debasishg <ghosh.debas...@gmail.com> > wrote: > > Hello - > > > > I am trying to implement an outlier detection application on streaming > data. > > I am a newbie to Spark and hence would like some advice on the confusions > > that I have .. > >

Re: using StreamingKMeans

2016-11-19 Thread Cody Koeninger
d like some advice on the confusions > that I have .. > > I am thinking of using StreamingKMeans - is this a good choice ? I have one > stream of data and I need an online algorithm. But here are some questions > that immediately come to my mind .. > > 1. I cannot do separate trainin

using StreamingKMeans

2016-11-19 Thread debasishg
Hello - I am trying to implement an outlier detection application on streaming data. I am a newbie to Spark and hence would like some advice on the confusions that I have. I am thinking of using StreamingKMeans - is this a good choice? I have one stream of data and I need an online algorithm

outlier detection using StreamingKMeans

2016-11-17 Thread Debasish Ghosh
Hello - I am trying to implement an outlier detection application on streaming data. I am a newbie to Spark and hence would like some advice on the confusions that I have. I am thinking of using StreamingKMeans - is this a good choice? I have one stream of data and I need an online algorithm
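
For readers wondering what an online k-means approach to outlier detection on a single stream might look like, here is a minimal sketch. It is plain Python and does not use Spark's StreamingKMeans API. The class `OnlineKMeans`, the `decay` parameter, and the distance `threshold` are assumptions made for illustration: each batch is first scored against the current centers, then the centers are updated with a decayed weighted average.

```python
import math
from typing import List, Sequence, Tuple


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Euclidean distance between two points of equal dimension.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(centers: List[List[float]], p: Sequence[float]) -> Tuple[int, float]:
    # Index of the closest center and the distance to it.
    dists = [distance(c, p) for c in centers]
    idx = min(range(len(centers)), key=dists.__getitem__)
    return idx, dists[idx]


class OnlineKMeans:
    """Illustrative online k-means; not Spark's implementation."""

    def __init__(self, centers: List[Sequence[float]], decay: float = 1.0):
        self.centers = [list(map(float, c)) for c in centers]
        self.weights = [0.0] * len(self.centers)  # points absorbed so far
        self.decay = decay  # 1.0 keeps all history, 0.0 forgets it

    def flag_outliers(self, batch, threshold: float):
        # A point is flagged when it is far from every current center.
        return [p for p in batch if nearest(self.centers, p)[1] > threshold]

    def update(self, batch) -> None:
        dim = len(self.centers[0])
        sums = [[0.0] * dim for _ in self.centers]
        counts = [0] * len(self.centers)
        for p in batch:
            i, _ = nearest(self.centers, p)
            counts[i] += 1
            for d in range(dim):
                sums[i][d] += p[d]
        for i, m in enumerate(counts):
            old = self.weights[i] * self.decay  # discounted history
            self.weights[i] = old + m
            if m == 0:
                continue  # no new points assigned; center unchanged
            total = old + m
            self.centers[i] = [
                (c * old + s) / total
                for c, s in zip(self.centers[i], sums[i])
            ]
```

A typical loop would call `flag_outliers` on each incoming batch and then `update` with the points that were not flagged, so that extreme points do not drag the centers. Whether to train on flagged points, and how to pick the threshold and the number of centers, are design choices this sketch leaves open.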