Thanks!
On Wed, Jul 16, 2014 at 6:34 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:

> Have you taken a look at DStream.transformWith( ... )? That allows you to
> apply an arbitrary transformation between RDDs (of the same timestamp) of
> two different streams.
>
> So you can do something like this:
>
>     2s-window-stream.transformWith(1s-window-stream, (rdd1: RDD[...], rdd2: RDD[...]) => {
>       ...
>       // return a new RDD
>     })
>
> And streamingContext.transform() extends it to N DStreams. :)
>
> Hope this helps!
>
> TD
>
> On Wed, Jul 16, 2014 at 10:42 AM, Walrus theCat <walrusthe...@gmail.com> wrote:
>
>> Hey, at least it's something (thanks!) ... not sure what I'm going to do
>> if I can't find a solution (other than not use Spark), as I really need
>> these capabilities. Anyone got anything else?
>>
>> On Wed, Jul 16, 2014 at 10:34 AM, Luis Ángel Vicente Sánchez <langel.gro...@gmail.com> wrote:
>>
>>> Hum... maybe consuming all streams at the same time with an actor that
>>> would act as a new DStream source... but this is just a random idea. I
>>> don't really know if that would be a good idea or even possible.
>>>
>>> 2014-07-16 18:30 GMT+01:00 Walrus theCat <walrusthe...@gmail.com>:
>>>
>>>> Yeah -- I tried the .union operation and it didn't work for that
>>>> reason. Surely there has to be a way to do this, as I imagine this is a
>>>> commonly desired goal in streaming applications?
>>>>
>>>> On Wed, Jul 16, 2014 at 10:10 AM, Luis Ángel Vicente Sánchez <langel.gro...@gmail.com> wrote:
>>>>
>>>>> I'm joining several Kafka dstreams using the join operation, but you
>>>>> have the limitation that the batch duration has to be the same, i.e. a
>>>>> 1-second window for all dstreams... so it would not work for you.
>>>>>
>>>>> 2014-07-16 18:08 GMT+01:00 Walrus theCat <walrusthe...@gmail.com>:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> My application has multiple dstreams on the same inputstream:
>>>>>>
>>>>>>     dstream1 // 1 second window
>>>>>>     dstream2 // 2 second window
>>>>>>     dstream3 // 5 minute window
>>>>>>
>>>>>> I want to write logic that deals with all three windows (e.g. when
>>>>>> the 1 second window differs from the 2 second window by some delta ...)
>>>>>>
>>>>>> I've found some examples online (there's not much out there!), and I
>>>>>> can only see people transforming a single dstream. In conventional Spark,
>>>>>> we'd do this sort of thing with a cartesian on RDDs.
>>>>>>
>>>>>> How can I deal with multiple DStreams at once?
>>>>>>
>>>>>> Thanks
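
For the archives, here is a minimal sketch of the transformWith approach TD describes above. The source (a socketTextStream on localhost:9999), the value names (ssc, values, win1s, win2s), and the "difference of sums" logic are illustrative assumptions, not from the thread. Both windowed streams keep the default slide interval (the batch duration), which is what lets transformWith pair up their RDDs of the same timestamp:

    import org.apache.spark.SparkConf
    import org.apache.spark.SparkContext._
    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object MultiWindowDelta {
      def main(args: Array[String]) {
        val conf = new SparkConf().setAppName("multi-window-delta").setMaster("local[2]")

        // 1-second batch interval; every windowed stream below slides at this
        // rate, so transformWith can pair up their per-batch RDDs.
        val ssc = new StreamingContext(conf, Seconds(1))

        // Assumed source: one numeric value per line on a local socket.
        val values = ssc.socketTextStream("localhost", 9999).map(_.toLong)

        // Same input stream, different window durations.
        val win1s = values.window(Seconds(1))
        val win2s = values.window(Seconds(2))

        // Combine the two windows batch by batch: emit the difference of
        // their sums. The function runs on the driver once per batch, so
        // calling an action like sum() here is fine.
        val delta = win1s.transformWith(win2s, (rdd1: RDD[Long], rdd2: RDD[Long]) => {
          val diff = rdd2.sum() - rdd1.sum()
          rdd1.context.parallelize(Seq(diff))
        })

        delta.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }

For more than two streams, the streamingContext.transform() variant TD mentions takes a Seq of DStreams and a function over the corresponding Seq of RDDs, so the same idea extends to the 5-minute window as well.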