Hi Sean Many many thanks. This really clears a lot up for me
Andy From: Sean Owen <so...@cloudera.com> Date: Wednesday, October 1, 2014 at 11:27 AM To: Andrew Davidson <a...@santacruzintegration.com> Cc: "user@spark.apache.org" <user@spark.apache.org> Subject: Re: can I think of JavaDStream<> foreachRDD() as being 'for each mini batch' ? > Yes, foreachRDD will do your something for each RDD, which is what you > get for each mini-batch of input. > > The operations you express on a DStream (or JavaDStream) are all, > really, "for each RDD", including print(). Logging is a little harder > to reason about since the logging will happen on a potentially remote > receiver. I am not sure if this explains your observed behavior; it > depends on what you were logging. > > On Wed, Oct 1, 2014 at 6:51 PM, Andy Davidson > <a...@santacruzintegration.com> wrote: >> Hi >> >> I am new to Spark Streaming. Can I think of JavaDStream<> foreachRDD() as >> being 'for each mini batch¹? The java doc does not say much about this >> function. >> >> Here is the background. I am writing a little test program to figure out how >> to use streams. At some point I wanted to calculate an aggregate. In my >> first attempt my driver did a couple of transformations and tries to get the >> count(). I wrote this code like I would a spark core app and added some >> logging. My log statements where only executed once, how ever >> JavaDStream<>print() appears to run on each mini batch. >> >> When I put my logging and aggregation code inside foreachRDD() things work >> as expected my aggegrate and logging appear to be executed on each >> minibactch >> >> I am running on a 4 machine cluster. I create a message with the value of >> each mini batch aggregate and use logging and System.out. Given the message >> shows up on my console is it safe to assume that this output code is >> executing in my driver? The ip shows it is running on my Master. >> >> I thought maybe the message is showing up here because I do not have enough >> data in the steam to force load onto the workers? >> >> >> Any in sites would be greatly appreciated. >> >> Andy >