Hi,

On Sun, Oct 26, 2014 at 8:03 PM, Ji ZHANG <[email protected]> wrote:
>
> Suppose I have a stream of logs and I want to count them by minute.
> The result is like:
>
> 2014-10-26 18:38:00 100
> 2014-10-26 18:39:00 150
> 2014-10-26 18:40:00 200
>
> One way to do this is to set the batch interval to 1 min, but each
> batch would be quite large.
>
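I would say the straightforward way to do that is

  myRdd.countByWindow(Minutes(1), Minutes(1))

where myRdd is your DStream of log lines. The first argument is the
window length and the second is the slide interval, so this emits one
count per minute. You can keep the batch interval small (a few seconds,
say); the window and slide only need to be multiples of it.

Tobias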
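
P.S. In case it helps, here is a rough sketch of how the countByWindow
approach might look as a complete job. The 10-second batch interval, the
socket source on localhost:9999, the checkpoint directory and the object
name are all placeholders I made up for illustration, so swap in your
own. As far as I know countByWindow is computed incrementally, so a
checkpoint directory has to be set.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

// Hypothetical job name; not from the original thread.
object CountLogsPerMinute {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("CountLogsPerMinute")
      .setMaster("local[2]") // assumption: local test run with 2 threads

    // Assumed 10 s batch interval, so each batch stays small.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Windowed counts are maintained incrementally, which needs checkpointing.
    // The path is a placeholder.
    ssc.checkpoint("/tmp/spark-checkpoint")

    // Placeholder source: one log line per line of text on a socket.
    val logs = ssc.socketTextStream("localhost", 9999)

    // One-minute window, sliding by one minute: one count per minute.
    // Both durations are multiples of the 10 s batch interval.
    val perMinute = logs.countByWindow(Minutes(1), Minutes(1))

    // Print each count together with the batch time at which the window closed.
    perMinute.foreachRDD { (rdd, time) =>
      rdd.collect().foreach(n => println(s"$time $n"))
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that the windows here follow the time at which records arrive in
Spark, not any timestamp inside the log lines themselves, so the counts
will only match your example output if the logs arrive roughly in real
time.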
