> Does that make sense?

Yes and no.
In the example on your blog the RollingCountBolt is configured for 9 and 3,
which I understand to mean: emit the last 9-second rolling window every
3 seconds. I just don't understand the 2-second emit frequencies of the
other bolts.

On Tue, Apr 1, 2014 at 11:20 AM, Michael G. Noll <[email protected]> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> "Software Dev",
>
> in RollingCountBolt there are two *time* related settings:
>
> 1. The size (duration) of the sliding window itself. In seconds.
> 2. The time interval at which the latest sliding window count is sent
> to downstream bolts. In seconds.
>
> See details here:
> https://github.com/apache/incubator-storm/blob/master/examples/storm-starter/src/jvm/storm/starter/bolt/RollingCountBolt.java
>
> I'm quoting from the code above:
>
> "The bolt is configured by two parameters, the length of the sliding
> window in seconds (which influences the output data of the bolt, i.e.
> how it will count objects) and the emit frequency in seconds (which
> influences how often the bolt will output the latest window counts).
> For instance, if the window length is set to an equivalent of five
> minutes and the emit frequency to one minute, then the bolt will
> output the latest five-minute sliding window every minute."
>
>
>> Does this mean that the rolling counts for the last 9 events are
>> ranked and emitted every 2 seconds? 7 seconds
>
> The RollingCountBolt "thinks" in seconds. However, behind the scenes
> RollingCountBolt uses SlidingWindowCounter [1], which in turn is built
> upon SlotBasedCounter [2]. Both the SlidingWindowCounter and the
> SlotBasedCounter don't know anything about time or durations (no
> seconds, minutes, and such). This is by design, as it decouples the
> responsibility of counting (SlidingWindowCounter/SlotBasedCounter)
> from the responsibility of tracking the time (RollingCountBolt).
>
> The Apache Spark project has exactly the same notion of
> emitFrequencyInSeconds and windowLengthInSeconds, which they call
> slideInterval and windowLength. See
> https://spark.apache.org/docs/0.9.0/streaming-programming-guide.html.
> They also have a similar diagram to what I showed in [3] that
> explains the idea behind sliding windows, see section "Window
> Operations" in the Spark link above.
>
>
> Does that make sense?
> Michael
>
>
>
> [1]
> https://github.com/apache/incubator-storm/blob/master/examples/storm-starter/src/jvm/storm/starter/tools/SlidingWindowCounter.java
> [2]
> https://github.com/apache/incubator-storm/blob/master/examples/storm-starter/src/jvm/storm/starter/tools/SlotBasedCounter.java
> [3]
> http://www.michael-noll.com/blog/2013/01/18/implementing-real-time-trending-topics-in-storm/
>
>
> On 01.04.2014 18:45, Software Dev wrote:
>> In the article
>> (http://www.michael-noll.com/blog/2013/01/18/implementing-real-time-trending-topics-in-storm/)
>>
>> and I was wondering what the rationale was for the emit frequencies
>> and how they all relate to each other.
>>
>> In the example the RollingCountBolt emits every 3 seconds,
>> IntermediateRankingBolt every 2 seconds and TotalRankingBolt every
>> 2 seconds. Does this mean that the rolling counts for the last 9
>> events are ranked and emitted every 2 seconds? 7 seconds? A little
>> confused.
>>
>> Thanks
>>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.22 (MingW32)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iEYEARECAAYFAlM7A2kACgkQeW5XuG18ujR93wCdHE6Ldu01fRgnMqjIi7chVMbu
> uEMAnjUyrZQq0xkg2REUzbgvk31A85Dm
> =YI7Y
> -----END PGP SIGNATURE-----
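
To make the split described in the quoted message concrete, here is a minimal Java sketch of a counter that only knows about slots, driven by a caller that knows about seconds. It is not the storm-starter source: the class name SlotCounterSketch, its methods, and the rule "one slot per emit interval" (9 / 3 = 3 slots) are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT the storm-starter implementation.
// The counter has no notion of seconds; it only knows how many slots it has.
final class SlotCounterSketch {
    private final int numSlots;
    private final Map<String, long[]> counts = new HashMap<>();
    private int head = 0; // slot that currently receives increments

    SlotCounterSketch(int numSlots) {
        this.numSlots = numSlots;
    }

    void increment(String key) {
        counts.computeIfAbsent(key, k -> new long[numSlots])[head]++;
    }

    // Returns the totals over all slots, then moves on to the next slot and
    // wipes it, which drops the oldest slot from the window.
    Map<String, Long> totalsThenAdvance() {
        Map<String, Long> totals = new HashMap<>();
        for (Map.Entry<String, long[]> e : counts.entrySet()) {
            long sum = 0;
            for (long c : e.getValue()) {
                sum += c;
            }
            totals.put(e.getKey(), sum);
        }
        head = (head + 1) % numSlots;
        for (long[] slots : counts.values()) {
            slots[head] = 0;
        }
        return totals;
    }

    public static void main(String[] args) {
        // Only the caller "thinks" in seconds (the role the bolt plays).
        int windowLengthInSeconds = 9;
        int emitFrequencyInSeconds = 3;
        // Assumption: one slot per emit interval.
        int numSlots = windowLengthInSeconds / emitFrequencyInSeconds;

        SlotCounterSketch counter = new SlotCounterSketch(numSlots);
        counter.increment("storm");
        counter.increment("storm");
        System.out.println(counter.totalsThenAdvance()); // tick at 3s:  {storm=2}
        counter.increment("storm");
        System.out.println(counter.totalsThenAdvance()); // tick at 6s:  {storm=3}
        System.out.println(counter.totalsThenAdvance()); // tick at 9s:  {storm=3}
        System.out.println(counter.totalsThenAdvance()); // tick at 12s: {storm=1}
    }
}
```

In this sketch each call to totalsThenAdvance() stands in for one emit. With 9 and 3 the emitted totals always cover the three most recent 3-second slots, so the two counts from the first interval fall out of the window at the 12-second tick.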
