Is there a way to tell how many batches my topology processes per second? Or, for that matter, how many tuples it processes per second, short of creating a new bolt purely for that aggregation?
On Mon, Jul 14, 2014 at 2:08 PM, Carlos Rodriguez <[email protected]> wrote:

> Max spout pending config specifies how many *batches* can be processed
> simultaneously by your topology.
> That's why 48,000 seems absurdly high to you. Divide it by the batch
> size and you'll get the max spout pending config that you were expecting.
>
> 2014-07-14 19:00 GMT+02:00 Raphael Hsieh <[email protected]>:
>
>> What is the optimal max spout pending to use in a topology?
>> I found this thread here:
>> http://mail-archives.apache.org/mod_mbox/storm-user/201402.mbox/%3cca+avhzatfg_s88lkombvommkh-rafwr6szy0i8b8tm3rfab...@mail.gmail.com%3E
>> that didn't seem to have a follow-up.
>>
>> Part of it says to
>>
>> "Start with a max spout pending that is for sure too small -- one for
>> trident, or the number of executors for storm -- and increase it until you
>> stop seeing changes in the flow. You'll probably end up with something
>> near 2*(throughput in recs/sec)*(end-to-end latency) (2x the Little's
>> law capacity)."
>>
>> Does this make sense for a Max Spout Pending value?
>> I expect my topology to have a throughput of around 80,000/s and I've
>> been seeing a complete latency of around 300ms, so given this formula, I'd
>> want 2*80000*0.3 = 48,000 Max Spout Pending.
>>
>> This seems absurdly high to me.
>>
>> --
>> Raphael Hsieh
>
> --
> Carlos Rodríguez
> Developer at ENEO Tecnología
> http://redborder.net/
> http://lnkd.in/bgfCVF9

--
Raphael Hsieh
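
For reference, the arithmetic discussed in the quoted thread can be sketched in a few lines of Python. This is only an illustration: the throughput (80,000 tuples/s) and latency (300 ms) come from the thread, but the batch size of 1,000 tuples is a made-up value (the thread never states one), and the helper names are hypothetical, not part of Storm.

```python
import math

def tuples_in_flight(throughput_per_sec, latency_ms, safety_factor=2):
    """2x Little's law capacity, as quoted in the thread:
    safety_factor * throughput (recs/sec) * end-to-end latency (sec)."""
    return safety_factor * throughput_per_sec * latency_ms / 1000

def pending_batches(in_flight_tuples, batch_size):
    """Convert an in-flight tuple count into a batch count, which is the
    unit max spout pending uses in Trident according to the reply above."""
    return max(1, math.ceil(in_flight_tuples / batch_size))

# Figures from the thread: ~80,000 tuples/s and ~300 ms complete latency.
in_flight = tuples_in_flight(80_000, 300)

# ASSUMPTION: batch size of 1,000 tuples, chosen only for illustration.
assumed_batch_size = 1_000
batches = pending_batches(in_flight, assumed_batch_size)

print(f"tuples in flight: {in_flight:.0f}")
print(f"max spout pending (batches): {batches}")
```

With a different batch size the result scales accordingly, so the value worth trying as a starting point for `topology.max.spout.pending` depends on how many tuples the spout actually emits per batch.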
