Max spout pending is the number of tuples each spout task will emit before it waits for acks on those tuples. For example, if you have 4 spout tasks and set max spout pending to 5, each of your 4 spout tasks will emit 5 tuples and then wait for acks on them before emitting more. So with this config you will have at most 20 un-acked spout tuples in your topology at any time.
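To make the arithmetic concrete, here is a minimal Python sketch of that bound. The function name `max_in_flight` is made up for illustration and is not part of Storm; it only multiplies the two numbers from the example above.

```python
def max_in_flight(spout_tasks: int, max_spout_pending: int) -> int:
    """Upper bound on un-acked spout tuples across the whole topology.

    Assumes the limit applies to each spout task separately,
    as described in this message.
    """
    if spout_tasks < 0 or max_spout_pending < 0:
        raise ValueError("both arguments must be non-negative")
    return spout_tasks * max_spout_pending


# The example from this message: 4 spout tasks, max spout pending of 5.
print(max_in_flight(4, 5))  # prints 20
```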
On Tue, Jul 15, 2014 at 12:26 AM, 唐思成 <[email protected]> wrote:

> https://gist.github.com/mrflip/5958028#provisionings
>
> Max-pending (TOPOLOGY_MAX_SPOUT_PENDING) sets the number of tuple trees
> live in the system at any one time.
> maybe this is useful for you
>
> 2014-07-15
> ------------------------------
> 唐思成
> ------------------------------
> *From:* Raphael Hsieh
> *Sent:* 2014-07-15 05:54:28
> *To:* user
> *Cc:*
> *Subject:* Re: Max Spout Pending
> Is there a way to tell how many batches my topology processes per
> second ?
> Or for that matter how many tuples are processed per second ?
> Aside from creating a new bolt purely for that aggregation ?
>
> On Mon, Jul 14, 2014 at 2:08 PM, Carlos Rodriguez <
> [email protected]> wrote:
>
>> Max spout pending config specifies how many *batches* can be processed
>> simultaneously by your topology.
>> Thats why 48,000 seems absurdly high to you. Divide it between the batch
>> size and you'll get the max spout pending config that you were expecting.
>>
>> 2014-07-14 19:00 GMT+02:00 Raphael Hsieh <[email protected]>:
>>
>>> What is the optimal max spout pending to use in a topology ?
>>> I found this thread here:
>>> http://mail-archives.apache.org/mod_mbox/storm-user/201402.mbox/%3cca+avhzatfg_s88lkombvommkh-rafwr6szy0i8b8tm3rfab...@mail.gmail.com%3E
>>> that didn't seem to have a follow up.
>>>
>>> Part of it says to
>>>
>>> "Start with a max spout pending that is for sure too small -- one for
>>> trident, or the number of executors for storm -- and increase it until you
>>> stop seeing changes in the flow. You'll probably end up with something
>>> near 2*(throughput in recs/sec)*(end-to-end latency) (2x the Little's
>>> law capacity)."
>>>
>>> Does this make sense for a Max Spout Pending value ?
>>> I expect my topology to have a throughput of around 80,000/s and I've
>>> been seeing a complete latency of around 300ms, so given this formula, I'd
>>> want 2*80000*.3 = 48,000 Max Spout Pending.
>>>
>>> This seems absurdly high to me..
>>>
>>> --
>>> Raphael Hsieh
>>
>> --
>> Carlos Rodríguez
>> Developer at ENEO Tecnología
>> http://redborder.net/
>> http://lnkd.in/bgfCVF9
>
> --
> Raphael Hsieh
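The sizing rule quoted in the thread can also be worked through numerically. The sketch below is only an illustration: both function names are hypothetical, and the batch size of 1,000 tuples is an assumed value that does not appear anywhere in the thread. For a Trident topology, where the setting counts batches, the tuple estimate is divided by the batch size, as Carlos suggests.

```python
import math


def pending_tuples_estimate(throughput_per_sec, latency_ms, headroom=2):
    """Rule of thumb from the thread: headroom * throughput * latency.

    throughput * latency is the Little's law capacity; the quoted advice
    doubles it. Latency is taken in milliseconds to keep the math exact.
    """
    return headroom * throughput_per_sec * latency_ms / 1000


def pending_batches_estimate(pending_tuples, batch_size):
    """Convert a tuple count into a batch count (at least 1)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return max(1, math.ceil(pending_tuples / batch_size))


# Numbers from the thread: about 80,000 tuples/s and 300 ms complete latency.
tuples = pending_tuples_estimate(80_000, 300)
print(tuples)  # 48000.0

# ASSUMPTION: 1,000 tuples per batch, chosen purely for illustration.
print(pending_batches_estimate(tuples, 1_000))  # 48
```

The result is a starting point for tuning; the quoted advice is still to begin with a value that is too small and increase it until the flow stops changing.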
