Had a similar experience: too many emits would jam the spout, and it would never get around to processing the acks received from the bolts. We "fixed" it by introducing an artificial 1ms sleep in the spout processing so that there was enough idle capacity to run the acks. I doubt that's the best solution for the issue, but we were only doing a PoC and weren't particularly concerned with elegance at the time.
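For illustration only, here is a minimal sketch of that kind of throttling in PHP. It is not the code from our PoC. The class name `ThrottledEmitter`, its parameters, and the per-call cap are all hypothetical and not part of Storm or any multilang library. The cap is an added assumption beyond the plain 1ms sleep: the idea is that each nextTuple() call emits only a bounded slice of the backlog and then pauses briefly, so the spout returns control often enough for acks to be handled.

```php
<?php
// Hypothetical helper; the names here are NOT part of Storm or of any
// multilang library. It spreads a large backlog over many nextTuple()
// calls instead of emitting everything in a single call.
final class ThrottledEmitter
{
    private $backlog = array();
    private $emit;            // callable standing in for the spout's emit()
    private $maxPerCall;      // assumed cap on emits per nextTuple() call
    private $idleSleepMicros; // pause after each slice (1000 us = 1 ms)

    public function __construct(callable $emit, $maxPerCall = 500, $idleSleepMicros = 1000)
    {
        $this->emit = $emit;
        $this->maxPerCall = $maxPerCall;
        $this->idleSleepMicros = $idleSleepMicros;
    }

    // Queue a tuple for a later nextTuple() call.
    public function enqueue(array $tuple)
    {
        $this->backlog[] = $tuple;
    }

    // Call once per nextTuple(): emits at most $maxPerCall queued tuples,
    // pauses briefly, and returns how many tuples were emitted.
    public function drainOnce()
    {
        $batch = array_splice($this->backlog, 0, $this->maxPerCall);
        foreach ($batch as $tuple) {
            call_user_func($this->emit, $tuple);
        }
        if ($this->idleSleepMicros > 0) {
            usleep($this->idleSleepMicros);
        }
        return count($batch);
    }

    // Number of tuples still waiting to be emitted.
    public function pending()
    {
        return count($this->backlog);
    }
}
```

In a spout like the one quoted below, the loop would call enqueue() instead of emit(), and nextTuple() would end with a single drainOnce(). Whether a cap of 500 is low enough for your setup is something you'd have to find by experiment.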
Regards,
JG

On Tue, Apr 7, 2015 at 2:48 PM, Dan DeCapria, CivicScience <[email protected]> wrote:

> I have a multilang spout in PHP running against a Storm 0.9.3 cluster. The
> spout does its initialization just fine, and nextTuple() is called just
> fine on the standard 1ms cadence. Everything has been working well, until a
> new use case became apparent.
>
> Within nextTuple(), I have a for-loop which emits M 3-tuples of three
> integer values, to a well-defined stream in the topology. If M is small or
> moderately large (say M ~ 5000), all is fine, nextTuple() completes and is
> recalled in turn.
>
> However, if M >~ 7000, the spout emits ~6800 tuples; all of those tuples
> are acked by the downstream bolts (single bolt, high parallelism,
> shuffled), *but* the spout never acknowledges the ack. It's as if
> everything is hung, and nextTuple() is never able to be called again.
> Furthermore, the remaining ~200 tuples in the loop are never emitted
> downstream. The logger at the very end of the nextTuple() method is never
> invoked. This behavior appears to be independent of max spout pending set
> size (from 10 to 10000 attempted, for varying sizes of M, still M > 7000
> fails).
>
> Modified representation of Use Case:
>
> // override
> protected function nextTuple() {
>     $this->logger->info("nextTuple(): called");
>     $M; $s; $c; $d; // defined elsewhere, but well-formed
>     foreach ($M AS $a => $b) {
>         $messageId = (string)UUID::generate(); // type4 random uuid
>         $tuple = array((int)$c, (int)$d, (int)$b);
>         $this->emit($tuple, $messageId, $s);
>     }
>     $this->logger->info("nextTuple(): finished");
> }
>
> If |$M| == 5000, 'finished' is outputted. If |$M| >= 7000, 'finished' is
> never outputted. Could this be a Tx/Rx buffer setting / some other
> configuration parameter as it relates to emit()?
>
> Any insight is appreciated.
>
> Thanks,
> -Dan

--
Javier González Nicolini
