A fix for STORM-120 would be greatly appreciated. It's making it impossible to increase tasks/executors beyond 1 when there is a downstream shuffle grouping.
I'm not sure why there haven't been more reports of problems with it. Two possibilities I can think of are that we are using exclusively shell components--perhaps there's a root-cause bug in those component classes--and that we are dealing with a high-volume stream of large tuples (thousands per second, KB in size).

On Thu, Mar 20, 2014 at 2:14 PM, P. Taylor Goetz <[email protected]> wrote:

> Never mind... just found it.
>
> On Mar 20, 2014, at 5:09 PM, P. Taylor Goetz <[email protected]> wrote:
>
> > Derek do you have an idea for a fix?
> >
> > On Mar 20, 2014, at 3:43 PM, Derek Dagit <[email protected]> wrote:
> >
> >>> As I said above, this fix is the most important in my opinion.
> >>> STORM-259 (Random#nextInt) is new to me -- can't say whether it's as
> >>> important as STORM-187 or not.
> >>
> >> Yeah, we found it recently, and I created it this morning after reading Taylor's mail.
> >>
> >> STORM-187 can be a problem with fewer than 30 retries (likelihood depends on configuration), but we will hit STORM-259 when retries exceeds 30.
> >>
> >> --
> >> Derek
> >>
> >> On 3/20/14, 14:18, Michael G. Noll wrote:
> >>> On my side the most important change is, as you point out, STORM-187.
> >>> The primary reason is like Adam Lewis is pointing out because it's a
> >>> stability problem. The secondary aspect is that this issue taints the
> >>> new Netty backend, and at least IMHO the faster Storm could confidently
> >>> bury ZeroMQ the better. :-)
> >>>
> >>> As I said above, this fix is the most important in my opinion.
> >>> STORM-259 (Random#nextInt) is new to me -- can't say whether it's as
> >>> important as STORM-187 or not.
> >>>
> >>> Switching to my non-essential wishlist I'd also +1 STORM-252 (Upgrade
> >>> Curator and thus ZooKeeper to 3.4.5). We have been running ZK 3.4.5
> >>> anyway for a couple of reasons, and it would be nice to have official
> >>> Storm support for the latest ZK version (ok, the recently released ZK
> >>> 3.4.6 is actually the latest but hey). Although I don't know how
> >>> confident we are that the code in STORM-252 actually works, i.e. whether
> >>> integrating STORM-252 into 0.9.2 on such short notice would be jumping
> >>> the gun or a safe move.
> >>>
> >>> Btw, in terms of Storm/Kafka integration Kafka is in the same boat:
> >>> it's built against ZK 3.3.x, and LinkedIn recommends the use of ZK 3.3.4
> >>> in the docs. There's an open ticket KAFKA-854 [1] that's basically the
> >>> equivalent of STORM-252, but I'm not sure how actively the Kafka team is
> >>> working on that.
> >>>
> >>> Best,
> >>> Michael
> >>>
> >>> [1] https://issues.apache.org/jira/browse/KAFKA-854
> >>>
> >>> On 03/20/2014 02:33 AM, P. Taylor Goetz wrote:
> >>>> I'd like to get this discussion started, largely because the "negative timeout" bug (STORM-187) really bothers me. I've not seen it in the wild, but I've heard of a few cases where it was enough to hinder upgrading.
> >>>>
> >>>> HEAD looks good to me at the moment, with the major difference being the zookeeper update and the patch mentioned above.
> >>>>
> >>>> Any thoughts on other PRs or patches to include?
> >>>>
> >>>> -Taylor
> >>>
> >

--
Patrick Lucas
