[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-25 Thread ptgoetz
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-76072022 Patch ported and merged to 0.9.x branch. Benchmarked several topologies (both core and Trident) before (0.9.3) and after (0.9.4-SNAPSHOT) this patch and found no

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-25 Thread 3in
Github user 3in commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-76130337 @ptgoetz If you have the resources, testing an RC and reporting back is the best way to accelerate a release. - where do i get the rc, i will have a try,

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-24 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75713928 Oh, Taylor. Could you also update STORM-404 and STORM-510 as appropriate? --- If your project is set up for it, you can reply to this email and have your reply appear on

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-24 Thread ptgoetz
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75891084 @danielschonfeld Soon. ;) We are probably a handful of weeks out, hopefully less. If you have the resources, testing an RC and reporting back is the best

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-24 Thread danielschonfeld
Github user danielschonfeld commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75878755 @ptgoetz sorry for being the newbie here, but does this mean a new jar will be built of 0.9.3 that will include this? or will we have to wait for 0.10.0?

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-24 Thread ptgoetz
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75867311 I finally got this successfully back ported to the 0.9.x branch, with all tests passing. I will merge that soon after more testing and update all associated JIRAs.

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-24 Thread ptgoetz
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75879324 @miguno yep, that's what I meant. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-24 Thread danielschonfeld
Github user danielschonfeld commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75881077 @ptgoetz what is the time frame for 0.9.4? (roughly) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-24 Thread ptgoetz
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75879629 @danielschonfeld No worries. There will be a 0.9.4 release that includes this fix. No need to wait for 0.10.0. --- If your project is set up for it, you can reply to

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-23 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75634123 Phew. :-) Thanks for merging, Taylor! On 23.02.2015, at 22:06, P. Taylor Goetz notificati...@github.com wrote: Disregard last

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/429 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-23 Thread ptgoetz
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-75631663 Disregard last message. It was a merge mistake (picked right when I should have picked left). All tests are passing now. --- If your project is set up for it,

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-18 Thread nathanmarz
Github user nathanmarz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74914806 Nimbus only knows a worker is having trouble when it stops sending heartbeats. If a worker gets into a bad state, the worst thing to do is have it continue trying to

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-18 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74927415 @nathanmarz Thanks for the detailed feedback on the max-retries issue. As Bobby suggested, would you mind if we decouple the work on max-retries (tracked at STORM-677)

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-18 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74894542 FYI: I created [STORM-677: Maximum retries strategy may cause data loss](https://issues.apache.org/jira/browse/STORM-677) to address the issue that Bobby brought up in

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-18 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74895041 PS: We may also want to update the original [STORM-329](https://issues.apache.org/jira/browse/STORM-329) ticket description to reflect the changes in this PR. --- If

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-18 Thread ptgoetz
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74885134 +1 I was able to verify the fix, and am in favor of merging. I'd also like to apply it to the 0.9.x branch as I feel it's an important fix. --- If your project is set up

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-18 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74898690 Thanks, Taylor! Let me know if I can help with sorting out the test failures. Also regarding JIRA: I forgot to mention that it looks like we need to update

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-17 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74702215 And FWIW, with the code in this PR the total test suite takes about 5mins to complete. ``` $ mvn clean install ... [INFO]

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-17 Thread revans2
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/429#discussion_r24824540 --- Diff: storm-core/src/jvm/backtype/storm/messaging/netty/Client.java --- @@ -42,344 +42,577 @@ import org.slf4j.LoggerFactory; import

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-17 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74701138 I am seeing a lot of tests timing out with this change. Has anyone else seen this? Hmm. All the tests are passing for me (and they have been since a while).

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-17 Thread revans2
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74702976 I don't it could just be my mac acting funny. I'll dig into it, but don't block the pull request on me. I saw similar things on a different pull request that corrected

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-17 Thread revans2
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74696132 I am +1 for the code as is. Perhaps a separate JIRA for the reconnection attempts would be best. The issue has been in for quite a while now, and this code was not

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-17 Thread revans2
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74707726 OK It is something odd with my mac. It looks like the difference is wired vs wireless networking :), or possibly even switching between the two. --- If your project is

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-16 Thread miguno
Github user miguno commented on a diff in the pull request: https://github.com/apache/storm/pull/429#discussion_r24738502 --- Diff: storm-core/src/jvm/backtype/storm/messaging/netty/Client.java --- @@ -42,344 +42,577 @@ import org.slf4j.LoggerFactory; import

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-13 Thread revans2
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/429#discussion_r24680701 --- Diff: storm-core/src/jvm/backtype/storm/messaging/netty/Client.java --- @@ -42,344 +42,577 @@ import org.slf4j.LoggerFactory; import

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread danielschonfeld
Github user danielschonfeld commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74130344 Doesn't dropping the messages coming from a non ack/fail caring spout negate the 'at least once' attempt of storm? I mean doesn't that kinda force you to make all

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74147926 If you need at-least-once processing you must use an acking topology, which will allow Storm to replay lost messages. If instead you go with an unacking topology (= no

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread nathanmarz
Github user nathanmarz commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74036235 I retract my earlier -1. It was mentioned that this enables backpressure for unacked topologies. Is this the case? If so, this is a great new feature of Storm and

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread clockfly
Github user clockfly commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74084548 +1 On Thu, Feb 12, 2015 at 6:44 PM, Michael G. Noll notificati...@github.com wrote: Thanks for your feedback, Nathan. As far as I

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74032874 This patch allows a worker to properly detect that the connection to a peer becomes unavailable -- for whatever reason (the remote worker is dead or restarting, there was

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-11 Thread clockfly
Github user clockfly commented on the pull request: https://github.com/apache/storm/pull/428#issuecomment-73925561 Something like git rebase -i upstream/master --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-11 Thread miguno
GitHub user miguno opened a pull request: https://github.com/apache/storm/pull/428 STORM-329: fix cascading Storm failure by improving reconnection strategy and buffering messages This is an improved version of the original pull request discussed at

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-11 Thread clockfly
Github user clockfly commented on the pull request: https://github.com/apache/storm/pull/428#issuecomment-73932513 I see, there are multiple remote-merging, which then make rebase impossible. How about create a patch file against master and then apply the patch with new

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-11 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/428#issuecomment-73946399 I did exactly this in #429. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-11 Thread miguno
Github user miguno closed the pull request at: https://github.com/apache/storm/pull/428 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-11 Thread miguno
Github user miguno commented on the pull request: https://github.com/apache/storm/pull/428#issuecomment-73928303 I tried rebasing (also to fix the incorrect commit message that starts with STORM-32*7*) but gave up after several failed attempts. Feel free to give it a try though --

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-11 Thread miguno
GitHub user miguno opened a pull request: https://github.com/apache/storm/pull/429 STORM-329: fix cascading Storm failure by improving reconnection strategy and buffering messages **This PR contains the same code as https://github.com/apache/storm/pull/428 but as a single commit

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-11 Thread danielschonfeld
Github user danielschonfeld commented on the pull request: https://github.com/apache/storm/pull/429#issuecomment-74026208 I admit to not understanding all the intricacies of the code as i'm still coming to terms with the different parts of storm. However, does this PR handle the