GitHub user vjagadish1989 opened a pull request:

    https://github.com/apache/samza/pull/243

    SAMZA-1359: Handle phantom container notifications cleanly during an RM fail-over

    1. Made our container handling logic resilient to phantom notifications.
    2. Added a new metric to Samza's ContainerProcessManager module that tracks the number of such invalid notifications.
    3. Added a couple of tests that simulate the exact scenario we encountered during the cluster upgrade (container starts -> container fails -> legitimate notification for the failure -> container re-start -> RM fail-over -> phantom notification with a different exit code).
    4. As an aside, a whole bunch of tests in ContainerProcessManager relied on Thread.sleep to ensure that threads run in a certain order. Removed this non-determinism and made the tests predictable.
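
    The handling described in items 1 to 3 can be illustrated with a minimal sketch. This is not the code from the patch: the class `PhantomAwareTracker`, its method names, and the container IDs and exit codes used with it are all hypothetical, chosen only to show the idea of ignoring a completion notification for a container that is no longer tracked and counting it in a metric.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, NOT the actual ContainerProcessManager implementation.
class PhantomAwareTracker {
    // IDs of containers currently believed to be running.
    private final Set<String> running = new HashSet<>();
    // Number of completion notifications received for untracked containers.
    private final AtomicInteger invalidNotifications = new AtomicInteger();
    // Number of tracked containers that completed with a non-zero exit code.
    private final AtomicInteger failedContainers = new AtomicInteger();

    synchronized void onContainerStarted(String containerId) {
        running.add(containerId);
    }

    /**
     * Handles a completion notification.
     * Returns true if it was acted on, false if it was ignored as a phantom.
     */
    synchronized boolean onContainerCompleted(String containerId, int exitCode) {
        // A container that is not tracked has either already been handled or
        // was never started by us, so the notification is treated as a phantom.
        if (!running.remove(containerId)) {
            invalidNotifications.incrementAndGet();
            return false;
        }
        if (exitCode != 0) {
            failedContainers.incrementAndGet();
        }
        return true;
    }

    int invalidNotificationCount() {
        return invalidNotifications.get();
    }

    int failedContainerCount() {
        return failedContainers.get();
    }

    synchronized int runningCount() {
        return running.size();
    }
}
```

    Replaying the scenario from item 3 against this sketch: start one container, report its failure once (acted on), start a replacement, then report the first container's completion again with a different exit code. The second report is ignored, the invalid-notification count becomes 1, and the replacement stays tracked as running.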


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vjagadish1989/samza am-bug

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #243
    
----
commit 8ecf9dd847cea627aa788def871291d663ad9062
Author: Jagadish Venkatraman <jvenk...@jvenkatr-mn2.linkedin.biz>
Date:   2017-07-13T18:32:27Z

    changes for phantom notification

commit 262f420104aea69099e03b8d0ae33883f463088f
Author: Jagadish Venkatraman <jvenk...@jvenkatr-mn2.linkedin.biz>
Date:   2017-07-13T18:57:47Z

    fix bug

commit 8646d729691468a714b819d2c15dc62b9f3f8b16
Author: Jagadish Venkatraman <jvenk...@jvenkatr-mn2.linkedin.biz>
Date:   2017-07-13T18:59:33Z

    bug fixes

commit 8ed7a3662376bb466679d10f121857bcc3ee1a30
Author: Jagadish Venkatraman <jvenk...@jvenkatr-mn2.linkedin.biz>
Date:   2017-07-13T19:08:02Z

    fix typos

commit beb8b0650b7c68748d2369a41a0e149c8a31667b
Author: Jagadish Venkatraman <jvenk...@jvenkatr-mn2.linkedin.biz>
Date:   2017-07-14T17:09:06Z

    typos, and java doc fixes

----


