Zombie writers protection

2016-02-10 Thread John Dennison
Greetings, I have a general design question that I did not see addressed in the docs. Basically, how does Samza guarantee a single writer for each changelog partition? Because of the strong ordering assumption of these changelogs, how do you protect against zombie processes writing to the changelog with out

Re: Zombie writers protection

2016-02-10 Thread Rick Mangi
Security wouldn’t stop zombie processes from writing to Kafka. I had this problem with YARN before, where the container thought it was killing jobs but they never actually died, and in fact continued to write to Kafka. > On Feb 10, 2016, at 4:23 PM, Jagadish Venkatraman

Re: Zombie writers protection

2016-02-10 Thread Jagadish Venkatraman
Hi John, Currently there is no authorization on who writes to Kafka. There is a Kafka security proposal that the Kafka community is working on: https://cwiki.apache.org/confluence/display/KAFKA/Security Building this into Samza may entail expensive coordination (to prevent other jobs). Since,

Re: Zombie writers protection

2016-02-10 Thread Jacob Maes
Hey Rick, If I understand your question, the goal is really to make sure there are no orphaned containers that continue to run "off the books". The newly added SAMZA-871 describes a heartbeat mechanism to make sure orphaned containers actually get killed. Also, the YARN Node Manager Restart
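
As a rough illustration of the heartbeat idea mentioned above, here is a minimal Python sketch. It is not taken from SAMZA-871 or from Samza's code; the names `HeartbeatGuard` and `write_if_alive`, the timeout value, and the list standing in for a changelog are all hypothetical. It assumes a container records each acknowledged heartbeat from its coordinator and refuses to write once the last acknowledgement is older than a timeout.

```python
import time


class HeartbeatGuard:
    """Hypothetical helper: tracks the last acknowledged heartbeat."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock              # injectable so the logic can be tested
        self.last_ack = clock()         # treat start-up as the first ack

    def record_ack(self):
        """Call whenever the coordinator acknowledges a heartbeat."""
        self.last_ack = self.clock()

    def is_orphaned(self):
        """True once no ack has arrived within the timeout."""
        return self.clock() - self.last_ack > self.timeout_s


def write_if_alive(guard, changelog, record):
    """Append to the (stand-in) changelog only while the heartbeat is fresh."""
    if guard.is_orphaned():
        return False                    # a real container would shut down here
    changelog.append(record)
    return True


if __name__ == "__main__":
    guard = HeartbeatGuard(timeout_s=5.0)
    log = []
    print(write_if_alive(guard, log, "state-update-1"), log)
```

A check like this only narrows the window for a zombie write: a process can still pass the check and then stall before the write completes, so it does not by itself guarantee a single writer.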

Re: Zombie writers protection

2016-02-10 Thread John Dennison
To second Rick's point: it's less about malicious actors and more about containers thought to be lost due to a network partition popping up later and starting to write to the changelog. I assume from Rick's response that YARN is responsible for ensuring only one version of each container is running and

Re: Zombie writers protection

2016-02-10 Thread Rick Mangi
Jake, Not my question, I was just adding my 2 cents :) John, it’s not that YARN is responsible for maintaining 1 instance of each container; Samza has an abstract management layer that defers this to YARN, but some people bypass YARN altogether and manage their containers themselves or run

Re: Zombie writers protection

2016-02-10 Thread Yi Pan
Hi, Rick and John, Thanks for the great discussion! As Jacob said, we realized the possible drawbacks of relying solely on YARN for process liveness detection as well, and that's why SAMZA-871 was opened. Please help to comment on the JIRA so that we can track the discussion and move the design