Hey Rick,

If I understand your question, the goal is really to make sure there are no orphaned containers that continue to run "off the books".
The newly added SAMZA-871 describes a heartbeat mechanism to make sure orphaned containers actually get killed. Also, the YARN Node Manager Restart capability might help. We're in the process of testing this at LinkedIn:
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html

-Jake

On Wed, Feb 10, 2016 at 1:42 PM, John Dennison <dennison.j...@gmail.com> wrote:

> To second Rick's point: it's less about malicious actors and more about
> containers thought to be lost due to a network partition popping up later
> and starting to write to the changelog. I assume from Rick's response that
> YARN is responsible for ensuring only one version of each container is
> running and Samza has nothing internal to deal with this.
>
> I guess you could hijack Kafka's auth framework to block old zombie
> containers from writing. Use some global lock's incrementing token as the
> password. A zombie process would auth with an old token and be denied. I
> haven't looked, but I imagine that the 0.9.0 auth framework isn't done on a
> partition level.
>
> On Wed, Feb 10, 2016 at 2:27 PM, Rick Mangi <r...@chartbeat.com> wrote:
>
> > Security wouldn’t stop zombie processes from writing to Kafka. I had this
> > problem with YARN before where the container thought it was killing jobs
> > but they never actually died, and in fact continued to write to Kafka.
> >
> > > On Feb 10, 2016, at 4:23 PM, Jagadish Venkatraman
> > > <jagadish1...@gmail.com> wrote:
> > >
> > > Hi John
> > >
> > > Currently there is no authorization on who writes to Kafka. There is a
> > > Kafka security proposal that the Kafka community is working on.
> > > https://cwiki.apache.org/confluence/display/KAFKA/Security
> > >
> > > Building this into Samza may entail expensive coordination (to prevent
> > > other jobs). Since jobs are usually run in a trusted environment, I've
> > > not seen people requesting this use-case. Even if we did build this into
> > > Samza, nothing stops people from writing to that Kafka topic by bypassing
> > > Samza completely (through the Kafka producer or an external library).
> > >
> > > I'd think Kafka would build support for authorization, principals, roles
> > > etc. in the future and Samza can leverage it once it's done.
> > >
> > > Thoughts?
> > >
> > > On Wednesday, February 10, 2016, John Dennison <dennison.j...@gmail.com>
> > > wrote:
> > >
> > >> Greetings,
> > >>
> > >> I have a general design question I did not see addressed in the docs.
> > >> Basically, how does Samza guarantee a single writer for each changelog
> > >> partition? Because of the strong ordering assumption of these
> > >> changelogs, how do you protect against zombie processes writing to the
> > >> changelog with out-of-date values?
> > >>
> > >> Thanks,
> > >>
> > >> John
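
For illustration only, the incrementing-token idea John floats in the thread above (a global lock hands out an increasing token, and a writer presenting an old token is denied) could be sketched roughly as follows. This is a minimal in-memory sketch, not a Kafka or Samza API: `TokenIssuer`, `FencedChangelog`, and `StaleTokenError` are hypothetical names invented here, and a real deployment would need the check enforced on the broker side.

```python
class StaleTokenError(Exception):
    """Raised when a writer presents a token older than the newest one seen."""


class TokenIssuer:
    """Hypothetical stand-in for the global lock: issues increasing tokens."""

    def __init__(self):
        self._last = 0

    def acquire(self):
        # Each new holder of the lock gets a strictly larger token.
        self._last += 1
        return self._last


class FencedChangelog:
    """Hypothetical stand-in for one changelog partition that checks tokens."""

    def __init__(self):
        self._highest_token = 0
        self.entries = []

    def append(self, token, key, value):
        # Reject any writer whose token is older than the newest one seen.
        if token < self._highest_token:
            raise StaleTokenError(f"token {token} < {self._highest_token}")
        self._highest_token = token
        self.entries.append((key, value))


issuer = TokenIssuer()
log = FencedChangelog()

old_token = issuer.acquire()        # original container
log.append(old_token, "count", 1)

new_token = issuer.acquire()        # replacement after the first is presumed lost
log.append(new_token, "count", 2)

try:
    log.append(old_token, "count", 99)   # zombie reappears with its old token
    zombie_rejected = False
except StaleTokenError:
    zombie_rejected = True
```

Note that in this sketch the zombie is only fenced off once the replacement writer has presented its newer token at least once; until then the log has no way to know the old token is stale.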