Hey Rick,

If I understand your question, the goal is really to make sure there are no
orphaned containers that continue to run "off the books".

The newly added SAMZA-871 describes a heartbeat mechanism to make sure
orphaned containers actually get killed.
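
To make the heartbeat idea concrete, here is a rough sketch of a
container-side watchdog. It is only an illustration under my own assumptions,
not the design in SAMZA-871: the class name, the timeout, and the injectable
clock are all hypothetical. The container records each heartbeat
acknowledgement it gets from the coordinator and shuts itself down once it
has heard nothing for longer than the timeout.

```python
import time


class HeartbeatWatchdog:
    """Hypothetical container-side watchdog (not Samza's actual API).

    Tracks the last time the coordinator acknowledged a heartbeat and
    reports when the container should stop because it may be orphaned.
    """

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s   # assumed max silence before self-termination
        self.clock = clock           # injectable so the logic can be tested
        self.last_ack = clock()      # treat start-up as the first acknowledgement

    def record_ack(self):
        """Call whenever the coordinator acknowledges a heartbeat."""
        self.last_ack = self.clock()

    def should_shut_down(self):
        """True once the coordinator has been silent for longer than the timeout."""
        return self.clock() - self.last_ack > self.timeout_s
```

A container's main loop would check should_shut_down() before each write to
the changelog and exit instead of writing when it returns True.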

Also, the YARN Node Manager Restart capability might help. We're in the
process of testing this at LinkedIn:
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html

-Jake

On Wed, Feb 10, 2016 at 1:42 PM, John Dennison <dennison.j...@gmail.com>
wrote:

> To second Rick's point: it's less about malicious actors and more about
> containers thought to be lost due to a network partition popping up later
> and starting to write to the changelog. I assume from Rick's response that
> YARN is responsible for ensuring only one version of each container is
> running, and Samza has nothing internal to deal with this.
>
> I guess you could hijack Kafka's auth framework to block old zombie
> containers from writing. Use some global lock's incrementing token as the
> password. A zombie process would auth with an old token and be denied. I
> haven't looked, but I imagine that the 0.9.0 auth framework isn't done on a
> partition level.
>
> On Wed, Feb 10, 2016 at 2:27 PM, Rick Mangi <r...@chartbeat.com> wrote:
>
> > Security wouldn’t stop zombie processes from writing to Kafka. I had this
> > problem with YARN before, where the container thought it was killing jobs
> > but they never actually died, and in fact continued to write to Kafka.
> >
> >
> > > On Feb 10, 2016, at 4:23 PM, Jagadish Venkatraman <jagadish1...@gmail.com> wrote:
> > >
> > > Hi John
> > >
> > > Currently there is no authorization on who writes to Kafka. There is a
> > > Kafka security proposal that the Kafka community is working on.
> > > https://cwiki.apache.org/confluence/display/KAFKA/Security
> > >
> > > Building this into Samza may entail expensive coordination (to prevent
> > > other jobs from writing). Since jobs are usually run in a trusted
> > > environment, I've not seen people requesting this use case. Even if we
> > > did build this into Samza, nothing stops people from writing to that
> > > Kafka topic by bypassing Samza completely (through the Kafka producer
> > > or an external library).
> > >
> > > I'd think Kafka would build support for authorization, principals,
> > > roles, etc. in the future, and Samza can leverage it once it's done.
> > >
> > > Thoughts?
> > >
> > > On Wednesday, February 10, 2016, John Dennison <dennison.j...@gmail.com>
> > > wrote:
> > >
> > >> Greetings,
> > >>
> > >> I have a general design question I did not see addressed in the docs.
> > >> Basically, how does Samza guarantee a single writer for each changelog
> > >> partition? Because of the strong ordering assumption of these
> > >> changelogs, how do you protect against zombie processes writing to the
> > >> changelog with out-of-date values?
> > >>
> > >> Thanks,
> > >>
> > >> John
> > >>
> >
> >
>
