Re: [DISCUSS] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

Sönke Liebau Tue, 21 Apr 2020 14:37:06 -0700

Hi Boyang,

I think what Tom is referring to is that it is very hard to forward enough
information to the controller to put it into a position to properly
authenticate any request.


While the Default KafkaPrincipal can easily be serialized and sent to the
controller, as previously seen, those are just strings. For the Controller
to properly authenticate a request we'd need to forward the
AuthenticationContext (from which the Principal is built [1]) containing
the SSL/SASL details to the controller, in order for the controller to then
check certificates for validity etc.
And those checks will be very difficult, because what we are effectively
doing here is a man-in-the-middle attack (broadly speaking), as we are
forwarding a request "in the name of" someone else. And most authentication
methods have been built to prevent exactly that.
As soon as we have only the Principal we are trusting someone else to have
properly authenticated that principal, because we do not have all the
information to do that verification ourselves. And if we do that, then I
don't see why we should a

Regarding custom Principals, I don't think too many people do this in
practice, but in theory you can provide you own PrincipalBuilder and use
your own Principal objects that contain as much additional information as
you wish. And since these can basically be any Java object that makes them
very tough to serialize.
Currently these Principals don't need to be serialized, because they are
created and used within the same JVM, there is no need to forward them to a
different broker.
I wrote a blog post [2] about a scenario that uses a custom Principal a
little while ago, that shows a possible scenario, maybe that helps a little.

Feel free to correct me if I misinterpreted your meaning Tom :)

Best regards,
Sönke

[1] https://imgur.com/a/Gi0cFNH
[2]
https://www.opencore.com/de/blog/2018/3/group-based-authorization-in-kafka/

On Tue, 21 Apr 2020 at 20:33, Boyang Chen <[email protected]>
wrote:

> Hey Tom,
>
> I agree with the claim here. All the brokers should have the same
> authentication power, which means getting the forwarding broker verify the
> client request first is more favorable. This approach avoids sending one
> unnecessary forwarding request if it couldn't pass the authorization in the
> first place.
>
> In the meantime, could you give more context on the custom Kafka principal
> you are referring to? How does that get encoded today, and how server and
> client could both agree on the serialization? As the plain principal is
> only a String, I would like to know more about the security strategy people
> are using, thanks!
>
> Boyang
>
> On Tue, Apr 21, 2020 at 2:24 AM Tom Bentley <[email protected]> wrote:
>
> > Hi Boyang,
> >
> > The answer to my original question about the request principal was that
> the
> > forwarding broker would authorize the request and the controller would
> > trust the request since it was from another broker. AFAIU you added the
> > principal purely for logging purposes. In the "EnvelopeRequest Handling"
> > section the KIP now says "Once that part is done, we shall replace the
> > request context with Principal information embedded inside the
> > EnvelopeRequest to complete the inner request permission check.", which
> > sounds to me like the controller is now authorizing the request (maybe in
> > addition to the forwarding broker) using a principal it's deserialized
> from
> > the EnvelopeRequest. I don't think that works if a custom principal
> builder
> > is returning a subclass of KafkaPrincipal (the Javadoc for KafkaPrincipal
> > describes the contract I'm talking about). Basically the controller would
> > not be able to instantiate the subclass (even if that was included in the
> > envelope) because it wouldn't necessarily know the signature of the
> > constructor. Nor can it use the principal builder itself because it
> doesn't
> > have access to the original AuthenticationContext. Maybe you figure out
> > some way to make it work, otherwise I think the best you can do is to
> > revert to the model you had before where the controller trusts the
> embedded
> > request because it's been received from a broker.
> >
> > Cheers,
> >
> > Tom
> >
> > On Sat, Apr 18, 2020 at 8:56 PM Colin McCabe <[email protected]> wrote:
> >
> > > On Fri, Apr 17, 2020, at 13:11, Ismael Juma wrote:
> > > > Hi Colin,
> > > >
> > > > The read/modify/write is protected by the zk version, right?
> > > >
> > > > Ismael
> > >
> > > No, we don't use the ZK version when doing the write to the config
> > > znodes.  We do for ACLs, I think.
> > >
> > > This is something that we could fix just by using the ZK version, but
> > > there are other race conditions like what if we're deleting a topic
> while
> > > setting this config, etc.  A single writer is a lot easier to reason
> > about.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > >
> > > > On Fri, Apr 17, 2020 at 12:53 PM Colin McCabe <[email protected]>
> > > wrote:
> > > >
> > > > > On Thu, Apr 16, 2020, at 08:51, Ismael Juma wrote:
> > > > > > I don't think these requests are necessarily infrequent under
> multi
> > > > > tenant
> > > > > > environments though. I've seen Controller availability being an
> > > issue for
> > > > > > describe topics for example (before it was changed to go to any
> > > broker).
> > > > >
> > > > > Hi Ismael,
> > > > >
> > > > > I don't think DescribeTopics is a good comparison.  That RPC is
> > > available
> > > > > to regular users and is used many orders of magnitude more
> frequently
> > > than
> > > > > administrative operations like changing ACLs or setting quotas.
> > > > >
> > > > > The operations we're talking about redirecting here all require the
> > > > > highest possible permissions and will not be frequent in any
> > real-world
> > > > > cluster... unless someone is running a stress-test or a benchmark.
> > We
> > > > > didn't even notice some of the serious bugs in setting dynamic
> > configs
> > > > > until recently because the alterConfigs / incrementalAlterConfigs
> > RPCs
> > > are
> > > > > so infrequently called.
> > > > >
> > > > > Additionally, this KIP fixes some existing bugs.  The current
> > approach
> > > of
> > > > > having random writers do a read-write-modify cycle on a
> configuration
> > > znode
> > > > > is buggy since it could be interleaved with another node's
> > read-modify
> > > > > write cycle.  It has a "lost updates" problem.
> > > > >
> > > > > For example, node 1 reads a config znode.  Node 2 reads the same
> > config
> > > > > znode.  Node 1 writes back a modified version of the znode.  Node 2
> > > writes
> > > > > back its (differently) modified version, overwriting the changes
> from
> > > node
> > > > > 1.
> > > > >
> > > > > I don't think anyone ever noticed this problem since, again, these
> > > > > operations are very infrequent, making the chance of such a
> collision
> > > low.
> > > > > But it is a serious bug that is fixed by having a single writer.
> (We
> > > > > should add this to the KIP...)
> > > > >
> > > > > >
> > > > > > Would it be better to redirect once the controller quorum is
> there?
> > > > >
> > > > > This KIP is needed for the bridge release.  The bridge release
> > upgrade
> > > > > process relies on the old nodes sending their administrative
> > > operations to
> > > > > the controller quorum, not directly to zookeeper.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > >
> > > > > >
> > > > > > Note that this is different from things like AlterIsr since these
> > > calls
> > > > > are
> > > > > > coming from clients versus other brokers.
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Wed, Apr 15, 2020, 5:10 PM Colin McCabe <[email protected]>
> > > wrote:
> > > > > >
> > > > > > > Hi Ismael,
> > > > > > >
> > > > > > > I agree that sending these requests through the controller will
> > not
> > > > > work
> > > > > > > during the periods when there is no controller.  However, those
> > > periods
> > > > > > > should be short-- otherwise we have bigger problems in the
> > cluster.
> > > > > > >
> > > > > > > These requests are very infrequent because they are
> > administrative
> > > > > > > operations.  Basically the affected operations are changing
> ACLs,
> > > > > changing
> > > > > > > dynamic configurations, and changing quotas.
> > > > > > >
> > > > > > > best,
> > > > > > > Colin
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Apr 15, 2020, at 15:25, Ismael Juma wrote:
> > > > > > > > Hi Boyang,
> > > > > > > >
> > > > > > > > Thanks for the KIP. Have we considered that this reduces
> > > > > availability for
> > > > > > > > these operations since we have a single Controller instead of
> > > the ZK
> > > > > > > quorum?
> > > > > > > >
> > > > > > > > Ismael
> > > > > > > >
> > > > > > > > On Fri, Apr 3, 2020 at 4:45 PM Boyang Chen <
> > > > > [email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey all,
> > > > > > > > >
> > > > > > > > > I would like to start off the discussion for KIP-590, a
> > > follow-up
> > > > > > > > > initiative after KIP-500:
> > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-590%3A+Redirect+Zookeeper+Mutation+Protocols+to+The+Controller
> > > > > > > > >
> > > > > > > > > This KIP proposes to migrate existing Zookeeper mutation
> > paths,
> > > > > > > including
> > > > > > > > > configuration, security and quota changes, to
> controller-only
> > > by
> > > > > always
> > > > > > > > > routing these alterations to the controller.
> > > > > > > > >
> > > > > > > > > Let me know your thoughts!
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Boyang
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> >
>

Re: [DISCUSS] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

Reply via email to