Re: BatchWriter Auth/Audit Question

Mike Drob Fri, 19 Aug 2016 13:00:11 -0700

I tried out what Marc suggested, and after I removed the authenticateUser
line from my tservers and my write throughput DOUBLED. This does not seem
like a minor detail here.


In the tserver logs, I still see a bunch of "2016-08-19 12:28:11,621
[Audit   ] INFO : operation: permitted; user: mdrob; client: [host:port];
action: authenticate;" which I think corresponds to the authenticate checks
from canWrite, so it's not like any user is allowed to do just write random
data.
A well-behaved client would fail while setting up the Connector object. A
malicious client that's trying to go directly via RPC would fail at the
canWrite check.

I missed the part about canPerformSystemActions throwing an exception via
authenticate when it fails, so we'd never get to the self-equality check.
That's a good point, but I don't think that code is ever executed during
the write path.


Mike



On Fri, Aug 19, 2016 at 2:42 PM, Christopher <[email protected]> wrote:

> If you note the comment in the parent class, the implementation of
> canAskAboutUser is relying on the authentication being done in the call to
> canPerformSystemActions. If you reverse the order here, you lose that
> authentication check, and the action will be allowed simply if the user is
> equal to itself. That's no good. We need an authentication check, and we
> can't rely on the client code to check it for us.
>
> We could reverse the order, but with an explicit authentication check when
> the user is asking about itself, but then there's likely going to be a
> duplicate authentication check if that fails, because
> canPerformSystemActions will also need to do an authentication check
> (because it can also be called directly).
>
> We can probably refactor this a bit, so we have an unauthenticated
> "canPerformSystemActions" check which is shared by the authenticated
> version and the reverse-ordered canAskAboutUser implementation, but we need
> to be careful about reversing the order here and forgetting to do an
> authentication check, or unnecessarily doubling-up on the authentication
> check (though, maybe the double-check isn't that bad in the less common
> case... depends on impl performance).
>
> On Fri, Aug 19, 2016 at 2:33 PM Marc P. <[email protected]> wrote:
>
> > Sorry....by out of band, It could be done once per writer or even in the
> > connector once...but I think we're in agreement Drob, Mike.
> >
> > On Fri, Aug 19, 2016 at 2:30 PM, Marc P. <[email protected]> wrote:
> >
> > > Funny, exactly the same thing I mentioned to Josh last night. Are you
> > > watching us?
> > >
> > >  In something I"m doing now I'm making the thrift calls and modeled it
> > > after what the client code does; however,
> > > once I removed the authenticateUser my throughput increased by 25 per
> > > cent. In my trace table it was by and large the greatest audited call.
> > >
> > > I think we could change how that works and potentially keep that out of
> > > band. The fail fast is great if the user doesn't have permissions to
> that
> > > table and may save the server some work...but as it is now, I'm always
> > > authenticated and I'm causing more work for the server. Perhaps we let
> > the
> > > natural exception that's thrown stop the client ( if it's being a good
> > > client ). I think the 25 per cent improvement was well worth the
> > > alternative risk.
> > >
> > > On Fri, Aug 19, 2016 at 2:16 PM, Mike Drob <[email protected]> wrote:
> > >
> > >> Devs,
> > >>
> > >> I was recently running a 1.7.2 cluster under a heavy write workload
> and
> > >> noticed a _lot_ of audit messages showing up in the logs. After some
> > >> digging, I found that one of the causes is the following call chain:
> > >>
> > >> TabletServerBatchWriter::sendMutationsToTabletServer
> > >> calls TabletServer.ThriftClientHandler::startUpdate
> > >> calls SecurityOperation::authenticateUser <-- always return true
> > >> calls SecurityOperation::canAskAboutUser <-- always return true
> > >> calls AuditedSecurityOperation::canPerformSystemActions <-- fails;
> logs
> > >> "action: performSystemAction"
> > >> calls AuditedSecurityAction::authenticate <-- succeeds; logs "action:
> > >> authenticate;"
> > >>
> > >> There are two separate but related problems here, I think, that I want
> > to
> > >> confirm.
> > >>
> > >> When startUpdate calls security.authenticateUser(credentials,
> > >> credentials);
> > >> there is the comment that this is done to "// Make sure user is real"
> > >> presumably to fail fast. This makes sense, but does it actually work?
> In
> > >> authenticateUser and canAskAboutUser we end up comparing the two
> > >> credentials to each other and this will always succeed because they're
> > the
> > >> same object. I expect the write fails later in the TabletServer code,
> > >> possibly in the UpdateSession, but I haven't been able to untangle
> this
> > >> part yet.
> > >>
> > >> Second, I think it makes sense to switch the order of the OR in
> > >> 'canAskAboutUser.' If this is in the write path, then it seems like
> the
> > >> most common case would be users asking about themselves, so we should
> > >> check
> > >> that before checking if a user is an admin. This also has the
> advantage
> > of
> > >> removing two uninformative audit log lines. We won't have an audit
> > message
> > >> that somebody wrote data, but we don't have that currently anyway.
> > >>
> > >> I did some visual spot checking and this looks like an issue in master
> > >> also, but I haven't run that code to find out.
> > >>
> > >> Thoughts?
> > >>
> > >> Mike
> > >>
> > >
> > >
> >
>

Re: BatchWriter Auth/Audit Question

Reply via email to