I tried out what Marc suggested, and after I removed the authenticateUser line from my tservers and my write throughput DOUBLED. This does not seem like a minor detail here.
In the tserver logs, I still see a bunch of "2016-08-19 12:28:11,621 [Audit ] INFO : operation: permitted; user: mdrob; client: [host:port]; action: authenticate;" which I think corresponds to the authenticate checks from canWrite, so it's not like any user is allowed to do just write random data. A well-behaved client would fail while setting up the Connector object. A malicious client that's trying to go directly via RPC would fail at the canWrite check. I missed the part about canPerformSystemActions throwing an exception via authenticate when it fails, so we'd never get to the self-equality check. That's a good point, but I don't think that code is ever executed during the write path. Mike On Fri, Aug 19, 2016 at 2:42 PM, Christopher <[email protected]> wrote: > If you note the comment in the parent class, the implementation of > canAskAboutUser is relying on the authentication being done in the call to > canPerformSystemActions. If you reverse the order here, you lose that > authentication check, and the action will be allowed simply if the user is > equal to itself. That's no good. We need an authentication check, and we > can't rely on the client code to check it for us. > > We could reverse the order, but with an explicit authentication check when > the user is asking about itself, but then there's likely going to be a > duplicate authentication check if that fails, because > canPerformSystemActions will also need to do an authentication check > (because it can also be called directly). > > We can probably refactor this a bit, so we have an unauthenticated > "canPerformSystemActions" check which is shared by the authenticated > version and the reverse-ordered canAskAboutUser implementation, but we need > to be careful about reversing the order here and forgetting to do an > authentication check, or unnecessarily doubling-up on the authentication > check (though, maybe the double-check isn't that bad in the less common > case... depends on impl performance). > > On Fri, Aug 19, 2016 at 2:33 PM Marc P. <[email protected]> wrote: > > > Sorry....by out of band, It could be done once per writer or even in the > > connector once...but I think we're in agreement Drob, Mike. > > > > On Fri, Aug 19, 2016 at 2:30 PM, Marc P. <[email protected]> wrote: > > > > > Funny, exactly the same thing I mentioned to Josh last night. Are you > > > watching us? > > > > > > In something I"m doing now I'm making the thrift calls and modeled it > > > after what the client code does; however, > > > once I removed the authenticateUser my throughput increased by 25 per > > > cent. In my trace table it was by and large the greatest audited call. > > > > > > I think we could change how that works and potentially keep that out of > > > band. The fail fast is great if the user doesn't have permissions to > that > > > table and may save the server some work...but as it is now, I'm always > > > authenticated and I'm causing more work for the server. Perhaps we let > > the > > > natural exception that's thrown stop the client ( if it's being a good > > > client ). I think the 25 per cent improvement was well worth the > > > alternative risk. > > > > > > On Fri, Aug 19, 2016 at 2:16 PM, Mike Drob <[email protected]> wrote: > > > > > >> Devs, > > >> > > >> I was recently running a 1.7.2 cluster under a heavy write workload > and > > >> noticed a _lot_ of audit messages showing up in the logs. After some > > >> digging, I found that one of the causes is the following call chain: > > >> > > >> TabletServerBatchWriter::sendMutationsToTabletServer > > >> calls TabletServer.ThriftClientHandler::startUpdate > > >> calls SecurityOperation::authenticateUser <-- always return true > > >> calls SecurityOperation::canAskAboutUser <-- always return true > > >> calls AuditedSecurityOperation::canPerformSystemActions <-- fails; > logs > > >> "action: performSystemAction" > > >> calls AuditedSecurityAction::authenticate <-- succeeds; logs "action: > > >> authenticate;" > > >> > > >> There are two separate but related problems here, I think, that I want > > to > > >> confirm. > > >> > > >> When startUpdate calls security.authenticateUser(credentials, > > >> credentials); > > >> there is the comment that this is done to "// Make sure user is real" > > >> presumably to fail fast. This makes sense, but does it actually work? > In > > >> authenticateUser and canAskAboutUser we end up comparing the two > > >> credentials to each other and this will always succeed because they're > > the > > >> same object. I expect the write fails later in the TabletServer code, > > >> possibly in the UpdateSession, but I haven't been able to untangle > this > > >> part yet. > > >> > > >> Second, I think it makes sense to switch the order of the OR in > > >> 'canAskAboutUser.' If this is in the write path, then it seems like > the > > >> most common case would be users asking about themselves, so we should > > >> check > > >> that before checking if a user is an admin. This also has the > advantage > > of > > >> removing two uninformative audit log lines. We won't have an audit > > message > > >> that somebody wrote data, but we don't have that currently anyway. > > >> > > >> I did some visual spot checking and this looks like an issue in master > > >> also, but I haven't run that code to find out. > > >> > > >> Thoughts? > > >> > > >> Mike > > >> > > > > > > > > >
