Devs, I was recently running a 1.7.2 cluster under a heavy write workload and noticed a _lot_ of audit messages showing up in the logs. After some digging, I found that one of the causes is the following call chain:
TabletServerBatchWriter::sendMutationsToTabletServer calls TabletServer.ThriftClientHandler::startUpdate calls SecurityOperation::authenticateUser <-- always return true calls SecurityOperation::canAskAboutUser <-- always return true calls AuditedSecurityOperation::canPerformSystemActions <-- fails; logs "action: performSystemAction" calls AuditedSecurityAction::authenticate <-- succeeds; logs "action: authenticate;" There are two separate but related problems here, I think, that I want to confirm. When startUpdate calls security.authenticateUser(credentials, credentials); there is the comment that this is done to "// Make sure user is real" presumably to fail fast. This makes sense, but does it actually work? In authenticateUser and canAskAboutUser we end up comparing the two credentials to each other and this will always succeed because they're the same object. I expect the write fails later in the TabletServer code, possibly in the UpdateSession, but I haven't been able to untangle this part yet. Second, I think it makes sense to switch the order of the OR in 'canAskAboutUser.' If this is in the write path, then it seems like the most common case would be users asking about themselves, so we should check that before checking if a user is an admin. This also has the advantage of removing two uninformative audit log lines. We won't have an audit message that somebody wrote data, but we don't have that currently anyway. I did some visual spot checking and this looks like an issue in master also, but I haven't run that code to find out. Thoughts? Mike
