[
https://issues.apache.org/jira/browse/SENTRY-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tushar I updated SENTRY-1703:
-----------------------------
Attachment: stacktrace3.log.txt
stacktrace2.log.txt
stacktrace1.log.txt
kdc.log.txt
> Solr-Sentry in kerberos mode makes too many KDC requests and returns
> unauthorized on KDC timeout
> ------------------------------------------------------------------------------------------------
>
> Key: SENTRY-1703
> URL: https://issues.apache.org/jira/browse/SENTRY-1703
> Project: Sentry
> Issue Type: Bug
> Components: Solr Plugin
> Affects Versions: 1.5.1
> Reporter: Tushar I
> Attachments: kdc.log.txt, stacktrace1.log.txt, stacktrace2.log.txt,
> stacktrace3.log.txt
>
>
> Sentry Version: 1.5.1-cdh5.8.0
> We are seeing intermittent authorization failures with the Sentry Solr plugin
> in a Kerberos environment.
> 1. We are writing to Solr using the SolrJ client from within Spark jobs in a
> multi-node Spark/Hadoop cluster and frequently get authorization errors from
> Solr in individual Spark tasks saying "User XX does not have privileges for
> YYcollection", which are generated by the Solr-Sentry plugin. (The user does
> have access to the collection, and it works fine the rest of the time.)
> 2. The root cause seems to be that on every Solr call from the client, Sentry
> reaches out to KDC on behalf of solr/hostname to check if user XX has
> permission on the YYcollection, thereby drowning the KDC in tons of requests
> per second, and at some point fails on a KDC timeout, throwing the exception:
> {{org.apache.sentry.binding.solr.authz.SentrySolrAuthorizationException: User
> XX does not have privileges for YYcollection}} to the calling client.
> I haven't had enough time to investigate why Sentry is making so many KDC
> calls. Maybe it does so for each document in a batched Solr operation, or
> it logs in using the keytab each time and doesn't cache the ticket, etc.
> Caching the result of {{authProvider.hasAccess()}} in SolrAuthzBinding.java
> for a reasonably short time might not be a bad idea.
> My question in the meantime is: Are there any tuning knobs to somehow reduce
> the load on KDC, or increase the KDC request timeout value, or anything along
> these lines?
> Relevant stacktraces captured from Solr Admin are attached:
> 1. stacktrace1.log: The KDC timeout for the Sentry call
> 2. stacktrace2.log: Sentry cannot authenticate with the KDC due to #1 above
> 3. stacktrace3.log: SolrException when {{authProvider.hasAccess()}} returns
> false due to #2 above.
> Also attached is a _snippet_ from the KDC log - the full log bloats to 17 MB
> within a minute, full of messages like:
> {code}
> Apr 10 17:06:37 a0 krb5kdc[20427](info): TGS_REQ (1 etypes {23}) 10.0.0.1:
> ISSUE: authtime 1491818430, etypes {rep=23 tkt=23 ses=23},
> solr/[email protected] for sentry/[email protected]
> {code}
> This is reproducible in two separate clusters with different environments:
> CDH 5.10.1 and
> CDH 5.8.0
> Please let me know if I've left out any key information.
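
The short-lived caching of {{authProvider.hasAccess()}} results suggested above could look roughly like the sketch below. It is only an illustration and not Sentry code: the class name CachedAccessCheck, the BiPredicate standing in for the real authorization call, and the injected clock are all assumptions made for the example. It caches only positive decisions, on the assumption that a denial caused by a KDC timeout should be retried on the next request rather than remembered.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.BiPredicate;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Sentry): remembers "access granted"
 * decisions per (user, collection) pair for a short time-to-live.
 */
final class CachedAccessCheck {
    // Stand-in for the real check, e.g. a wrapper around authProvider.hasAccess().
    private final BiPredicate<String, String> delegate;
    private final long ttlNanos;
    // Monotonic nanosecond clock; injectable so the expiry logic can be tested.
    private final LongSupplier nanoClock;
    // Maps "user\0collection" to the clock value at which the grant expires.
    private final Map<String, Long> grantExpiry = new ConcurrentHashMap<>();

    CachedAccessCheck(BiPredicate<String, String> delegate,
                      long ttlMillis,
                      LongSupplier nanoClock) {
        this.delegate = delegate;
        this.ttlNanos = TimeUnit.MILLISECONDS.toNanos(ttlMillis);
        this.nanoClock = nanoClock;
    }

    /** Convenience constructor using System.nanoTime() as the clock. */
    CachedAccessCheck(BiPredicate<String, String> delegate, long ttlMillis) {
        this(delegate, ttlMillis, System::nanoTime);
    }

    boolean hasAccess(String user, String collection) {
        String key = user + '\u0000' + collection;
        long now = nanoClock.getAsLong();
        Long expiresAt = grantExpiry.get(key);
        // Subtraction keeps the comparison valid even if nanoTime wraps around.
        if (expiresAt != null && now - expiresAt < 0) {
            return true; // cached grant is still fresh; skip the expensive check
        }
        boolean allowed = delegate.test(user, collection);
        if (allowed) {
            grantExpiry.put(key, now + ttlNanos);
        } else {
            // Denials are never cached, so a transient failure is retried next time.
            grantExpiry.remove(key);
        }
        return allowed;
    }
}
```

The trade-off in this sketch is that a revoked privilege may stay effective for up to one TTL, so the TTL would need to be kept short (seconds rather than minutes). Whether such a cache would actually reduce the KDC traffic depends on where the TGS requests originate, which this issue has not yet established.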