[ https://issues.apache.org/jira/browse/SENTRY-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tushar I updated SENTRY-1703:
-----------------------------
    Attachment: solr-sentry-test-master.zip

> Solr-Sentry in kerberos mode makes too many KDC requests and returns
> unauthorized on KDC timeout
> ------------------------------------------------------------------------------------------------
>
>                 Key: SENTRY-1703
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1703
>             Project: Sentry
>          Issue Type: Bug
>          Components: Solr Plugin
>    Affects Versions: 1.5.1
>            Reporter: Tushar I
>            Priority: Blocker
>         Attachments: kdc.log.txt, solr-sentry-test-master.zip,
> stacktrace1.log.txt, stacktrace2.log.txt, stacktrace3.log.txt
>
>
> Sentry Version: 1.5.1-cdh5.8.0
> We are seeing intermittent authorization failures with the Sentry Solr
> plugin in a Kerberos environment.
> 1. We are writing to Solr using the SolrJ client from within Spark jobs in a
> multi-node Spark/Hadoop cluster and frequently get authorization errors from
> Solr in individual Spark tasks saying "User XX does not have privileges for
> YYcollection", which are generated by the Solr-Sentry plugin. (The user does
> have access to the collection, and it works fine the rest of the time.)
> 2. The root cause seems to be that on every Solr call from the client, Sentry
> reaches out to the KDC on behalf of solr/hostname, thereby drowning the KDC
> in tons of requests per second, and at some point fails on a KDC timeout,
> throwing the exception:
> {{org.apache.sentry.binding.solr.authz.SentrySolrAuthorizationException: User
> XX does not have privileges for YYcollection}} to the calling client.
> I didn't get enough time to investigate why Sentry is making so many KDC
> calls; maybe it does so for each document in a batched Solr operation, or
> it logs in using the keytab each time and doesn't cache the ticket, etc.
> Caching the result of {{authProvider.hasAccess()}} in SolrAuthzBinding.java
> for a reasonably short time might not be a bad idea.
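The caching idea suggested above could look roughly like the following. This is only a sketch under stated assumptions, not Sentry code: the class name `TtlAccessCache`, the key type, and the use of a `Predicate` to stand in for the real `authProvider.hasAccess()` call are all hypothetical, since the message does not show that method's signature. The sketch remembers only positive decisions, so a denial caused by a KDC timeout is retried on the next call instead of being cached.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/**
 * Hypothetical helper (not part of Sentry): remembers positive authorization
 * decisions for a short time so the underlying check is not repeated on
 * every request.
 */
final class TtlAccessCache<K> {
    // Maps a key (e.g. "user|collection|action") to the time its grant expires.
    private final ConcurrentHashMap<K, Long> grantedUntil = new ConcurrentHashMap<>();
    private final Predicate<K> delegate;   // assumption: wraps the real hasAccess() call
    private final long ttlNanos;
    private final LongSupplier nanoClock;  // e.g. System::nanoTime

    TtlAccessCache(Predicate<K> delegate, long ttlNanos, LongSupplier nanoClock) {
        if (ttlNanos <= 0) {
            throw new IllegalArgumentException("ttlNanos must be positive");
        }
        this.delegate = delegate;
        this.ttlNanos = ttlNanos;
        this.nanoClock = nanoClock;
    }

    boolean hasAccess(K key) {
        long now = nanoClock.getAsLong();
        Long expiry = grantedUntil.get(key);
        // Subtraction-based comparison stays correct if nanoTime wraps around.
        if (expiry != null && now - expiry < 0) {
            return true; // still-valid cached grant, no call to the delegate
        }
        boolean granted = delegate.test(key);
        if (granted) {
            grantedUntil.put(key, now + ttlNanos);
        } else {
            // Never cache a denial: it may be a transient failure (e.g. a KDC timeout).
            grantedUntil.remove(key);
        }
        return granted;
    }
}
```

A caller might construct it as `new TtlAccessCache<String>(key -> realCheck(key), TimeUnit.SECONDS.toNanos(30), System::nanoTime)`, where `realCheck` is a placeholder for whatever performs the actual authorization. The trade-off is that a revoked privilege can remain effective for up to one TTL, so the window would need to stay short.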
> My question in the meantime is: Are there any tuning knobs to somehow reduce
> the load on KDC, or increase the KDC request timeout value, or anything along
> these lines?
> Relevant stacktraces captured from Solr Admin are attached:
> 1. stacktrace1.log : The timeout from KDC for sentry call
> 2. stacktrace2.log: When Sentry cannot authenticate with KDC due to # 1 above
> 3. stacktrace3.log: SolrException when {{authProvider.hasAccess()}} returns
> false due to # 2 above.
> Also attached is a _snippet_ from the KDC log - the full log bloats to 17 MB
> within a minute, full of messages like:
> {code}
> Apr 10 17:06:37 a0 krb5kdc[20427](info): TGS_REQ (1 etypes {23}) 10.0.0.1:
> ISSUE: authtime 1491818430, etypes {rep=23 tkt=23 ses=23},
> solr/a...@realm.com for sentry/a...@realm.com
> {code}
> This is reproducible in two separate clusters with different environments:
> CDH 5.10.1 and
> CDH 5.8.0
> Please let me know if I've left out any key information.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)