[ 
https://issues.apache.org/jira/browse/SOLR-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607933#comment-16607933
 ] 

Noble Paul commented on SOLR-12514:
-----------------------------------

{quote}Can we just read the solrconfig.xml for all the collections and prepare 
this mapping at the startup? This can be a lightweight code which does not 
involve spinning up a core.
{quote}
 

Well, NO.

 

A node keeps the config details of a collection only if it hosts a replica of 
that collection. If we load all configs at startup, the node is not notified of 
later config updates, so it may hold stale data. It will also be a problem if 
the handlers are loaded from a runtime lib.

 

The solution is:

 
 * Add a custom handler that serves the permissions at 
{{<collection-name>/check-perm?path=<path>&method=<HTTP-METHOD>}}
 * When a request comes in, check whether it is served by the node
 * If not, identify a node that serves that collection and hit the permissions 
path
 * Respond appropriately
 * Cache the details with a TTL of, say, 5 seconds to avoid too many requests
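
A rough sketch of the caching step from the list above. It assumes a hypothetical RemoteChecker hook that stands in for the HTTP call to check-perm on a node hosting the collection. None of these class or method names exist in Solr. The user is included in the cache key on the assumption that the answer depends on the authenticated principal, and the clock is injected only to make the TTL easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical sketch: caches remote permission answers for a short TTL. */
final class PermissionCheckCache {

  /** Hypothetical hook: asks a node that hosts the collection (e.g. via check-perm). */
  interface RemoteChecker {
    boolean isAllowed(String collection, String path, String method, String user);
  }

  /** A cached answer together with the time at which it stops being valid. */
  private static final class Entry {
    final boolean allowed;
    final long expiresAtMillis;

    Entry(boolean allowed, long expiresAtMillis) {
      this.allowed = allowed;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Map<String, Entry> cache = new ConcurrentHashMap<>();
  private final RemoteChecker remote;
  private final long ttlMillis;
  private final LongSupplier clock;

  PermissionCheckCache(RemoteChecker remote, long ttlMillis, LongSupplier clock) {
    this.remote = remote;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /** Returns the cached answer if it is still fresh, otherwise asks the remote node. */
  boolean isAllowed(String collection, String path, String method, String user) {
    String key = String.join("|", collection, path, method, user);
    long now = clock.getAsLong();
    Entry entry = cache.get(key);
    if (entry == null || now >= entry.expiresAtMillis) {
      // Miss or expired: one remote round trip, then remember the result.
      boolean allowed = remote.isAllowed(collection, path, method, user);
      entry = new Entry(allowed, now + ttlMillis);
      cache.put(key, entry);
    }
    return entry.allowed;
  }
}
```

With ttlMillis set to 5000 and System::currentTimeMillis as the clock, repeated requests for the same collection, path, method and user within 5 seconds would trigger a single remote check. The trade-off is that a permission change can take up to one TTL to become visible on nodes that do not host the collection.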

> Rule-base Authorization plugin skips authorization if querying node does not 
> have collection replica
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-12514
>                 URL: https://issues.apache.org/jira/browse/SOLR-12514
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: security
>    Affects Versions: 7.3.1
>            Reporter: Mahesh Kumar Vasanthu Somashekar
>            Priority: Major
>         Attachments: SOLR-12514.patch, Screen Shot 2018-06-24 at 9.36.45 
> PM.png, security.json
>
>
> Solr serves client requests going through 3 steps - init(), authorize() and 
> handle-request ([link 
> git-link|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.3.1/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L471]).
>  init() initializes all required information to be used by authorize(). 
> init() skips initializing if request is to be served remotely, which leads to 
> skipping authorization step ([link 
> git-link|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.3.1/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L291]).
>  init() relies on 'cores' object which only has information of local node 
> (which is perfect as per design). It should actually be getting security 
> information (security.json) from zookeeper, which has global view of the 
> cluster.
>  
> Example:
> SolrCloud setup consists of 2 nodes (solr-7.3.1):
> {code:javascript}
> live_nodes: [
>  "localhost:8983_solr",
>  "localhost:8984_solr",
> ]
> {code}
> Two collections are created - 'collection-rf-1' with RF=1 and 
> 'collection-rf-2' with RF=2.
> Two users are created - 'collection-rf-1-user' and 'collection-rf-2-user'.
> Security configuration is as below (security.json attached):
> {code:javascript}
> "authorization":{
>   "class":"solr.RuleBasedAuthorizationPlugin",
>   "permissions":[
>     { "name":"read", "collection":"collection-rf-2", 
> "role":"collection-rf-2", "index":1},
>     { "name":"read", "collection":"collection-rf-1", 
> "role":"collection-rf-1", "index":2},
>     { "name":"read", "role":"*", "index":3},
>     ...
>   "user-role":
>     { "collection-rf-1-user":[ "collection-rf-1"], "collection-rf-2-user":[ 
> "collection-rf-2"]},
>     ...
> {code}
>  
> Basically, it is set up so that the 'collection-rf-1-user' user can only access 
> the 'collection-rf-1' collection and the 'collection-rf-2-user' user can only 
> access the 'collection-rf-2' collection.
> Also note that the 'collection-rf-1' collection has a replica only on the 
> 'localhost:8983_solr' node, whereas the 'collection-rf-2' collection has a 
> replica on both live nodes.
>  
> Authorization does not work as expected for 'collection-rf-1' collection:
> $ curl -u collection-rf-2-user:password 
> 'http://localhost:8983/solr/collection-rf-1/select?q=*:*'
> {code:html}
>  <html>
>  <head>
>  <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
>  <title>Error 403 Unauthorized request, Response code: 403</title>
>  </head>
>  <body><h2>HTTP ERROR 403</h2>
>  <p>Problem accessing /solr/collection-rf-1/select. Reason:
>  <pre> Unauthorized request, Response code: 403</pre></p>
>  </body>
>  </html>
> {code}
> $ curl -u collection-rf-2-user:password 
> 'http://localhost:8984/solr/collection-rf-1/select?q=*:*'
> {code:javascript}
>  {
>    "responseHeader":{
>      "zkConnected":true,
>      "status":0,
>      "QTime":0,
>      "params":{
>        "q":"*:*"}},
>    "response":{"numFound":0,"start":0,"docs":[]
>  }}
> {code}
>  
> Whereas authorization works perfectly for 'collection-rf-2' collection (as 
> both nodes have replica):
> $ curl -u collection-rf-1-user:password 
> 'http://localhost:8984/solr/collection-rf-2/select?q=*:*'
> {code:html}
>  <html>
>  <head>
>  <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
>  <title>Error 403 Unauthorized request, Response code: 403</title>
>  </head>
>  <body><h2>HTTP ERROR 403</h2>
>  <p>Problem accessing /solr/collection-rf-2/select. Reason:
>  <pre> Unauthorized request, Response code: 403</pre></p>
>  </body>
>  </html>
> {code}
> $ curl -u collection-rf-1-user:password 
> 'http://localhost:8983/solr/collection-rf-2/select?q=*:*'
> {code:html}
>  <html>
>  <head>
>  <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
>  <title>Error 403 Unauthorized request, Response code: 403</title>
>  </head>
>  <body><h2>HTTP ERROR 403</h2>
>  <p>Problem accessing /solr/collection-rf-2/select. Reason:
>  <pre> Unauthorized request, Response code: 403</pre></p>
>  </body>
>  </html>
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
