[ 
https://issues.apache.org/jira/browse/COUCHDB-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128581#comment-14128581
 ] 

ASF subversion and git services commented on COUCHDB-2325:
----------------------------------------------------------

Commit 88ff6d323f904425f5130a237ce712b68ac70dfb in couchdb-fabric's branch 
refs/heads/master from [~mikewallace]
[ https://git-wip-us.apache.org/repos/asf?p=couchdb-fabric.git;h=88ff6d3 ]

Teach fabric_util:get_db/2 about maintenance mode

If the node servicing a request does not have a shard for the db
involved, then fabric_util:get_db/2 can return a shard from a node
that is in maintenance mode. If that node is a replacement node
that has not yet been brought into the cluster, then its security
object will be empty.

Because fabric:get_security/2 calls fabric_util:get_db/2 and is in
the code path for authorizing requests at the HTTP layer, this can
result in live nodes returning 403s.

This commit replaces an rpc:call/4 with a rexi:cast/4 and adds
a new rpc endpoint in fabric_rpc for opening single shards. The
new endpoint uses set_io_priority, which replies with a rexi_EXIT
if maintenance mode is set.

Closes COUCHDB-2325
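
The behaviour described in the commit message above can be illustrated with
a small model. This is a minimal sketch in Python, not the Erlang code from
the commit: Node, MaintenanceMode, open_shard and this get_db are
hypothetical names invented for illustration, and the exception merely
stands in for the rexi_EXIT reply. It assumes the caller walks a list of
candidate nodes and moves on whenever a node refuses because it is in
maintenance mode.

```python
from dataclasses import dataclass, field


class MaintenanceMode(Exception):
    """Hypothetical stand-in for the rexi_EXIT reply from a node in maintenance mode."""


@dataclass
class Node:
    # Hypothetical model of a cluster node holding a shard copy.
    name: str
    maintenance: bool = False
    security: dict = field(default_factory=dict)

    def open_shard(self, db_name):
        # A node in maintenance mode refuses instead of returning a handle.
        if self.maintenance:
            raise MaintenanceMode(self.name)
        return {"node": self.name, "db": db_name, "security": self.security}


def get_db(db_name, candidates):
    """Return a handle from the first candidate node that accepts the request."""
    for node in candidates:
        try:
            return node.open_shard(db_name)
        except MaintenanceMode:
            # Skip this node and try the next candidate.
            continue
    raise RuntimeError(f"no available shard for {db_name}")


# A replacement node with an empty security object, still in maintenance mode,
# followed by a live node with a populated security object.
replacement = Node("node2", maintenance=True)
live = Node("node3", security={"members": {"names": ["alice"]}})

shard = get_db("mydb", [replacement, live])
print(shard["node"])  # node3
```

In this model the empty security object on the replacement node is never
returned, so an authorization check built on the result sees the live
node's security object instead.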


> fabric:get_security/2 can return security objects from nodes that are in 
> maintenance mode
> -----------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-2325
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2325
>             Project: CouchDB
>          Issue Type: Bug
>      Security Level: public(Regular issues) 
>          Components: BigCouch
>            Reporter: Mike Wallace
>             Fix For: 2.0.0
>
>
> Currently, fabric:get_security/2 calls fabric_util:get_db/2. If the node 
> servicing a request does not have a shard for the db, then 
> fabric_util:get_db/2 can return a shard from a node that is in maintenance 
> mode.
> If that node is a replacement node that has not yet been brought into the 
> cluster, then its security object will be empty. Because fabric:get_security/2 
> is in the code path for authorizing requests at the HTTP layer, this can 
> result in live nodes returning 403s. I have verified that this issue exists 
> even though cassim now handles authorization (cassim eventually makes the 
> same call to fabric:get_security/2).
> The crux of the problem is that the algorithm used by fabric_util:get_db/2 
> doesn't account for the possibility of nodes being in maintenance mode.
> See https://gist.github.com/mikewallace1979/8d01bb8661a50762bfc3 for the 
> steps to reproduce locally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
