[ https://issues.apache.org/jira/browse/COUCHDB-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128349#comment-14128349 ]
ASF subversion and git services commented on COUCHDB-2325:
----------------------------------------------------------
Commit 88ff6d323f904425f5130a237ce712b68ac70dfb in couchdb-fabric's branch
refs/heads/2325-teach-fabric-get_db-about-maintenance-mode from [~mikewallace]
[ https://git-wip-us.apache.org/repos/asf?p=couchdb-fabric.git;h=88ff6d3 ]
Teach fabric_util:get_db/2 about maintenance mode
If the node servicing a request does not have a shard for the db
involved then fabric_util:get_db/2 can return a shard from a node
which is in maintenance mode. If that node is a replacement node
that has not yet been brought into the cluster then the security
object will be empty.
Because fabric:get_security/2 calls fabric_util:get_db/2 and is in
the code path for authorizing requests at the HTTP layer, this can
result in live nodes returning 403s.
This commit replaces an rpc:call/4 with a rexi:cast/4 and adds
a new rpc endpoint in fabric_rpc for opening single shards. The
new endpoint uses set_io_priority, which will reply with a
rexi_EXIT if maintenance mode is set.
Closes COUCHDB-2325
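
The behaviour change described in the commit message can be modelled
roughly as follows. This is only an illustrative Python sketch, not the
Erlang code from the commit: Node, open_shard, MaintenanceMode and get_db
are hypothetical names, with MaintenanceMode standing in for the rexi_EXIT
reply and open_shard standing in for the new fabric_rpc endpoint. It
assumes the caller simply moves on to the next candidate node when a node
refuses to open the shard.

```python
from dataclasses import dataclass, field


class MaintenanceMode(Exception):
    """Hypothetical stand-in for the rexi_EXIT reply from a node in maintenance mode."""


@dataclass
class Node:
    # Hypothetical model of a cluster node holding a shard of the db.
    name: str
    maintenance_mode: bool = False
    security: dict = field(default_factory=dict)

    def open_shard(self, db_name):
        # Models the new single-shard endpoint: refuse to serve the shard
        # while the node is in maintenance mode.
        if self.maintenance_mode:
            raise MaintenanceMode(self.name)
        return {"db": db_name, "node": self.name, "security": self.security}


def get_db(db_name, nodes):
    # Try each candidate node in turn and skip any node that reports
    # maintenance mode, so its (possibly empty) security object is never used.
    for node in nodes:
        try:
            return node.open_shard(db_name)
        except MaintenanceMode:
            continue
    raise LookupError(f"no live shard available for {db_name}")


nodes = [
    Node("replacement", maintenance_mode=True),           # empty security object
    Node("live", security={"members": {"names": ["a"]}}),
]
shard = get_db("mydb", nodes)
print(shard["node"], shard["security"])
```

With the maintenance-mode check in place, the shard from the replacement
node is never returned, so the caller sees the security object from the
live node instead of an empty one.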
> fabric:get_security/2 can return security objects from nodes that are in
> maintenance mode
> -----------------------------------------------------------------------------------------
>
> Key: COUCHDB-2325
> URL: https://issues.apache.org/jira/browse/COUCHDB-2325
> Project: CouchDB
> Issue Type: Bug
> Security Level: public(Regular issues)
> Components: BigCouch
> Reporter: Mike Wallace
>
> Currently, fabric:get_security/2 calls fabric_util:get_db/2. If the node
> servicing a request does not have a shard for the db, then
> fabric_util:get_db/2 can return a shard from a node which is in maintenance
> mode.
> If that node is a replacement node that has not yet been brought into the
> cluster then the security object will be empty. Because fabric:get_security/2
> is in the code path for authorizing requests at the HTTP layer this can
> result in live nodes returning 403s. I have verified that this issue exists
> even though cassim now handles authorization (cassim eventually makes the
> same call to fabric:get_security/2).
> The crux of the problem is that the algorithm used by fabric_util:get_db/2
> doesn't account for the possibility of nodes being in maintenance mode.
> See https://gist.github.com/mikewallace1979/8d01bb8661a50762bfc3 for the
> steps to reproduce locally.