Re: Setting up CouchDB 2.0

Alexander Shorin Sun, 26 Oct 2014 14:12:27 -0700

On Sun, Oct 26, 2014 at 11:25 PM, Jan Lehnardt <[email protected]> wrote:
> Definitely, sorry for missing that bit.
>


No worries. Let's clear this up (:

>> - If a node already has admin party fixed, should it accept new admin
>> credentials?
>
> Good question, I’d say if an admin already exists, no new admin credentials
> are needed.
>
>
>> - Any reason to replace the 1-3 PUT requests to /_config with a single
>> POST in this case?
>
> I’m not sure what the 1-3 PUT requests are?

These ones:
curl -XPUT http://localhost:5984/_config/admins/root \
  -d '"password"' -H 'Content-Type: application/json'
curl -XPUT http://localhost:5984/_config/httpd/bind_address \
  -d '"0.0.0.0"' -H 'Content-Type: application/json'
curl -XPUT http://localhost:5984/_config/httpd/port \
  -d '"5984"' -H 'Content-Type: application/json'

The last two are optional, just as the related fields are optional
for the /_setup call.
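
To make the "1-3 requests" point concrete, here is a minimal Python
sketch that only builds the requests and does not send them. The helper
name `config_requests` is hypothetical; the paths follow the curl
commands above, with the config section spelled `admins` as CouchDB
names it.

```python
import json


def config_requests(base_url, admin, password, bind_address=None, port=None):
    """Return (method, url, body) tuples for the 1-3 /_config PUT requests.

    Hypothetical helper for illustration only: nothing is sent over HTTP.
    """
    # The admin password is the only mandatory setting.
    requests = [
        ("PUT", f"{base_url}/_config/admins/{admin}", json.dumps(password)),
    ]
    if bind_address is not None:
        requests.append(
            ("PUT", f"{base_url}/_config/httpd/bind_address",
             json.dumps(bind_address))
        )
    if port is not None:
        # /_config values are JSON strings, so the port goes out as "5984".
        requests.append(
            ("PUT", f"{base_url}/_config/httpd/port", json.dumps(str(port)))
        )
    return requests
```

With only a password given, this yields one request; with bind address
and port it yields all three.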

>> 3. Pick any one node, for simplicity use the first one, to be the
>>> “setup coordination node”.
>>> - this is a “master” node that manages the setup and requires all
>>>   other nodes to be able to see it and vice versa. Setup won’t work
>>>   with unavailable nodes (duh). The notion of “master” will be gone
>>>   once the setup is finished. At that point, the system has no
>>>   master node. Ignore I ever said “master”.
>>>
>>> a. Go to Fauxton / Cluster Setup, once we have enabled the cluster, the
>>> UI shows an “Add Node” interface with the fields admin, and node:
>>> - POST to /_setup with
>>>   {
>>>     "action": "add_node",
>>>     "admin": { // should be auto-filled from Fauxton
>>>       "user": "username",
>>>       "pass": "password"
>>>     },
>>>     "node": {
>>>       "host": "hostname",
>>>       ["port": 5984]
>>>     }
>>>   }
>>>
>>> b. as in a, but without the Fauxton bits, just POST to /_setup
>>> - this request will do this:
>>>  - on the “setup coordination node”:
>>>   - check if we have an Erlang Cookie Secret. If not, generate
>>>     a UUID and set the erlang cookie to that UUID.
>>>     // TBD: persist the cookie, so it survives restarts
>>>   - make a POST request to the node specified in the body above
>>>     using the admin credentials in the body above:
>>>     POST to http://username:password@node_b:5984/_setup with:
>>>     {
>>>       "action": "receive_cookie",
>>>       "cookie": "<secretcookie>"
>>>     }
>>>     // TBD: persist the cookie on node B, so it survives restarts
>>>
>>>   - when the request to node B returns, we know the Erlang-level
>>>     inter-cluster communication is enabled and we can start adding
>>>     the node on the CouchDB level. To do that, the “setup
>>>     coordination node” does this to its own HTTP endpoint:
>>>     PUT /nodes/node_b:5984 or the same thing with internal APIs.
>>>
>>> - Repeat for all nodes.
>>> - Fauxton keeps a list of all set up nodes for users to see.
>>
>> Question:
>> - Since Fauxton already knows all the nodes' admin credentials and all
>> the nodes are bound to the 0.0.0.0 interface (from the previous step),
>> will Fauxton automate joining the nodes into the cluster? This would
>> let us skip the "Repeat for all nodes" step.
>
> How does Fauxton know about the other nodes at this point?
> (I guess since the Erlang cluster is already set up, it could expose that
> info to Fauxton in a zeroconf kind of fashion and auto-populate the Fauxton
> UI with nodes that then can be joined with just a click of a button.)

Oh, right. The "should be auto-filled from Fauxton" comment confused
me, so I thought Fauxton was already aware of the node list.
Zeroconf is desirable, but that is another feature to add. So
everything is OK here.

>> - If some of my nodes have different admin credentials, is this a
>> blocking error or should Fauxton ask me for these credentials?
>
> That’s why `add_node` takes a username and password as options, you
> can set that up if you want. / This could also be made an error case.
> It should certainly not be recommended.
>

Right, the same confusion caused by the "auto-filled" comment (:

>> - Any reason for replacing a regular request to /_nodes with a custom
>> /_setup?
>
> I don’t know what /_nodes is. Do you mean /nodes? — The reason this isn’t
> using /nodes at this point is that /nodes already has a special meaning
> and I didn’t want to complicate the existing logic. In addition, /nodes
> might have to be adjusted to carry the username and password of the target
> CouchDB to do the setup (if we otherwise keep the proposed model, happy
> to see alternatives, though!).
>
> If we can reduce all of what I outlined to `PUT /nodes/node_b|c|d`, that
> would be nice. Fauxton could then offer the setup UI based on whether /nodes
> has any entries. But I don’t know enough about the semantics and other
> uses of /nodes, so I haven’t thought about this option too much.
>

*Bikeshedding alert*: shouldn't system database names start with a
leading underscore? (:
Yes, /nodes. Btw, nice idea about storing the node credentials there;
this should help with cluster management when the admin credentials
differ from node to node. I only worry that this would conflict with
the cassim logic.

>> Point about cookie counts.
>
> Not sure I follow.

I was trying to anticipate your reply. I could be wrong, but /nodes
doesn't know anything about Erlang cookies or how to work with them,
while your /_setup provides that functionality. I am mostly trying to
find reasons to avoid a special HTTP resource that is used only once
in the whole cluster lifespan when other resources exist that can do
the same job. Setting up cookies is a reason that justifies having
it.
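
The split of work can be sketched as follows: the cookie exchange is
the part only /_setup covers, while the final registration is a plain
/nodes write. This is a rough Python illustration of the add_node steps
quoted above, not an implementation; `plan_add_node` and the dictionary
layout are hypothetical, and nothing is sent over the network.

```python
import json
import uuid


def plan_add_node(host, user, password, port=5984, cookie=None):
    """Sketch the two steps the coordination node takes for add_node."""
    # Step 1: reuse the existing Erlang cookie, or generate a UUID if
    # none has been set yet (persisting it is still TBD in the proposal).
    cookie = cookie or uuid.uuid4().hex
    receive_cookie = {
        "method": "POST",
        "url": f"http://{host}:{port}/_setup",
        "auth": (user, password),  # admin credentials of the target node
        "body": json.dumps({"action": "receive_cookie", "cookie": cookie}),
    }
    # Step 2: register the node at the CouchDB level. This request knows
    # nothing about cookies, which is the gap /_setup fills.
    register = {"method": "PUT", "url": f"/nodes/{host}:{port}", "body": None}
    return cookie, [receive_cookie, register]
```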

>> 4.a. When all nodes are added, click the [Finish Cluster Setup] button
>>> in Fauxton.
>>> - this does POST /_setup
>>>   {
>>>     "action": "finish_setup"
>>>   }
>>>
>>> b. Same as in a.
>>>
>>> - this manages the final setup bits, like creating the _users,
>>>   _replicator and _db_updates endpoints and whatever else is needed.
>>>   // TBD: collect what else is needed.
>>
>> This is the only useful thing that /_setup does from my current point
>> of view - everything else is just masking standard requests to the
>> existing API.
>
> Which existing API in particular?
>
> If you mean that this all can be done over /_config and /nodes, yes totally,
> but Fauxton on node_a can’t access /_config on node_b. That’s one of the
> reasons of why I suggest using /_setup, so it can do all this from a single
> node via Fauxton. The other reason is that it is a dedicated API end-point
> that hides a lot of complexity instead of having end-users hit a bunch of
> seemingly random endpoints (although this *could* be hidden in Fauxton maybe,
> except for the cross domain issue).

Yes, I mean /_config and /nodes. But why can't Fauxton access /_config
on node_b? Especially if it knows the credentials and node_b is bound
to the 0.0.0.0 interface.

About API usage complexity: followers of the Fauxton-driven way really
don't care which HTTP requests are made behind the scenes while a nice
spinner loops in their browser. For followers of the console way this
isn't an issue either: small cluster installations are easy to set up
via "seemingly random endpoints" by following our guidelines, and for
bigger clusters these processes tend to be automated by provisioning
tools.



>>
>>> ## The Setup Endpoint
>>>
>>> This is not a REST-y endpoint, it is a simple state machine operated
>>> by HTTP POST with JSON bodies that have an `action` field.
>>>
>>> ### State 1: No Cluster Enabled
>>>
>>> This is right after starting a node for the first time, and any time
>>> before the cluster is enabled as outlined above.
>>>
>>> GET /_setup
>>> {"state": "cluster_disabled"}
>>>
>>> POST /_setup {"action":"enable_cluster"...} -> Transition to State 2
>>> POST /_setup {"action":"enable_cluster"...} with empty admin user/pass or
>>> invalid host/port or host/port not available -> Error
>>> POST /_setup {"action":"anything_but_enable_cluster"...} -> Error
>>>
>>
>> If "enable_cluster" only creates/setups admin and bind address, could
>> this step be skipped? Because the same actions are possible to do via
>> regular config setup.
>
> Yes! It just needs to ensure these things are done. If Fauxton detects
> they *are* done, it can skip the enable step and show the add_node interface
> right away.

Good!

>>
>>
>>> ### State 2: Cluster enabled, admin user set, waiting for nodes to be added.
>>>
>>> GET /_setup
>>> {"state":"cluster_enabled","nodes":[]}
>>>
>>> POST /_setup {"action":"enable_cluster"...} -> Error
>>> POST /_setup {"action":"add_node"...} -> Stay in State 2, but return
>>> "nodes":["node B"] on GET
>>> POST /_setup {"action":"add_node"...} -> if target node not available, Error
>>> POST /_setup {"action":"finish_cluster"} with no nodes set up -> Error
>>> POST /_setup {"action":"finish_cluster"} -> Transition to State 3
>>>
>>
>> Questions:
>> - How many nodes are required to be added? 1? 2? 3?...
>
> Doesn’t matter.

Then this case:
POST /_setup {"action":"finish_cluster"} with no nodes set up -> Error

will never happen, since there will always be at least one node in the
cluster: the one that runs the setup (:

>
>> - How to remove accidentally added node from cluster?
>
> Delete from /nodes database. Could be added as a UI element in Fauxton.

That's what I am worried about: we add nodes via /_setup, but have to
remove them via /nodes. Consistency has to be preserved (:


>>
>>> ### State 3: Cluster set up, all nodes operational
>>>
>>> GET /_setup
>>> {"state":"cluster_finished","nodes":["node a", "node b", ...]}
>>>
>>> POST /_setup {"action":"enable_cluster"...} -> Error
>>> POST /_setup {"action":"finish_cluster"...} -> Stay in State 3, do nothing
>>> POST /_setup {"action":"add_node"...} -> Error
>>> POST /_setup?i_know_what_i_am_doing=true {"action":"add_node"...} -> Add 
>>> node, stay in State 3.
>>>
>>> // TBD: we need to persist the setup state somewhere.
>>>
>>
>> Questions:
>> - Why is adding a new node after finish_cluster a special case that
>> has to be marked with the "i_know_what_i_am_doing" parameter?
>
> Because I think it is not advisable to do this regularly, but someone might
> want to do this regardless (see next).
>
>
>> - How to enlarge / reduce a cluster after its setup, or even disband it?
>
> Enlarge: see above.
> Reduce: delete from /nodes
> Disband: shut down all CouchDB processes :)
>
> I don’t know the BigCouch/Cloudant best practices for this. I’ll chalk
> this down as a “needs input from Cloudant people” :)
>

You sent me into recursion with the (see next) and (see above) notes!
Nice trick, but it is still unclear how to let a cluster grow, and
that isn't an exceptional case. Reducing a cluster (deliberately, not
the temporary loss of nodes during network issues) is what happens
more rarely. +1 for having more info from Cloudant people (:
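
For reference while discussing growth, the three states and their
transitions as quoted above can be sketched as a small Python state
machine. This is a simplified model under stated assumptions: it skips
credential and reachability checks, tracks only the added nodes (whether
the coordination node itself counts is the question raised earlier), and
the class and exception names are hypothetical.

```python
class SetupError(Exception):
    """Raised for any transition the proposal marks as an error."""


class SetupStateMachine:
    def __init__(self):
        self.state = "cluster_disabled"
        self.nodes = []

    def get(self):
        """Model of GET /_setup."""
        doc = {"state": self.state}
        if self.state != "cluster_disabled":
            doc["nodes"] = list(self.nodes)
        return doc

    def post(self, body, i_know_what_i_am_doing=False):
        """Model of POST /_setup with an optional override flag."""
        action = body.get("action")
        if self.state == "cluster_disabled":
            if action != "enable_cluster":
                raise SetupError("only enable_cluster is allowed in State 1")
            self.state = "cluster_enabled"
        elif self.state == "cluster_enabled":
            if action == "add_node":
                self.nodes.append(body["node"]["host"])
            elif action == "finish_cluster":
                if not self.nodes:
                    raise SetupError("no nodes set up")
                self.state = "cluster_finished"
            else:
                raise SetupError(f"{action} is not allowed in State 2")
        else:  # cluster_finished
            if action == "finish_cluster":
                pass  # stay in State 3, do nothing
            elif action == "add_node" and i_know_what_i_am_doing:
                self.nodes.append(body["node"]["host"])
            else:
                raise SetupError(f"{action} is not allowed in State 3")
        return self.get()
```

In this model, growing a finished cluster is possible only through the
override flag, which is exactly the point under discussion.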


>> Or isn't this what /_setup should care about?
>
> In general, this isn’t really covered by the setup proposal here. I’d like
> to keep this out of scope for now, but we should have good answers to that
> going forward.

Agreed.

>> - What happens with /_setup resource after finish_cluster? Any case
>> for it to be useful?
>
> Only for Fauxton to show the correct setup state.
>

If so, then I just figured out a better name for it: /_cluster
- it sets up the cluster as you planned
- it shows the cluster state as you planned
- it allows managing cluster nodes in ways that aren't suitable for
the /nodes API (like setting cookies)
- it stays useful after the cluster setup, and it could handle other
cluster-wide tasks.

What do you think about it?

>> - How could /_setup help with changing the admin password across all
>> cluster nodes?
>
> At least on the first run setup, Fauxton can just keep the new password in
> memory and pre-fill the add_node screens with the same username and password.
> /_setup then transports it over.
>
> For later setups, I don’t know, as we would have the admin to enter the
> password in plaintext so we can send it. Alternatively, we could use the
> /_config API to read and send the PBKDF2 hash *waves hands*.
>

"Fauxton can just keep the new password in memory" opens the door to
an issue when you accidentally refresh the page, close the tab, or
lose the page memory in some other way. Not a flaw, just a case to
keep in mind.

As for sending (or replicating) the PBKDF2 hash: that looks good to me.
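
To show what would travel between nodes in that case, here is a small
Python sketch of such a hash. It assumes the layout I believe CouchDB
uses for stored admin passwords since 1.3 (PBKDF2-HMAC-SHA1, 20-byte
key, written as "-pbkdf2-<key>,<salt>,<iterations>"); the function name
is hypothetical, and whether another node accepts the value unchanged
via /_config should be verified.

```python
import hashlib


def couchdb_pbkdf2(password, salt, iterations=10):
    """Build a "-pbkdf2-<key>,<salt>,<iterations>" string.

    Assumption: SHA-1 as the PRF and a 20-byte derived key, matching the
    format CouchDB writes to its config file for admin passwords.
    """
    key = hashlib.pbkdf2_hmac(
        "sha1",
        password.encode("utf-8"),
        salt.encode("utf-8"),
        iterations,
        dklen=20,
    )
    return f"-pbkdf2-{key.hex()},{salt},{iterations}"
```

The plaintext password never has to leave the first node; only the
derived key, salt, and iteration count do.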

>> - If I add a new node after "finish_cluster" setup, will it have all
>> system databases (global_changes, cassim, _users...whatever else)
>> created?
>
> That is unspecified at this point. I’d need more input from the Cloudant
> people on this one. I’m happy to go either way, or make it an option for
> later joined nodes.

OK. Let's wait and see what the Cloudant people say. I'm pretty sure
they already know the solution to all these problems, or at least know
their specifics.

Thanks a lot, Jan! (:

--
,,,^..^,,,
