Sean Lang created COUCHDB-3320:
----------------------------------

             Summary: confusion with adding node to existing cluster
                 Key: COUCHDB-3320
                 URL: https://issues.apache.org/jira/browse/COUCHDB-3320
             Project: CouchDB
          Issue Type: Question
            Reporter: Sean Lang


I've got a cluster running on 2 separate servers and want to add a 3rd server.
I read the docs at http://docs.couchdb.org/en/2.0.0/cluster/nodes.html. Before I
changed anything, the `_membership` endpoint returned this on both nodes:

```
$ curl -X GET "http://0.0.0.0:5984/_membership" --user root
{"all_nodes":["[email protected]","[email protected]"],"cluster_nodes":["[email protected]","[email protected]"]}
```

The server I'm trying to add is at `192.168.1.226`. Running `curl -X PUT
"http://0.0.0.0:5986/_nodes/[email protected]" -d {} --user root` from the
server at `192.168.1.214` didn't work. The logs on `192.168.1.226` showed
`Connection attempt from disallowed node '[email protected]'` and
`Connection attempt from disallowed node '[email protected]'`, which makes
sense: the command I ran didn't even provide the password for `192.168.1.226`,
so there's no reason why it should have worked.

I deleted the added node from the cluster with `curl -X DELETE
"http://0.0.0.0:5986/_nodes/[email protected]?rev=1-967a00dff5e02add41819138abb3284d"
-d {} --user root`. The Node Management docs don't actually mention that the
revision id is required, or that the `_nodes` db operates like a normal
database, which seems to have confused at least [one
person](https://groups.google.com/d/msg/couchdb-user-archive/54tEryERBiI/O0GKBo_NBAAJ).
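Since the `_nodes` db behaves like a normal database, the revision id can be read from the node's document before deleting it. The following is only a rough sketch of that lookup: `extract_rev` is a helper name made up for this example, `NODE` is a placeholder for the node name, and the `sed` pattern assumes the compact JSON (no spaces after colons) seen in the responses above.

```sh
# Hypothetical helper: read a CouchDB document (JSON) on stdin and print
# the value of its "_rev" field. Assumes compact JSON without whitespace
# around the colon.
extract_rev() {
  sed -n 's/.*"_rev":"\([^"]*\)".*/\1/p'
}

# Usage sketch (commented out because it needs a running node).
# NODE is a placeholder for the node name to remove.
# NODE="name@host"
# rev=$(curl -s --user root "http://0.0.0.0:5986/_nodes/$NODE" | extract_rev)
# curl -X DELETE --user root "http://0.0.0.0:5986/_nodes/$NODE?rev=$rev"
```

If `extract_rev` prints nothing, the document was probably not found (or the response was not compact JSON), and the DELETE should not be attempted.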

After reading through the dev cluster [setup 
script](https://github.com/apache/couchdb/blob/master/dev/run#L422) I tried 
running the following from the server at `192.168.1.214`:

```
$ curl -X POST -H "Content-Type: application/json" \
    "http://0.0.0.0:5984/_cluster_setup" \
    -d '{"action":"add_node","host":"192.168.1.226","port":5984,"username":"root","password":"XXXXXXX"}' \
    --user root
```
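For automating this step, the request body could be built from variables rather than typed by hand. This is a minimal sketch, not a tested procedure: `add_node_payload` is a name invented here, it only reproduces the fields used in the command above, and it does no JSON escaping, so it assumes the values contain no quotes or backslashes.

```sh
# Hypothetical helper: print the "add_node" request body used above.
# Arguments: host, port, username, password.
# No JSON escaping is done, so values must not contain quotes or backslashes.
add_node_payload() {
  printf '{"action":"add_node","host":"%s","port":%s,"username":"%s","password":"%s"}\n' \
    "$1" "$2" "$3" "$4"
}

# Usage sketch (commented out because it needs a running cluster);
# PASSWORD is assumed to hold the admin password of the node being added.
# curl -X POST -H "Content-Type: application/json" --user root \
#   -d "$(add_node_payload 192.168.1.226 5984 root "$PASSWORD")" \
#   "http://0.0.0.0:5984/_cluster_setup"
```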

That almost worked. The membership for `192.168.1.214` was correct:

```
{"all_nodes":["[email protected]","[email protected]","[email protected]"],"cluster_nodes":["[email protected]","[email protected]","[email protected]"]}
```

But the membership for `192.168.1.226` showed that it wasn't talking to `192.168.1.202`:

```
{"all_nodes":["[email protected]","[email protected]"],"cluster_nodes":["[email protected]","[email protected]","[email protected]"]}
```
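A mismatch like this can be detected mechanically by listing the nodes that appear in `cluster_nodes` but not in `all_nodes`. The sketch below assumes, as the output above suggests, that such nodes are the ones this server is not connected to. The function names are made up for this example, and the parsing relies on the compact JSON format shown above rather than a real JSON parser.

```sh
# Hypothetical helper: print the members of the JSON string array named $1,
# one per line, reading a compact _membership response on stdin.
json_array() {
  sed -n 's/.*"'"$1"'":\[\([^]]*\)\].*/\1/p' | tr ',' '\n' | tr -d '"' | sed '/^$/d'
}

# Hypothetical helper: print every node listed in cluster_nodes that is
# missing from all_nodes. Prints nothing when the two lists agree.
unconnected_nodes() {
  membership=$(cat)
  connected=$(printf '%s\n' "$membership" | json_array all_nodes)
  printf '%s\n' "$membership" | json_array cluster_nodes |
    while IFS= read -r node; do
      printf '%s\n' "$connected" | grep -qxF -- "$node" || printf '%s\n' "$node"
    done
}

# Usage sketch (commented out because it needs a running node):
# curl -s --user root "http://0.0.0.0:5984/_membership" | unconnected_nodes
```

Running this against each server after adding a node gives a simple check that every member sees every other member before moving on.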

Logs on `192.168.1.226` showed `global: '[email protected]' failed to 
connect to '[email protected]'` and `Connection attempt from disallowed 
node '[email protected]'`, but I don't understand why.

Rebooting `192.168.1.226`, deleting the entry from the `_nodes` database
again, and re-adding it with the exact same command run on `192.168.1.214` as
before seemed to work (all servers now show full connectivity to each other).
However, I have no idea why it worked the second time, or whether I'm doing
something horribly wrong.

Is this the correct way to add nodes to a cluster? Should the Node Management 
docs be updated? I want to make sure I'm doing this right before I automate the 
whole process with Kubernetes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
