Hello,

Yes, it is correct. It was the first thing I checked.
First master:

etcdClientInfo:
  ca: master.etcd-ca.crt
  certFile: master.etcd-client.crt
  keyFile: master.etcd-client.key
  urls:
  - https://openshift-balancer01:2379
  - https://openshift-balancer02:2379

Second master:

etcdClientInfo:
  ca: master.etcd-ca.crt
  certFile: master.etcd-client.crt
  keyFile: master.etcd-client.key
  urls:
  - https://openshift-balancer01:2379
  - https://openshift-balancer02:2379

DNS names resolve on both masters.

Best regards, and thanks!

> On 13 Jun 2016, at 18:45, Scott Dodson <[email protected]> wrote:
>
> Can you verify that the connection information in the etcdClientInfo section of
> /etc/origin/master/master-config.yaml is correct?
>
> On Mon, Jun 13, 2016 at 11:56 AM, Julio Saura <[email protected]> wrote:
>> Hello,
>>
>> Yes, I have an external balancer in front of my masters for HA, as the docs say.
>>
>> I don't have any balancer in front of my etcd servers for the masters'
>> connection. It's not necessary, right? The masters will try all available
>> etcd servers if one is down, right?
>>
>> I don't know why, but none of my masters were able to connect to the second
>> etcd instance, although telnet from their shell worked, so it was not a
>> network or firewall issue.
>>
>> Best regards.
>>
>>> On 13 Jun 2016, at 17:53, Clayton Coleman <[email protected]> wrote:
>>>
>>> I have not seen that particular issue. Do you have a load balancer in
>>> between your masters and etcd?
>>>
>>> On Fri, Jun 10, 2016 at 5:55 AM, Julio Saura <[email protected]> wrote:
>>>> Hello,
>>>>
>>>> I have an Origin 3.1 installation that has been working well so far.
>>>>
>>>> Today one of my etcd nodes (1 of 2) crashed and I started having
>>>> problems.
>>>>
>>>> I noticed on one of my master nodes that it was not able to connect to the
>>>> second etcd server, and that this etcd server was not able to promote
>>>> itself to leader:
>>>>
>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 is starting a new election at term 10048
>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 became candidate at term 10049
>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 received vote from 12c8a31c8fcae0d4 at term 10049
>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 [logterm: 8, index: 4600461] sent vote request to bf80ee3a26e8772c at term 10049
>>>> jun 10 11:09:56 openshift-balancer02 etcd[47218]: got unexpected response error (etcdserver: request timed out)
>>>>
>>>> My masters logged that they were not able to connect to etcd:
>>>>
>>>> er.go:218] unexpected ListAndWatch error: pkg/storage/cacher.go:161: Failed to list *extensions.Job: error #0: dial tcp X.X.X.X:2379: connection refused
>>>>
>>>> So I tried a simple test, a telnet from the masters to the etcd node port:
>>>>
>>>> [root@openshift-master01 log]# telnet X.X.X.X 2379
>>>> Trying X.X.X.X...
>>>> Connected to X.X.X.X.
>>>> Escape character is '^]'
>>>>
>>>> So I was able to connect from the masters.
>>>>
>>>> I was not able to recover my oc masters until the first etcd node had
>>>> rebooted, so it seems my etcd “cluster” does not work without the first node.
>>>>
>>>> Any clue?
>>>>
>>>> Thanks
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> [email protected]
>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
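
A possible reading of the election log in this thread, offered as an assumption since nobody in the thread confirms it: etcd elects a leader by Raft majority vote, so a cluster of 2 members needs both votes and cannot elect a leader while one member is down. That would fit the candidate on openshift-balancer02 voting for itself and then timing out on its vote request to the other member. A minimal Python sketch of the arithmetic (the function names are illustrative, not part of etcd):

```python
def quorum(members: int) -> int:
    """Votes needed for a majority in a cluster of `members` nodes."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be down while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Compare a few cluster sizes, including the 2-member case from the thread.
    for n in (1, 2, 3, 5):
        print(f"{n} members: quorum={quorum(n)}, "
              f"failures tolerated={failures_tolerated(n)}")
```

Under this majority rule a 2-member cluster tolerates no failures, while a 3-member cluster tolerates one.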
