Did you verify connectivity over the peering port as well (2380)?

On Jun 21, 2016 7:17 AM, "Julio Saura" <[email protected]> wrote:
> hello
>
> same problem
>
> jun 21 13:11:03 openshift-master01 atomic-openshift-master-api[59618]: F0621 13:11:03.155246 59618 auth.go:141] error #0: dial tcp XXXX:2379: connection refused (the one i rebooted)
> jun 21 13:11:03 openshift-master01 atomic-openshift-master-api[59618]: error #1: client: etcd member https://YYYY:2379 has no leader
>
> i rebooted the etcd server and my master is not able to use the other one.
>
> i am still able to connect from both masters using telnet to the etcd port..
>
> any clue? this is weird.
>
> On Jun 14, 2016, at 9:28, Julio Saura <[email protected]> wrote:
>>
>> hello
>>
>> yes, it is correct.. it was the first thing i checked.
>>
>> first master:
>>
>> etcdClientInfo:
>>   ca: master.etcd-ca.crt
>>   certFile: master.etcd-client.crt
>>   keyFile: master.etcd-client.key
>>   urls:
>>     - https://openshift-balancer01:2379
>>     - https://openshift-balancer02:2379
>>
>> second master:
>>
>> etcdClientInfo:
>>   ca: master.etcd-ca.crt
>>   certFile: master.etcd-client.crt
>>   keyFile: master.etcd-client.key
>>   urls:
>>     - https://openshift-balancer01:2379
>>     - https://openshift-balancer02:2379
>>
>> the dns names resolve on both masters.
>>
>> Best regards and thanks!
>>
>> On Jun 13, 2016, at 18:45, Scott Dodson <[email protected]> wrote:
>>>
>>> Can you verify that the connection information in the etcdClientInfo section of /etc/origin/master/master-config.yaml is correct?
>>>
>>> On Mon, Jun 13, 2016 at 11:56 AM, Julio Saura <[email protected]> wrote:
>>>> hello
>>>>
>>>> yes.. i have an external balancer in front of my masters for HA, as the docs say.
>>>>
>>>> i don't have any balancer in front of my etcd servers for the masters' connections; that's not necessary, right? the masters will try all available etcd servers if one is down, right?
>>>>
>>>> i don't know why, but neither of my masters was able to connect to the second etcd instance, even though telnet from their shells worked.. so it was not a network or firewall issue.
>>>>
>>>> best regards.
>>>>
>>>> On Jun 13, 2016, at 17:53, Clayton Coleman <[email protected]> wrote:
>>>>>
>>>>> I have not seen that particular issue. Do you have a load balancer in between your masters and etcd?
>>>>>
>>>>> On Fri, Jun 10, 2016 at 5:55 AM, Julio Saura <[email protected]> wrote:
>>>>>> hello
>>>>>>
>>>>>> i have an origin 3.1 installation working fine so far.
>>>>>>
>>>>>> today one of my etcd nodes (1 of 2) crashed and i started having problems..
>>>>>>
>>>>>> i noticed on one of my master nodes that it was not able to connect to the second etcd server, and that the etcd server was not able to promote itself to leader:
>>>>>>
>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 is starting a new election at term 10048
>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 became candidate at term 10049
>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 received vote from 12c8a31c8fcae0d4 at term 10049
>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 [logterm: 8, index: 4600461] sent vote request to bf80ee3a26e8772c at term 10049
>>>>>> jun 10 11:09:56 openshift-balancer02 etcd[47218]: got unexpected response error (etcdserver: request timed out)
>>>>>>
>>>>>> my masters logged that they were not able to connect to etcd:
>>>>>>
>>>>>> er.go:218] unexpected ListAndWatch error: pkg/storage/cacher.go:161: Failed to list *extensions.Job: error #0: dial tcp X.X.X.X:2379: connection refused
>>>>>>
>>>>>> so i tried a simple test, just telnet from the masters to the etcd node port:
>>>>>>
>>>>>> [root@openshift-master01 log]# telnet X.X.X.X 2379
>>>>>> Trying X.X.X.X...
>>>>>> Connected to X.X.X.X.
>>>>>> Escape character is '^]'
>>>>>>
>>>>>> so i was able to connect from the masters.
>>>>>> i was not able to recover my oc masters until the first etcd node was rebooted.. so it seems my etcd "cluster" does not work without the first node..
>>>>>>
>>>>>> any clue?
>>>>>>
>>>>>> thanks
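A note on the failure mode described above: with only two etcd members this is expected behavior, not a bug. Raft (the consensus protocol etcd uses) requires a majority of members, floor(n/2) + 1, to elect a leader, so a two-member cluster cannot tolerate the loss of either node. The arithmetic can be sketched as follows (function names are illustrative, not part of etcd's API):

```python
def quorum(n: int) -> int:
    """Votes needed for a Raft cluster of n members to elect a leader."""
    return n // 2 + 1

def fault_tolerance(n: int) -> int:
    """How many members can fail while the cluster can still elect a leader."""
    return n - quorum(n)

for n in (1, 2, 3, 5):
    print(f"{n} members: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

For n = 2, quorum is 2 and fault tolerance is 0: losing either node leaves the survivor unable to win an election, which matches both the "has no leader" client errors and the endless candidate/election loop in the etcd log above. This is why etcd clusters are normally deployed with an odd member count (3 or 5).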
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
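Following up on the question at the top of the thread, both the client port (2379) and the peer port (2380) should be reachable between the relevant hosts. A minimal sketch of the same check the telnet test performs, for both ports at once (the host names are placeholders, to be replaced with your own etcd hosts):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (like the telnet test)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder host names; substitute your own etcd members.
for host in ["openshift-balancer01", "openshift-balancer02"]:
    for port in (2379, 2380):  # 2379 = client traffic, 2380 = peer traffic
        print(f"{host}:{port} {'open' if port_open(host, port) else 'closed'}")
```

Note that a successful TCP connect only proves the port is reachable; as this thread shows, etcd can accept connections while still having no leader, so a quorum/health check is needed in addition to a connectivity check.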
