Did you verify connectivity over the peering port as well (2380)?

On Jun 21, 2016 7:17 AM, "Julio Saura" <[email protected]> wrote:
> hello
>
> same problem
>
> jun 21 13:11:03 openshift-master01 atomic-openshift-master-api[59618]: F0621 13:11:03.155246 59618 auth.go:141] error #0: dial tcp XXXX:2379: connection refused (the one i rebooted)
> jun 21 13:11:03 openshift-master01 atomic-openshift-master-api[59618]: error #1: client: etcd member https://YYYY:2379 has no leader
>
> i rebooted the etcd server and my master is not able to use the other one.
>
> i am still able to connect from both masters using telnet to the etcd port..
>
> any clue? this is weird.
>
> On Jun 14, 2016, at 9:28, Julio Saura <[email protected]> wrote:
>>
>> hello
>>
>> yes, it is correct.. it was the first thing i checked.
>>
>> first master:
>>
>> etcdClientInfo:
>>   ca: master.etcd-ca.crt
>>   certFile: master.etcd-client.crt
>>   keyFile: master.etcd-client.key
>>   urls:
>>     - https://openshift-balancer01:2379
>>     - https://openshift-balancer02:2379
>>
>> second master:
>>
>> etcdClientInfo:
>>   ca: master.etcd-ca.crt
>>   certFile: master.etcd-client.crt
>>   keyFile: master.etcd-client.key
>>   urls:
>>     - https://openshift-balancer01:2379
>>     - https://openshift-balancer02:2379
>>
>> the dns names resolve on both masters.
>>
>> Best regards and thanks!
>>
>> On Jun 13, 2016, at 18:45, Scott Dodson <[email protected]> wrote:
>>>
>>> Can you verify that the connection information in the etcdClientInfo section of /etc/origin/master/master-config.yaml is correct?
>>>
>>> On Mon, Jun 13, 2016 at 11:56 AM, Julio Saura <[email protected]> wrote:
>>>> hello
>>>>
>>>> yes.. i have an external balancer in front of my masters for HA, as the docs say.
>>>>
>>>> i don't have any balancer in front of my etcd servers for the masters' connections; that's not necessary, right? the masters will try all available etcd servers if one is down, right?
>>>>
>>>> i don't know why, but neither of my masters was able to connect to the second etcd instance, even though telnet from their shells worked.. so it was not a network or firewall issue.
>>>>
>>>> best regards.
>>>>
>>>> On Jun 13, 2016, at 17:53, Clayton Coleman <[email protected]> wrote:
>>>>>
>>>>> I have not seen that particular issue. Do you have a load balancer in between your masters and etcd?
>>>>>
>>>>> On Fri, Jun 10, 2016 at 5:55 AM, Julio Saura <[email protected]> wrote:
>>>>>> hello
>>>>>>
>>>>>> i have an origin 3.1 installation working fine so far.
>>>>>>
>>>>>> today one of my etcd nodes (1 of 2) crashed and i started having problems..
>>>>>>
>>>>>> i noticed on one of my master nodes that it was not able to connect to the second etcd server, and that the etcd server was not able to promote itself to leader:
>>>>>>
>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 is starting a new election at term 10048
>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 became candidate at term 10049
>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 received vote from 12c8a31c8fcae0d4 at term 10049
>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 [logterm: 8, index: 4600461] sent vote request to bf80ee3a26e8772c at term 10049
>>>>>> jun 10 11:09:56 openshift-balancer02 etcd[47218]: got unexpected response error (etcdserver: request timed out)
>>>>>>
>>>>>> my masters logged that they were not able to connect to etcd:
>>>>>>
>>>>>> er.go:218] unexpected ListAndWatch error: pkg/storage/cacher.go:161: Failed to list *extensions.Job: error #0: dial tcp X.X.X.X:2379: connection refused
>>>>>>
>>>>>> so i tried a simple test, just telnet from the masters to the etcd node port:
>>>>>>
>>>>>> [root@openshift-master01 log]# telnet X.X.X.X 2379
>>>>>> Trying X.X.X.X...
>>>>>> Connected to X.X.X.X.
>>>>>> Escape character is '^]'
>>>>>>
>>>>>> so i was able to connect from the masters.
>>>>>> i was not able to recover my oc masters until the first etcd node was rebooted.. so it seems my etcd "cluster" does not work without the first node..
>>>>>>
>>>>>> any clue?
>>>>>>
>>>>>> thanks
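A note on the failure mode described above: with only two etcd members this is expected behavior, not a bug. Raft (the consensus protocol etcd uses) requires a majority of members, floor(n/2) + 1, to elect a leader, so a two-member cluster cannot tolerate the loss of either node. The arithmetic can be sketched as follows (function names are illustrative, not part of etcd's API):

```python
def quorum(n: int) -> int:
    """Votes needed for a Raft cluster of n members to elect a leader."""
    return n // 2 + 1

def fault_tolerance(n: int) -> int:
    """How many members can fail while the cluster can still elect a leader."""
    return n - quorum(n)

for n in (1, 2, 3, 5):
    print(f"{n} members: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

For n = 2, quorum is 2 and fault tolerance is 0: losing either node leaves the survivor unable to win an election, which matches both the "has no leader" client errors and the endless candidate/election loop in the etcd log above. This is why etcd clusters are normally deployed with an odd member count (3 or 5).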
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
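Following up on the question at the top of the thread, both the client port (2379) and the peer port (2380) should be reachable between the relevant hosts. A minimal sketch of the same check the telnet test performs, for both ports at once (the host names are placeholders, to be replaced with your own etcd hosts):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (like the telnet test)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder host names; substitute your own etcd members.
for host in ["openshift-balancer01", "openshift-balancer02"]:
    for port in (2379, 2380):  # 2379 = client traffic, 2380 = peer traffic
        print(f"{host}:{port} {'open' if port_open(host, port) else 'closed'}")
```

Note that a successful TCP connect only proves the port is reachable; as this thread shows, etcd can accept connections while still having no leader, so a quorum/health check is needed in addition to a connectivity check.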
