etcdctl -C https://openshift-balancer01:2379,https://openshift-balancer02:2379 
--ca-file=/etc/origin/master/master.etcd-ca.crt 
--cert-file=/etc/origin/master/master.etcd-client.crt 
--key-file=/etc/origin/master/master.etcd-client.key member list


12c8a31c8fcae0d4: name=openshift-balancer02 peerURLs=https://XXXX:2380 
clientURLs=https://XXXX:2379
bf80ee3a26e8772c: name=openshift-balancer01 peerURLs=https://XXXX:2380 
clientURLs=https://XXXX:2379



member list is ok

cluster health tells me what i already know :(

etcdctl -C https://openshift-balancer01:2379,https://openshift-balancer02:2379 
--ca-file=/etc/origin/master/master.etcd-ca.crt 
--cert-file=/etc/origin/master/master.etcd-client.crt 
--key-file=/etc/origin/master/master.etcd-client.key cluster-health

member 12c8a31c8fcae0d4 is unhealthy: got unhealthy result from 
https://XXXX:2379
failed to check the health of member bf80ee3a26e8772c on https://XXXX:2379: Get 
https://XXXX:2379/health: dial tcp XXXX:2379: i/o timeout
member bf80ee3a26e8772c is unreachable: [https://XXXX:2379] are all unreachable

the "main etcd" is halted right now 
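
a note on why the surviving member reports unhealthy: etcd needs a majority (quorum) of the configured members to elect a leader, and for 2 members that majority is 2. The sketch below is only illustrative: it parses the `cluster-health` text quoted above and compares reachable members against the quorum. The helper names `quorum` and `summarize` are made up for this example, and the regular expression assumes the v2 `etcdctl` output format shown in this mail.

```python
import re


def quorum(members: int) -> int:
    """Raft majority: strictly more than half of the configured members."""
    return members // 2 + 1


def summarize(health_output: str) -> dict:
    """Map member id -> state from `etcdctl cluster-health` text output."""
    states = {}
    for line in health_output.splitlines():
        # Only lines of the form "member <id> is <state>..." are considered;
        # the "failed to check the health of member ..." line is skipped.
        m = re.match(r"member (\w+) is (healthy|unhealthy|unreachable)", line)
        if m:
            states[m.group(1)] = m.group(2)
    return states


# Output copied from the cluster-health run above (addresses masked as XXXX).
sample = """\
member 12c8a31c8fcae0d4 is unhealthy: got unhealthy result from https://XXXX:2379
failed to check the health of member bf80ee3a26e8772c on https://XXXX:2379
member bf80ee3a26e8772c is unreachable: [https://XXXX:2379] are all unreachable
"""

states = summarize(sample)
total = len(states)
reachable = sum(1 for s in states.values() if s != "unreachable")

print(f"{total} members, quorum {quorum(total)}, reachable {reachable}")
if reachable < quorum(total):
    print("no quorum: the remaining member cannot elect a leader")
```

with 3 members the quorum is still 2, so a cluster of that size could lose one member and keep a leader.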

Thanks!





> On 21 Jun 2016, at 17:45, Julio Saura <[email protected]> wrote:
> 
> regarding the certs, i used ansible to install origin so i guess ansible 
> should have done it right …
> 
> 
>> On 21 Jun 2016, at 15:29, Julio Saura <[email protected]> wrote:
>> 
>> hello
>> 
>> yes, they are synced with an internal NTP server .. 
>> 
>> gonna try etcdctl thanks!
>> 
>> 
>>> On 21 Jun 2016, at 15:20, Jason DeTiberus <[email protected]> wrote:
>>> 
>>> On Tue, Jun 21, 2016 at 7:28 AM, Julio Saura <[email protected]> wrote:
>>>> yes
>>>> 
>>>> working
>>>> 
>>>> [root@openshift-master01 ~]# telnet XXXXX 2380
>>>> Trying XXXX...
>>>> Connected to XXXX.
>>>> Escape character is '^]'.
>>>> ^CConnection closed by foreign host.
>>> 
>>> 
>>> Have you verified that time is synced between the hosts? I'd also check
>>> the peer certs between the hosts... Can you connect to the hosts using
>>> etcdctl? There should be a status command that will give you more
>>> information.
>>> 
>>>> 
>>>> 
>>>> On 21 Jun 2016, at 13:21, Jason DeTiberus <[email protected]> wrote:
>>>> 
>>>> Did you verify connectivity over the peering port as well (2380)?
>>>> 
>>>> On Jun 21, 2016 7:17 AM, "Julio Saura" <[email protected]> wrote:
>>>>> 
>>>>> hello
>>>>> 
>>>>> same problem
>>>>> 
>>>>> jun 21 13:11:03 openshift-master01 atomic-openshift-master-api[59618]:
>>>>> F0621 13:11:03.155246   59618 auth.go:141] error #0: dial tcp XXXX:2379:
>>>>> connection refused ( the one i rebooted )
>>>>> jun 21 13:11:03 openshift-master01 atomic-openshift-master-api[59618]:
>>>>> error #1: client: etcd member https://YYYY:2379 has no leader
>>>>> 
>>>>> i rebooted the etcd server and my master is not able to use the other one
>>>>> 
>>>>> still able to connect from both masters using telnet to the etcd port ..
>>>>> 
>>>>> any clue? this is weird.
>>>>> 
>>>>> 
>>>>>> On 14 Jun 2016, at 9:28, Julio Saura <[email protected]> wrote:
>>>>>> 
>>>>>> hello
>>>>>> 
>>>>>> yes, it is correct .. it was the first thing i checked ..
>>>>>> 
>>>>>> first master
>>>>>> 
>>>>>> etcdClientInfo:
>>>>>>   ca: master.etcd-ca.crt
>>>>>>   certFile: master.etcd-client.crt
>>>>>>   keyFile: master.etcd-client.key
>>>>>>   urls:
>>>>>>     - https://openshift-balancer01:2379
>>>>>>     - https://openshift-balancer02:2379
>>>>>> 
>>>>>> 
>>>>>> second master
>>>>>> 
>>>>>> etcdClientInfo:
>>>>>>   ca: master.etcd-ca.crt
>>>>>>   certFile: master.etcd-client.crt
>>>>>>   keyFile: master.etcd-client.key
>>>>>>   urls:
>>>>>>     - https://openshift-balancer01:2379
>>>>>>     - https://openshift-balancer02:2379
>>>>>> 
>>>>>> dns names resolve in both masters
>>>>>> 
>>>>>> Best regards and thanks!
>>>>>> 
>>>>>> 
>>>>>>> On 13 Jun 2016, at 18:45, Scott Dodson <[email protected]> wrote:
>>>>>>> 
>>>>>>> Can you verify that the connection information in the etcdClientInfo
>>>>>>> section of /etc/origin/master/master-config.yaml is correct?
>>>>>>> 
>>>>>>> On Mon, Jun 13, 2016 at 11:56 AM, Julio Saura <[email protected]> wrote:
>>>>>>>> hello
>>>>>>>> 
>>>>>>>> yes.. i have an external balancer in front of my masters for HA as doc
>>>>>>>> says.
>>>>>>>> 
>>>>>>>> i don’t have any balancer in front of my etcd servers for masters
>>>>>>>> connection, it’s not necessary right? masters will try all available
>>>>>>>> etcd servers if one is down, right?
>>>>>>>> 
>>>>>>>> i don’t know why but none of my masters were able to connect to the
>>>>>>>> second etcd instance, but using telnet from their shell worked .. so
>>>>>>>> it was not a network or firewall issue..
>>>>>>>> 
>>>>>>>> 
>>>>>>>> best regards.
>>>>>>>> 
>>>>>>>>> On 13 Jun 2016, at 17:53, Clayton Coleman <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> I have not seen that particular issue.  Do you have a load balancer
>>>>>>>>> in between your masters and etcd?
>>>>>>>>> 
>>>>>>>>> On Fri, Jun 10, 2016 at 5:55 AM, Julio Saura <[email protected]> wrote:
>>>>>>>>>> hello
>>>>>>>>>> 
>>>>>>>>>> i have an origin 3.1 installation working cool so far
>>>>>>>>>> 
>>>>>>>>>> today one of my etcd nodes ( 1 of 2 ) crashed and i started having
>>>>>>>>>> problems..
>>>>>>>>>> 
>>>>>>>>>> i noticed on one of my master nodes that it was not able to connect
>>>>>>>>>> to the second etcd server and that the etcd server was not able to
>>>>>>>>>> promote itself to leader..
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 is
>>>>>>>>>> starting a new election at term 10048
>>>>>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4
>>>>>>>>>> became candidate at term 10049
>>>>>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4
>>>>>>>>>> received vote from 12c8a31c8fcae0d4 at term 10049
>>>>>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4
>>>>>>>>>> [logterm: 8, index: 4600461] sent vote request to bf80ee3a26e8772c
>>>>>>>>>> at term 10049
>>>>>>>>>> jun 10 11:09:56 openshift-balancer02 etcd[47218]: got unexpected
>>>>>>>>>> response error (etcdserver: request timed out)
>>>>>>>>>> 
>>>>>>>>>> my masters logged that they were not able to connect to the etcd
>>>>>>>>>> 
>>>>>>>>>> er.go:218] unexpected ListAndWatch error: pkg/storage/cacher.go:161:
>>>>>>>>>> Failed to list *extensions.Job: error #0: dial tcp X.X.X.X:2379:
>>>>>>>>>> connection refused
>>>>>>>>>> 
>>>>>>>>>> so i tried a simple test, just telnet from masters to the etcd node
>>>>>>>>>> port ..
>>>>>>>>>> 
>>>>>>>>>> [root@openshift-master01 log]# telnet X.X.X.X 2379
>>>>>>>>>> Trying X.X.X.X...
>>>>>>>>>> Connected to X.X.X.X.
>>>>>>>>>> Escape character is '^]'.
>>>>>>>>>> 
>>>>>>>>>> so i was able to connect from masters.
>>>>>>>>>> 
>>>>>>>>>> i was not able to recover my oc masters until the first etcd node
>>>>>>>>>> rebooted .. so it seems my etcd “cluster” is not working without the
>>>>>>>>>> first node ..
>>>>>>>>>> 
>>>>>>>>>> any clue?
>>>>>>>>>> 
>>>>>>>>>> thanks
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> _______________________________________________
>>>>>>>>>> users mailing list
>>>>>>>>>> [email protected]
>>>>>>>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Jason DeTiberus
>> 
> 

