Re: 503 service unavailable

Ram Ranganathan Thu, 03 Mar 2016 00:49:56 -0800

Yeah, that should not matter. The routes + namespaces you would see are
based on the permissions of the service account.


I was able to get Dean on IRC and ssh into his instance, and I'm seeing
something wonky with the permissions.
CCing Jordan and Paul for some help.

Inside the router container, I tried running this:
curl -k -vvv https://127.0.0.1:8443/api/v1/endpoints \
    -H "Authorization: Bearer $(</var/run/secrets/kubernetes.io/serviceaccount/token)"

That call returns the endpoints if the token has permission, but I get a 403
error back:
"message": "User \"system:serviceaccount:default:router\" cannot list all
endpoints in the cluster",
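
The same check can be reduced to just the HTTP status, which makes it easier to tell a rejected token (401) from an accepted token that lacks permission (403). This is a minimal sketch, not something run in the original session: the function names explain_status and check_endpoints are made up for illustration, and the URL and token path are simply the ones from the curl command above.

```shell
#!/bin/sh
# Hypothetical helper names; URL and token path are taken from the curl
# command in this message.
TOKEN_FILE=/var/run/secrets/kubernetes.io/serviceaccount/token
API=https://127.0.0.1:8443

# Map an HTTP status code to a short, human-readable diagnosis.
explain_status() {
  case "$1" in
    200) echo "ok: token can list endpoints cluster-wide" ;;
    401) echo "unauthorized: token missing or not accepted" ;;
    403) echo "forbidden: token accepted but the account lacks permission" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Query the endpoints API with the mounted token and print only the diagnosis.
# -k skips TLS verification, -s silences progress, -o discards the body,
# -w prints the status code.
check_endpoints() {
  code=$(curl -k -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $(cat "$TOKEN_FILE")" \
    "$API/api/v1/endpoints")
  explain_status "$code"
}
```

Running check_endpoints inside the router container should print the "forbidden" line for the situation described here, since the API server answered with a 403.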


However, oadm policy shows that the router service account has those
permissions.

On the host, running:
$ oadm policy who-can get endpoints

lists the router service account in its output:
http://fpaste.org/332733/45699454/
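
Note that the 403 message complains about "list", while the who-can check above asks about "get". A sketch that repeats the check for each verb the router is likely to need might look like the following. It assumes the router lists and watches endpoints in addition to getting them, and the helper names lists_account and check_verbs are hypothetical.

```shell
#!/bin/sh
# Service account name as it appears in the 403 message above.
SA="system:serviceaccount:default:router"

# Succeeds if the account name appears anywhere in the text on stdin.
lists_account() {
  grep -qF "$SA"
}

# Ask who-can once per verb and report whether the router account shows up.
check_verbs() {
  for verb in get list watch; do
    if oadm policy who-can "$verb" endpoints | lists_account; then
      echo "$verb endpoints: $SA is listed"
    else
      echo "$verb endpoints: $SA is NOT listed"
    fi
  done
}
```

If the account is listed for get but not for list, that would be consistent with the error returned by the curl call.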


The token from inside the router container
(/var/run/secrets/kubernetes.io/serviceaccount/token) seems to work if I use it
with oc login but not with the curl command, so it feels a bit odd. Any
ideas what's amiss here?
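
One way to narrow this down is to look at what the token itself claims. Service account tokens are JSON Web Tokens, so the middle segment is base64url-encoded JSON that names the namespace and service account. The sketch below decodes that segment; it assumes a base64 command that accepts -d (as GNU coreutils does), and jwt_payload is a hypothetical helper name.

```shell
#!/bin/sh
# Print the decoded payload (second dot-separated segment) of a JWT.
jwt_payload() {
  # Take the middle segment and convert base64url to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url normally omits.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

Inside the router container this could be called as jwt_payload "$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" to confirm that the token really belongs to system:serviceaccount:default:router.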

Thanks,

Ram//



On Wed, Mar 2, 2016 at 11:56 PM, Dean Peterson <[email protected]>
wrote:

> The router is in the default namespace, but the service pods are running in a
> different namespace.
>
> On Thu, Mar 3, 2016 at 1:53 AM, Julio Saura <[email protected]> wrote:
>
>> It seems your router is running in the default namespace. Are your pods also
>> running in the default namespace?
>>
>>
>> On 3 Mar 2016, at 7:58, Dean Peterson <[email protected]>
>> wrote:
>>
>> I did do an "oc edit scc privileged" and made sure this was at the end:
>>
>> users:
>> - system:serviceaccount:openshift-infra:build-controller
>> - system:serviceaccount:management-infra:management-admin
>> - system:serviceaccount:default:router
>> - system:serviceaccount:default:registry
>>
>> router has always been a privileged user service account.
>>
>> On Thu, Mar 3, 2016 at 12:55 AM, Ram Ranganathan <[email protected]>
>> wrote:
>>
>>> So you have no app-level backends in that gist (haproxy.config file).
>>> That would explain the 503s: there's nothing there for haproxy to route
>>> to.  Most likely it's because the router service account has no permission
>>> to get the routes/endpoints info from etcd.
>>> Check that the router service account (router by default, or whatever
>>> service account you used to start the router) is
>>> part of the privileged SCC and has read permissions to etcd.
>>>
>>>
>>> On Wed, Mar 2, 2016 at 10:43 PM, Dean Peterson <[email protected]>
>>> wrote:
>>>
>>>> I created a public gist from the output:
>>>> https://gist.github.com/deanpeterson/76aa9abf2c7fa182b56c
>>>>
>>>> On Thu, Mar 3, 2016 at 12:35 AM, Ram Ranganathan <[email protected]>
>>>> wrote:
>>>>
>>>>> You shouldn't need to restart the router. It should have created a new
>>>>> deployment and redeployed the router.
>>>>> So looks like the cause for your 503 errors is something else.
>>>>>
>>>>> Can you check that your haproxy.config file is correct (has the
>>>>> correct backends and servers)?
>>>>> Either nsenter into your router docker container and cat the file, or
>>>>> run:
>>>>>     oc exec <router-pod-name> cat /var/lib/haproxy/conf/haproxy.config
>>>>>    #  router-pod-name as shown in oc get pods
>>>>>
>>>>> Ram//
>>>>>
>>>>> On Wed, Mar 2, 2016 at 10:10 PM, Dean Peterson <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> I ran that "oc env dc router RELOAD_INTERVAL=5s" but I still get the
>>>>>> 503 error.  Do I need to restart anything?
>>>>>>
>>>>>> On Wed, Mar 2, 2016 at 11:47 PM, Ram Ranganathan <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Dean, we did have a recent change to coalesce router reloads
>>>>>>> (default is 0s), and it looks like with that default we are more
>>>>>>> aggressive with the reloads, which could be causing this problem.
>>>>>>>
>>>>>>> Could you please try setting an environment variable, like so:
>>>>>>>     oc env dc router RELOAD_INTERVAL=5s
>>>>>>>        #  or even 2s or 3s  - that's the reload interval in seconds btw
>>>>>>>        # if you have a custom deployment config, replace the dc name
>>>>>>>        # router with that deployment config name.
>>>>>>>
>>>>>>> and see if that helps.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 2, 2016 at 6:21 PM, Dean Peterson <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Is there another place I can look to track down the problem?  The
>>>>>>>> router logs don't say much, just: " Router is including routes in
>>>>>>>> all namespaces"
>>>>>>>>
>>>>>>>> On Wed, Mar 2, 2016 at 7:39 PM, Dean Peterson <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> All it says is: " Router is including routes in all namespaces"
>>>>>>>>>  That's it.
>>>>>>>>>
>>>>>>>>> On Wed, Mar 2, 2016 at 7:38 PM, Clayton Coleman <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> What do the router logs say?
>>>>>>>>>>
>>>>>>>>>> On Mar 2, 2016, at 7:43 PM, Dean Peterson <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> This is as close to having OpenShift Origin set up perfectly as I
>>>>>>>>>> have gotten.  My builds work great, container deployments always
>>>>>>>>>> work now.  I thought I was finally going to have a smooth running
>>>>>>>>>> OpenShift; I just need to get past this last router issue.  It makes
>>>>>>>>>> little sense.  I have set up a router many times before and never had
>>>>>>>>>> this issue.  I've had issues with other parts of the system but never
>>>>>>>>>> the router.
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 2, 2016 at 6:34 PM, Dean Peterson <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I have a number of happy pods.  They are all running normally.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 2, 2016 at 6:28 PM, Mohamed Lrhazi <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Click on a pod and get to its log and events tabs.... see if
>>>>>>>>>>>> they are actually happy or stuck on something...
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 2, 2016 at 7:03 PM, Dean Peterson <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I have successfully started the HAProxy router.  I have a pod
>>>>>>>>>>>>> running, yet all my routes take me to a 503 service unavailable
>>>>>>>>>>>>> error page.  I updated my resolv.conf file to have my master IP as
>>>>>>>>>>>>> nameserver; I've never had this problem on previous versions.  I
>>>>>>>>>>>>> installed OpenShift Origin 1.1.3 with Ansible; everything seems to be
>>>>>>>>>>>>> running smoothly like before, but I just get 503 service unavailable
>>>>>>>>>>>>> errors trying to visit any route.
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> users mailing list
>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>> --
>>>>>>> Ram//
>>>>>>> main(O,s){s=--O;10<putchar(3^O?97-(15&7183>>4*s)*(O++?-1:1):10)&&\
>>>>>>> main(++O,s++);}


-- 
Ram//
main(O,s){s=--O;10<putchar(3^O?97-(15&7183>>4*s)*(O++?-1:1):10)&&\
main(++O,s++);}
