Re: oc cluster up - dns issue?

2016-07-27 Thread Andrew Lau
I had a similar issue on F23 with oc cluster up: the DNS service couldn't be reached by the pods. Restarting firewalld, then docker, fixed it for me. On Wed, 27 Jul 2016 at 21:22 Clayton Coleman wrote: > Is anything already listening on port 80/443/1936 on your host? Did the

daemonsets and nodes schedulable=false

2017-01-23 Thread Andrew Lau
Hi, Is there a way to exclude daemonsets from nodes that are schedulable=false? ___ users mailing list users@lists.openshift.redhat.com http://lists.openshift.redhat.com/openshiftmm/listinfo/users

Re: daemonsets and nodes schedulable=false

2017-01-25 Thread Andrew Lau
ithub.com/kubernetes/kubernetes/issues/29178 > On Mon, Jan 23, 2017 at 4:07 PM, Andrew Lau <and...@andrewklau.com> wrote: > > Hi, > > Is there a way to exclude daemonsets from nodes that are schedulable=false? > > ___ > users mail

Re: Dynamic Provisioning PVs in AWS across Zones

2016-08-15 Thread Andrew Lau
ouldn't find it. > > > > On Mon, Aug 15, 2016 at 5:43 PM, Andrew Lau <and...@andrewklau.com> wrote: > >> This something thats tracked in kubernetes, iirc its something that was >> planned to be fixed in 1.4 >> >> On Tue, 16 Aug 2016, 7:08 AM Isaac Christ

Re: Dynamic Provisioning PVs in AWS across Zones

2016-08-15 Thread Andrew Lau
This is something that's tracked in Kubernetes; IIRC it was planned to be fixed in 1.4. On Tue, 16 Aug 2016, 7:08 AM Isaac Christoffersen < ichristoffer...@vizuri.com> wrote: > I have an Origin v1.2.1 setup in AWS and I have my masters and nodes > spread across zones in AWS. > > I'm

node constraints for builds/terminating pods

2016-09-10 Thread Andrew Lau
Hi all, Is it possible to tweak the scheduler to have a different ruleset for terminating/run-once pods? eg. I want to run all builds on a specific subset of nodes. Thanks

Re: node constraints for builds/terminating pods

2016-09-10 Thread Andrew Lau
On Sun, 11 Sep 2016 at 02:09 Ben Parees <bpar...@redhat.com> wrote: > On Sat, Sep 10, 2016 at 4:03 AM, Andrew Lau <and...@andrewklau.com> wrote: > >> Hi all, >> >> Is it possible to tweak the scheduler to have a different ruleset for >> terminating/r

Re: Router Sharding

2016-09-23 Thread Andrew Lau
There are docs here: - https://docs.openshift.org/latest/architecture/core_concepts/routes.html#router-sharding - https://docs.openshift.org/latest/install_config/router/default_haproxy_router.html#creating-router-shards On Sat, 24 Sep 2016 at 06:13 Srinivas Naga Kotaru (skotaru) <

masters elb configuration

2016-09-30 Thread Andrew Lau
Has anyone had any success running the master (api and console) behind ELB? The new ALB supports web sockets, however spdy isn't supported (although http/2 is): Running oc rsh or oc rsync through ELB ends up with clients getting the respective responses: Error from server: Upgrade request

Re: Handling rolling restarts

2016-10-04 Thread Andrew Lau
; using openshift-ansible, I don't know if you're using that load > balancer or not. > > -- > Scott > > On Sat, Oct 1, 2016 at 2:42 AM, Andrew Lau <and...@andrewklau.com> wrote: > > Is there something like node evacuate for master hosts > > > > If we want to

Re: openshift api swagger 2.0

2016-09-15 Thread Andrew Lau
Cool, thanks! On Thu., 15 Sep. 2016, 12:13 pm Clayton Coleman, <ccole...@redhat.com> wrote: > Coming very soon - 1.4 Kube should have /swagger.json on the root path. > > > On Sep 14, 2016, at 9:44 PM, Andrew Lau <and...@andrewklau.com> wrote: > >

Re: Enabling emptyDir quota on atomic hosts

2016-09-28 Thread Andrew Lau
- gaze upon the setup code here: > > > https://github.com/openshift/vagrant-openshift/blob/master/lib/vagrant-openshift/action/install_origin_base_dependencies.rb#L262 > > I thought there was doc for this but I'm not seeing it in my quick > searches. > > On

Enabling emptyDir quota on atomic hosts

2016-09-27 Thread Andrew Lau
I noticed support for emptyDir volume quota was added in 1.3, is there any documentation on how we can enable this on atomic hosts? Setting gquota in /etc/fstab doesn't apply. "Preliminary support for local emptyDir volume quotas, set this value to a resource quantity representing the desired

Re: Enabling emptyDir quota on atomic hosts

2016-09-28 Thread Andrew Lau
Andrew Lau <and...@andrewklau.com> wrote: > lol - thanks > > Is `/mnt/openshift-xfs-vol-dir` a predefined mount or something? I'm not > seeing this anywhere in the docs either > > > On Wed, 28 Sep 2016 at 11:49 Clayton Coleman <ccole...@redhat.com> wrote: >

Handling rolling restarts

2016-10-01 Thread Andrew Lau
Is there something like node evacuate for master hosts? If we want to restart a master host in an HA cluster, for whatever reason, it'll cause some temporary failures like pod/service DNS lookup. With routers it's been possible to remove it from the external DNS round-robin or LB before performing

hostsubnets after node removed

2016-10-26 Thread Andrew Lau
Hi, When the AWS cloudprovider automatically removes a node after it's terminated in AWS, I noticed we often had a lot of the hostsubnets still around (`oc get hostsubnet`). Should these not be automatically deleted? The nodes no longer exist (`oc get node`).
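
A rough way to spot the leftovers is to compare the two name lists. The sketch below is only an illustration: it assumes each hostsubnet is named after its node, and that the names have already been collected (for example from the two `oc get` commands). The function name and the sample host names are made up.

```python
def orphaned_hostsubnets(hostsubnet_names, node_names):
    """Return hostsubnet names that have no node of the same name."""
    # Assumption: a hostsubnet carries the same name as the node it belongs to.
    return sorted(set(hostsubnet_names) - set(node_names))


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    subnets = ["node-a.example.com", "node-b.example.com", "node-c.example.com"]
    nodes = ["node-a.example.com"]
    for name in orphaned_hostsubnets(subnets, nodes):
        print(name)
```

The result is only a candidate list; whether an entry is safe to delete still needs checking by hand.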

node based resourceoverride

2016-11-06 Thread Andrew Lau
Are there any options for node based resourceoverride (clusterresourceoverride)? eg. we'd like to allow users to choose a specific node selector which would give them guaranteed resources and a different SLA level. project level `quota.openshift.io/cluster-resource-override-enabled=false` doesn't

Re: is resourceVersion unique?

2016-11-09 Thread Andrew Lau
of resources, or different servers. > > On Wed, Nov 9, 2016 at 10:57 PM, Andrew Lau <and...@andrewklau.com> wrote: > > Is the resourceVersion unique across the whole cluster or just for the > particular resource? > >

Re: is resourceVersion unique?

2016-11-09 Thread Andrew Lau
Cluster wide or per namespace? Using the case scenario of watching every project quota across the entire cluster. On Thu, 10 Nov 2016 at 09:34 Jordan Liggitt <jligg...@redhat.com> wrote: > Not guaranteed unique across resource types > > > On Nov 9, 2016, at 1

is resourceVersion unique?

2016-11-09 Thread Andrew Lau
Is the resourceVersion unique across the whole cluster or just for the particular resource?

default node selectors

2016-11-07 Thread Andrew Lau
From the doc examples, node with label disktype: magnetic / ssd. Is there a way to default the selector to be magnetic, while giving the user the option to select ssd?

Re: default node selectors

2016-11-07 Thread Andrew Lau
s. > Awesome, thanks > > > Did we backport that to 1.4? > > Its merged in kube upstream. I did not backport it in 1.4. > Is there a link to the upstream pr? > > > > On Nov 7, 2016, at 1:32 AM, Andrew Lau <and...@andrewklau.com> wrote: > > > >

Re: Referencing labels/annotations in templates?

2016-10-23 Thread Andrew Lau
Liggitt <jligg...@redhat.com> wrote: > Storage classes were intended to be globally visible. One per project > would be both unmanageable and would leak a lot of info about the projects > on the system. > > On Oct 23, 2016, at 7:46 PM, Andrew Lau <and...@andrewklau.

Re: is resourceVersion unique?

2016-11-21 Thread Andrew Lau
Resource version of an object, given to WATCH, is guaranteed to show changes that occur after the version you saw (after the resource version). If you have to cheat, expect that it may be broken in the future (especially if we end up doing sharding). On Wed, Nov 9, 2016 at 5:38 PM, Andrew Lau <an
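
For the quota-watching case raised earlier in this thread, resuming a watch could look roughly like the sketch below. It only builds the URL and passes the resourceVersion back unchanged as an opaque string, never comparing or incrementing it. The host name, the helper name and the sample version are assumptions for illustration; `watch` and `resourceVersion` are the standard query parameters on Kubernetes list endpoints.

```python
from urllib.parse import urlencode


def watch_url(base, resource_path, resource_version):
    """Build a watch URL that resumes after a previously seen resourceVersion."""
    # The version is treated as opaque: it is echoed back, not interpreted.
    query = urlencode({"watch": "true", "resourceVersion": str(resource_version)})
    return f"{base.rstrip('/')}{resource_path}?{query}"


if __name__ == "__main__":
    # Hypothetical master URL and version, for illustration only.
    print(watch_url("https://master.example.com:8443",
                    "/api/v1/resourcequotas", "123456"))
```

The version to pass is whatever the last list or watch event returned for that same resource type, in line with the note that versions are not guaranteed unique across resource types.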

Node triggered evacuation

2016-10-28 Thread Andrew Lau
Hi, Is there any facility to trigger a node evacuation from within the node? eg. if we are in the console of a particular node or the node receives a signal (eg. spot termination notice). Thanks

Re: hostsubnets after node removed

2016-10-27 Thread Andrew Lau
Disregard On Thu, 27 Oct 2016 at 11:44 Andrew Lau <and...@andrewklau.com> wrote: > Hi, > > When the AWS cloudprovider automatically removes a node after it's > terminated in AWS, I noticed we often had a lot of the hostsubnets still > around `oc get h

multi cloudprovider

2016-10-26 Thread Andrew Lau
Does openshift have support for multi cloudproviders (without federation)? eg. if we want to spread a cluster across AWS and oVirt. My concern around such an implementation is the AWS dynamic volume provisioning and masters access key requirements.

Re: multi cloudprovider

2016-10-26 Thread Andrew Lau
Thanks On Wed, 26 Oct 2016 at 22:12 Jason DeTiberus <jdeti...@redhat.com> wrote: > On Oct 26, 2016 4:29 AM, "Andrew Lau" <and...@andrewklau.com> wrote: > > > > Does openshift have support for multi cloudproviders (without > federation)? eg. if we want

Re: Blue/Green control plane upgrade

2017-01-12 Thread Andrew Lau
Thanks On Fri, 6 Jan 2017 at 03:47 Andrew Butcher <abutc...@redhat.com> wrote: > On Tue, Jan 3, 2017 at 8:32 PM, Andrew Lau <and...@andrewklau.com> wrote: > > Hi, > > Has anyone had any experience in upgrading control plane with the > blue/green method (contai

Re: Blue/Green control plane upgrade

2017-01-04 Thread Andrew Lau
er it will take to snapshot to a new > node, which could have significant impacts on the cluster. You might > also put the cluster in undesirable states. > > It's possible, but has some real challenges. > > > On Jan 3, 2017, at 8:35 PM, Andrew Lau <and...@andrewklau.com> wro

Blue/Green control plane upgrade

2017-01-03 Thread Andrew Lau
Hi, Has anyone had any experience in upgrading the control plane with the blue/green method (containerised install)? In preparation for upgrading to 1.4, I want to redeploy parts of the control plane as part of the process. I'm doing tests but would appreciate some feedback and/or battle stories. - Add

Global projects

2017-03-28 Thread Andrew Lau
Hi, Is it possible to create another global project similar to the `openshift` namespace for sharing images/imagestreams/templates? Docs seem to point out that imagestreams and templates are made global, but I couldn't find a reference to images. Are they also made available if pushed to the registry? Thanks

Re: Why I can use insecureEdgeTerminationPolicy: Redirect when I have termination: reencrypt

2017-03-31 Thread Andrew Lau
There is this https://trello.com/c/0BaxAOK9/181-3-re-encrypt-routes-should-support-the-same-features-as-edge-terminations-http-to-https-redirect-etc-demo-ingress On Fri, 31 Mar 2017 at 19:39 Stéphane Klein wrote: > Hi, > > why I can't use: > >

Overlayfs support

2017-04-10 Thread Andrew Lau
Hi, Is overlayfs support available? I read earlier it's not yet recommended due to SELinux issues and something about kernel requirements being backported, but I can't seem to track this down anymore. Thanks

Re: Global projects

2017-03-31 Thread Andrew Lau
ms and templates available from namespaces other than the openshift namespace, but the CLI and UI only look in the current project and the openshift project by default. > On Mar 28, 2017, at 10:12 PM, Andrew Lau <and...@andrewklau.com> wrote: > > Hi, > > Is it possible to cre

Re: Rolling pod evacuation

2017-04-20 Thread Andrew Lau
> number of pods that must be kept running when removing pods voluntarily > (draining nodes is an example of this). But this feature may not be in > OpenShift yet (IIRC draining nodes in Kubernetes honors the > PodDisruptionBudget from version 1.6 onwards). > > On 20. 04. 2
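
As a rough illustration of the PodDisruptionBudget mentioned above, the sketch below builds a manifest as a plain dict and prints it as JSON. The `policy/v1beta1` apiVersion is an assumption matching the Kubernetes 1.6 era discussed in the thread; the object name and labels are made up, and whether a given OpenShift release honours the budget during a drain is not settled here.

```python
import json


def pod_disruption_budget(name, min_available, match_labels):
    """Build a PodDisruptionBudget manifest as a plain dict."""
    return {
        # Assumption: policy/v1beta1 was the API group/version at the time.
        "apiVersion": "policy/v1beta1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Number of matching pods that must stay running during
            # voluntary disruptions such as a node drain.
            "minAvailable": min_available,
            "selector": {"matchLabels": match_labels},
        },
    }


if __name__ == "__main__":
    # Hypothetical name and labels, for illustration only.
    print(json.dumps(pod_disruption_budget("web-pdb", 1, {"app": "web"}), indent=2))
```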

Re: Node report OK but every pod marked unready

2017-04-20 Thread Andrew Lau
Thanks! Hopefully we don't hit this too much until 1.5.0 is released On Fri, 21 Apr 2017 at 01:26 Patrick Tescher <patr...@outtherelabs.com> wrote: > We upgraded to 1.5.0 and that error went away. > > -- > Patrick Tescher > > On Apr 19, 2017, at 10:59 PM, Andrew Lau <a

Re: Rolling pod evacuation

2017-04-20 Thread Andrew Lau
Apr 20, 2017 at 10:11 AM, Andrew Lau <and...@andrewklau.com> > wrote: > >> Is there any way to evacuate a node using the rolling deployment process >> where a the new pod can start up first before being deleted from the >> current node? >> >> Drain see

Rolling pod evacuation

2017-04-20 Thread Andrew Lau
Is there any way to evacuate a node using the rolling deployment process where the new pod can start up first before being deleted from the current node? Drain seems to only delete the pod straight away. If there is a grace period set, it would be nice if the new pod could at least have its

Node report OK but every pod marked unready

2017-04-19 Thread Andrew Lau
I'm trying to debug a weird scenario where a node has had every pod crash with the error: "rpc error: code = 2 desc = shim error: context deadline exceeded". The pods stayed in the state Ready 0/1. The docker daemon was responding and the kubelet and all its services were running. The node was

Re: Node report OK but every pod marked unready

2017-04-20 Thread Andrew Lau
how_bug.cgi?id=1427212 > > > On Thu, 20 Apr 2017 at 15:41 Tero Ahonen <taho...@redhat.com> wrote: > >> Hi >> >> Did u try to ssh to that node and execute sudo docker run to some >> container? >> >> .t >> >> Sent from my iPhone >> >

Re: Rolling pod evacuation

2017-04-23 Thread Andrew Lau
ukša <marko.lu...@gmail.com> > wrote: > >> He isn't performing a rolling upgrade; he just wants to drain a node one >> pod at a time. >> >> On 21. 04. 2017 09:18, Michail Kargakis wrote: >> >> If you used those settings and it wasn't honoured then it's a bu

Fieldselector

2017-03-30 Thread Andrew Lau
Hi, The docs on how to properly use the fieldSelector seem to be missing. /api/v1/namespaces?fieldSelector=metadata.name=test seems to work fine; is there a way to dig into annotations and get details with special chars like "openshift.io/requester"? Thanks
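
A small sketch of both halves of the question: building the fieldSelector URL from the example above with the value percent-encoded, and, as a fallback, filtering on the annotation client-side after listing. Whether the server accepts annotations in a fieldSelector is not established in this thread, so the second helper assumes it does not. The helper names, host and sample data are made up.

```python
from urllib.parse import quote


def namespaces_url(base, field, value):
    """Build a namespace list URL with a fieldSelector query parameter."""
    # '=' is kept readable as in the example; other unsafe characters
    # (including '/') are percent-encoded.
    selector = quote(f"{field}={value}", safe="=")
    return f"{base.rstrip('/')}/api/v1/namespaces?fieldSelector={selector}"


def requested_by(namespaces, user):
    """Client-side filter on the openshift.io/requester annotation."""
    return [
        ns["metadata"]["name"]
        for ns in namespaces
        if ns["metadata"].get("annotations", {}).get("openshift.io/requester") == user
    ]


if __name__ == "__main__":
    # Hypothetical host and namespace data, for illustration only.
    print(namespaces_url("https://master.example.com:8443", "metadata.name", "test"))
    items = [
        {"metadata": {"name": "test", "annotations": {"openshift.io/requester": "alice"}}},
        {"metadata": {"name": "default"}},
    ]
    print(requested_by(items, "alice"))
```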

High number of 4xx requests on etcd (3.6 upgrade)

2017-08-12 Thread Andrew Lau
Post upgrade to 3.6 I'm noticing the API server seems to be responding a lot slower and my etcd metric etcd_http_failed_total is returning a large number of failed GET requests. Has anyone seen this?

Re: High number of 4xx requests on etcd (3.6 upgrade)

2017-08-12 Thread Andrew Lau
etcd data is on dedicated drives and aws reports idle and burst capacity around 90% On Sun, 13 Aug 2017 at 00:28 Clayton Coleman <ccole...@redhat.com> wrote: > Check how much IO is being used by etcd and how much you have provisioned. > > > > On Aug 12, 2017, a

Re: High number of 4xx requests on etcd (3.6 upgrade)

2017-08-13 Thread Andrew Lau
I found an upstream issue on kubernetes https://github.com/kubernetes/kubernetes/issues/48998 Would volume mount errors contribute to the 404 count? On Mon, 14 Aug 2017 at 11:55 Andrew Lau <and...@andrewklau.com> wrote: > On Mon, 14 Aug 2017 at 02:10 Clayton Coleman <ccole...@redha

Re: High number of 4xx requests on etcd (3.6 upgrade)

2017-08-13 Thread Andrew Lau
On Mon, 14 Aug 2017 at 02:10 Clayton Coleman <ccole...@redhat.com> wrote: > > > On Aug 13, 2017, at 12:07 AM, Andrew Lau <and...@andrewklau.com> wrote: > > I ended up finding that one of my rules had a wrong filter that was > returning a +inf value. Most of the

Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-11 Thread Andrew Lau
Also might want to check if your nodes have any OS updates; I found Network Manager 1.4.0-19.el7_3 has a memory leak which appears over time. There was a recent devicemapper update too I believe. On Wed, 12 Jul 2017 at 07:59 Andrew Lau <and...@andrewklau.com> wrote: > Try restarting or

Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-17 Thread Andrew Lau
I see this too. It only started happening after mixing 1.5 and 1.4 nodes. I think you are also doing the same thing since the SDN bug fix never made it into a release. On Mon., 17 Jul. 2017, 10:08 pm Stéphane Klein, wrote: > > 2017-07-17 17:03 GMT+02:00 Stéphane Klein

Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-25 Thread Andrew Lau
I think your issue may come from https://github.com/kubernetes/kubernetes/issues/38498: too many orphaned volumes causing the timeout. I guess the downgrade doesn't help with the increased number of volumes(?) On Wed, 19 Jul 2017 at 05:39 Philippe Lafoucrière <

BuildRequest name parameter

2017-07-26 Thread Andrew Lau
What is the purpose of the {name} URL parameter for the instantiate in BuildRequests? ie. POST /oapi/v1/namespaces/{namespace}/buildconfigs/{name}/instantiate https://docs.openshift.org/latest/rest_api/openshift_v1.html#create-instantiate-of-a-buildrequest The docs seem to suggest it's the name

Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-11 Thread Andrew Lau
Try restarting origin-node; it seemed to fix this issue for me. Also sometimes those mount errors are actually harmless. It happens when one of the controllers had been restarted but didn't sync the status. There's a fix upstream, but I think it only landed in 1.7. The volume is already mounted but

Catching kill due to oom

2017-07-02 Thread Andrew Lau
Hi, I'm often seeing issues where builds are getting killed due to oom. I'm hoping to get some ideas on ways we could perhaps catch the OOM for the purpose of displaying some sort of useful message. Based on what I am seeing, a SIGKILL is being sent to the container, so it's not possible to
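
Since the SIGKILL cannot be handled inside the container, one option is to look at the pod status from outside after the fact. The sketch below assumes the pod JSON has already been fetched (for example with `oc get pod ... -o json`) and that the runtime reports the termination reason as "OOMKilled"; the function name and sample data are made up.

```python
def oom_killed_containers(pod):
    """Return names of containers whose current or last state is an OOM kill."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # A restarted container keeps the previous termination in lastState.
        for key in ("state", "lastState"):
            terminated = status.get(key, {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                names.append(status["name"])
                break
    return names


if __name__ == "__main__":
    # Hypothetical build pod status, for illustration only.
    pod = {
        "status": {
            "containerStatuses": [
                {
                    "name": "sti-build",
                    # Exit code 137 is 128 + 9, i.e. killed by SIGKILL.
                    "state": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
                }
            ]
        }
    }
    print(oom_killed_containers(pod))
```

A wrapper around the build could use a result like this to show a "build ran out of memory" message instead of a bare failure.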

Re: BuildRequest name parameter

2017-08-06 Thread Andrew Lau
On Fri, 28 Jul 2017 at 01:11 Ben Parees <bpar...@redhat.com> wrote: > On Thu, Jul 27, 2017 at 9:23 AM, Andrew Lau <and...@andrewklau.com> wrote: > >> On Thu, 27 Jul 2017 at 22:52 Ben Parees <bpar...@redhat.com> wrote: >> >>> On Thu, Jul 27, 2017

Re: BuildRequest name parameter

2017-08-06 Thread Andrew Lau
On Mon, 7 Aug 2017 at 09:11 Ben Parees <bpar...@redhat.com> wrote: > On Sun, Aug 6, 2017 at 4:42 AM, Andrew Lau <and...@andrewklau.com> wrote: > >> >> On Fri, 28 Jul 2017 at 01:11 Ben Parees <bpar...@redhat.com> wrote: >> >>> On Thu, Jul 27,

oauth token info

2017-06-12 Thread Andrew Lau
Is there an endpoint to retrieve the current token information? ie. /oapi/v1/users/~ seems to be an undocumented way to get the current user information. I'm looking to obtain the expiry time on the current token being used.

Re: Global projects

2017-06-22 Thread Andrew Lau
>> >> >> On Mar 28, 2017 22:34, "Jordan Liggitt" <jligg...@redhat.com> wrote: >> >> Images exist outside namespaces. They are accessed via imagestreams, >> which are namespaced. >> >> You can make imagestreams and templates available from

Re: Global projects

2017-06-22 Thread Andrew Lau
> Regards, > > Frédéric > > On Thu, Jun 22, 2017 at 9:50 AM, Andrew Lau <and...@andrewklau.com> wrote: > >> Was there an extra step you used before >> >> oc policy add-role-to-group shared-resource-viewer system:authenticated >> --role-namespace=common >

Re: Global projects

2017-06-22 Thread Andrew Lau
22, 2017 at 10:23 AM, Andrew Lau <and...@andrewklau.com> > wrote: > >> The namespace "common" does exist. >> >> On Thu, 22 Jun 2017 at 18:17 Frederic Giloux <fgil...@redhat.com> wrote: >> >>> Hi Andrew >>> >>> as per G

MustRunAsRange vs MustRunAsNonRoot

2017-05-20 Thread Andrew Lau
I'm looking for some clarification on the difference between MustRunAsRange and MustRunAsNonRoot. MustRunAsRange seems to be the cluster default; this allows containers to run even if they do not have a USER definition. Many pages seem to tout that OpenShift does not run containers as root, and

Re: MustRunAsRange vs MustRunAsNonRoot

2017-05-20 Thread Andrew Lau
If the image has a > user > 0 (numeric) it'll be allowed through, otherwise the pod should > be failed. > > Now that we have the image resolver, it would certainly be possible to > do that calculation via image resolution and report it to get early > rejection. > > > On

Re: Pods has connectivity to other pod and service only when I run an additional pod

2017-05-23 Thread Andrew Lau
Yes, I believe you can. Otherwise you wouldn't be able to handle rolling updates On Wed, 24 May 2017 at 00:00 Philippe Lafoucrière < philippe.lafoucri...@tech-angels.com> wrote: > Do you know if it's possible to run 1.4 nodes with 1.5 masters? > We need to start rolling back, we have too many

Re: import-image from imagestream

2017-05-30 Thread Andrew Lau
/14404 to track > this issue. > > Cheers, > Maciej > > On Tue, May 30, 2017 at 9:52 AM, Andrew Lau <and...@andrewklau.com> wrote: > >> >> >> On Tue, 30 May 2017 at 17:46 Office ME2Digtial e. U. < >> off...@me2digital.eu> wrote: >> >

Re: import-image from imagestream

2017-05-30 Thread Andrew Lau
> name: '$*IS_DEV_PROJECT_NAME*:tag-youchoose' > > > My 2¢ > > On Tue, May 30, 2017 at 9:52 AM, Andrew Lau <and...@andrewklau.com> wrote: > > > On Tue, 30 May 2017 at 17:46 Office ME2Digtial e. U. <off...@me2digital.eu> > wrote: > Hi Andrew

garbage collection docker metadata

2017-06-09 Thread Andrew Lau
Does garbage collection get triggered when the docker metadata storage is full? Every few days I see some nodes fail to create new containers due to the docker metadata storage being full. Docker data storage has plenty of capacity. I've been cleaning out the images manually as the garbage

Re: garbage collection docker metadata

2017-06-09 Thread Andrew Lau
On Fri, 9 Jun 2017 at 21:10 Aleksandar Lazic <al...@me2digital.eu> wrote: > Hi Andrew Lau. > > on Freitag, 09. Juni 2017 at 12:35 was written: > > > Does garbage collection get triggered when the docker metadata storage is > full? Every few days I see some nodes fail t

Re: garbage collection docker metadata

2017-06-09 Thread Andrew Lau
:11 Fernando Lozano <floz...@redhat.com> wrote: > If the Docker GC complains images are in use and you get out of disk space > errors, I'd assume you need more space for docker storage. > > On Fri, Jun 9, 2017 at 8:37 AM, Andrew Lau <and...@andrewklau.com> wrote: > >

Re: Getting "error: unexpected EOF" while checking logs on a single pod

2017-06-11 Thread Andrew Lau
Are you running any load balancers in front of your masters? I see this happen more when you are connecting to the API through a loadbalancer. On Sun, 11 Jun 2017 at 08:02 G. Jones wrote: > I have a pod that's constantly restarting (Hawkular Metrics), pretty much >

Re: oauth token info

2017-06-13 Thread Andrew Lau
Thanks! That's what I was looking for. On Wed, 14 Jun 2017 at 01:37 Clayton Coleman <ccole...@redhat.com> wrote: > /oauth/info should return info about the token you pass as Authorization: > Bearer > > On Mon, Jun 12, 2017 at 9:38 PM, Andrew Lau <and...@
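
For reference, a request along the lines Clayton describes could be prepared like this. The sketch only constructs the request object and does not send it; the master URL, helper name and token value are placeholders, and the shape of the response (including where the expiry appears) is not assumed here.

```python
import urllib.request


def token_info_request(master_url, token):
    """Prepare a GET request to /oauth/info carrying the token as a Bearer header."""
    return urllib.request.Request(
        f"{master_url.rstrip('/')}/oauth/info",
        headers={"Authorization": f"Bearer {token}"},
    )


if __name__ == "__main__":
    # Hypothetical master URL and token, for illustration only.
    req = token_info_request("https://master.example.com:8443", "TOKEN")
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```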

Re: Overlayfs support

2017-05-02 Thread Andrew Lau
By "landed in 7.3", is there any issue/BZ I can follow, or would it not be until RHEL 7.4 or OpenShift 3.6? On Wed, 12 Apr 2017 at 10:03 Subhendu Ghosh wrote: > I guess I was looking for node configuration playbooks that could be used > for blue green node roll out. > >

Re: Pods has connectivity to other pod and service only when I run an additional pod

2017-06-27 Thread Andrew Lau
Will there be another 1.5 release now that https://github.com/openshift/origin/pull/14801 has merged? On Wed, 24 May 2017 at 00:00 Philippe Lafoucrière < philippe.lafoucri...@tech-angels.com> wrote: > Do you know if it's possible to run 1.4 nodes with 1.5 masters? > We need to start rolling