Ahh, I looked into all the objects that were getting deleted, and they all
have an ownerReference, e.g.:

"ownerReferences": [
    {
        "apiVersion": "template.openshift.io/v1",
        "kind": "TemplateInstance",
        "name": "75c0ccd3-642e-4035-a5cf-3c27e54cae40",
        "uid": "a7301596-f41a-11e7-88e5-fa163eb8ca3a",
        "blockOwnerDeletion": true
    }
]

That looks like what the patch is about. I also found that if I tried to
edit an object and remove the ownerReference, it triggered a garbage
collection on the spot and all the resources evaporated.

So I guess my workaround can be: run the template, wait for everything to
deploy, export all templated resources to JSON, strip out ownerReferences,
and create all the resources again.
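
The stripping step could be sketched roughly like this in Python. This is
only a sketch under assumptions: the export is assumed to be a JSON file
containing either a single resource or a wrapper with an "items" list, and
the function names and file names (exported.json, stripped.json) are made up
for illustration. It only removes metadata.ownerReferences; other fields may
also need cleaning before the resources can be created again.

```python
import copy
import json
import sys


def strip_owner_references(resource):
    """Return a deep copy of one resource without metadata.ownerReferences."""
    cleaned = copy.deepcopy(resource)
    # Tolerate resources with a missing or null metadata block.
    (cleaned.get("metadata") or {}).pop("ownerReferences", None)
    return cleaned


def strip_export(doc):
    """Strip ownerReferences from a single resource or from every item
    of a wrapper object that carries an "items" list."""
    if isinstance(doc.get("items"), list):
        cleaned = copy.deepcopy(doc)
        cleaned["items"] = [strip_owner_references(i) for i in doc["items"]]
        return cleaned
    return strip_owner_references(doc)


# Hypothetical usage: python strip_owners.py exported.json stripped.json
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as src:
        exported = json.load(src)
    with open(sys.argv[2], "w") as dst:
        json.dump(strip_export(exported), dst, indent=4)
```

The input is copied rather than modified in place, so the original export
stays available for comparison with the recreated resources.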

On Mon, Jan 8, 2018 at 12:30 PM Joel Pearson <japear...@agiledigital.com.au>
wrote:

> Hmm, in my case I don't need to restart to cause the problem to
> happen. Is there some way to run nightlies of openshift:release-3.7 using
> openshift-ansible, so that I can verify it's fixed for me?
>
> On Mon, Jan 8, 2018 at 12:23 PM Jordan Liggitt <jligg...@redhat.com>
> wrote:
>
>> Garbage collection in particular could be related to
>> https://bugzilla.redhat.com/show_bug.cgi?id=1525699 (fixed in
>> https://github.com/openshift/origin/pull/17818 but not included in a
>> point release yet)
>>
>>
>> On Jan 7, 2018, at 8:17 PM, Joel Pearson <japear...@agiledigital.com.au>
>> wrote:
>>
>> Hi,
>>
>> Has anyone else noticed that the new OpenShift Origin 3.7 Template Broker
>> seems super flaky?
>>
>> For example, if I deploy a Jenkins (Persistent or Ephemeral), and then I
>> modify the route, by adding an annotation for example:
>>
>> kubernetes.io/tls-acme: 'true'
>>
>> I have https://github.com/tnozicka/openshift-acme installed in the
>> cluster, which then grabs an SSL cert for me and adds it to the route.
>> Moments later, all resources from the template are garbage collected for
>> no apparent reason.
>>
>> I also got the same behaviour when I modified the service account the
>> Jenkins template uses: I added an additional route, so I added a new
>> "serviceaccounts.openshift.io/oauth-redirectreference.jenkins:" entry. It
>> took a bit longer (like 12 hours), but it all disappeared again. I have a
>> suspicion that if you modify any object that a template created, then
>> eventually the template broker will remove all objects it created.
>>
>> Is there any way to disable the new template broker and use the old
>> template system?
>>
>> In Origin 3.6 it was flawless and worked with openshift-acme without any
>> problems at all.
>>
>> I should mention that if I create things manually then it works fine: I
>> can use openshift-acme, and none of my resources vanish on a whim.
>>
>> Here is a snippet of the logs. You can see the acme endpoints are removed
>> after successfully getting a cert, and then moments later the deleting
>> starts:
>>
>> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:47.648255       1 leaderelection.go:199] successfully renewed
>> lease kube-service-catalog/service-catalog-controller-manager
>> Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]:
>> I0108 00:26:47.744777   26749 roundrobin.go:338] LoadBalancerRR: Removing
>> endpoints for jenkins-test/acme-9cv97q5dn8:
>> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:47.744777   26749 roundrobin.go:338] LoadBalancerRR: Removing
>> endpoints for jenkins-test/acme-9cv97q5dn8:
>> Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]:
>> I0108 00:26:47.762005   26749 ovs.go:143] Error executing ovs-ofctl:
>> ovs-ofctl: None: invalid IP address
>> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:47.762005   26749 ovs.go:143] Error executing ovs-ofctl:
>> ovs-ofctl: None: invalid IP address
>> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
>> E0108 00:26:47.765091   26749 sdn_controller.go:284] Error deleting OVS
>> flows for service &{{ } {acme-9cv97q5dn8  jenkins-test
>> /api/v1/namespaces/jenkins-test/services/acme-9cv97q5dn8
>> 94c6b3b3-f40a-11e7-88e5-fa163eb8ca3a 622382 0 2018-01-08 00:26:34 +0000 UTC
>> <nil> <nil> map[] map[] [] nil [] } {ClusterIP [{http TCP 80 {0 80 } 0}]
>> map[] None  []  None []  0} {{[]}}}: exit status 1
>> Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]:
>> E0108 00:26:47.765091   26749 sdn_controller.go:284] Error deleting OVS
>> flows for service &{{ } {acme-9cv97q5dn8  jenkins-test
>> /api/v1/namespaces/jenkins-test/services/acme-9cv97q5dn8
>> 94c6b3b3-f40a-11e7-88e5-fa163eb8ca3a 622382 0 2018-01-08 00:26:34 +0000 UTC
>> <nil> <nil> map[] map[] [] nil [] } {ClusterIP [{http TCP 80 {0 80 } 0}]
>> map[] None  []  None []  0} {{[]}}}: exit status 1
>> Jan 08 00:26:48 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:48.139090       1 rest.go:362] Starting watch for
>> /api/v1/namespaces, rv=622418 labels= fields= timeout=8m38s
>> Jan 08 00:26:48 master-0.openshift.staging.local
>> origin-master-api[23448]: I0108 00:26:48.139090       1 rest.go:362]
>> Starting watch for /api/v1/namespaces, rv=622418 labels= fields=
>> timeout=8m38s
>> Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:49.668205       1 leaderelection.go:199] successfully renewed
>> lease kube-service-catalog/service-catalog-controller-manager
>> Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:49.885207       1 garbagecollector.go:291] processing item [
>> template.openshift.io/v1/TemplateInstance, namespace: jenkins-test,
>> name: e3639aec-bbbc-4170-b0e4-3b63735af348, uid:
>> 915d585d-f408-11e7-88e5-fa163eb8ca3a]
>> Jan 08 00:26:49 master-0.openshift.staging.local
>> origin-master-controllers[73353]: I0108 00:26:49.885207       1
>> garbagecollector.go:291] processing item [
>> template.openshift.io/v1/TemplateInstance, namespace: jenkins-test,
>> name: e3639aec-bbbc-4170-b0e4-3b63735af348, uid:
>> 915d585d-f408-11e7-88e5-fa163eb8ca3a]
>> Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:49.904249       1 garbagecollector.go:394] delete object [
>> template.openshift.io/v1/TemplateInstance, namespace: jenkins-test,
>> name: e3639aec-bbbc-4170-b0e4-3b63735af348, uid:
>> 915d585d-f408-11e7-88e5-fa163eb8ca3a] with propagation policy Background
>> Jan 08 00:26:49 master-0.openshift.staging.local
>> origin-master-controllers[73353]: I0108 00:26:49.904249       1
>> garbagecollector.go:394] delete object [
>> template.openshift.io/v1/TemplateInstance, namespace: jenkins-test,
>> name: e3639aec-bbbc-4170-b0e4-3b63735af348, uid:
>> 915d585d-f408-11e7-88e5-fa163eb8ca3a] with propagation policy Background
>> Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:49.910964       1 garbagecollector.go:291] processing item [
>> apps.openshift.io/v1/DeploymentConfig, namespace: jenkins-test, name:
>> jenkins, uid: 91759f72-f408-11e7-88e5-fa163eb8ca3a]
>>
>> Any ideas? Has anyone else seen this?  Considering
>> "openshift-ansible-service-broker" is deployed in a broken state by
>> openshift-ansible on the release-3.7 branch (for origin, I think enterprise
>> would work as the tags exist), it makes me think that not many people are
>> using the new service brokers that are talked about here:
>> https://blog.openshift.com/whats-new-in-openshift-3-7-service-catalog-and-brokers/
>>
>> Thanks,
>>
>> Joel
>>
>> _______________________________________________
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>>