Yes, that's exactly the sort of behaviour I'm seeing.
But I also see it when deploying a cluster (via the playbooks/byo/config.yml playbook), not just when scaling up.

On one of my (OpenStack) environments it seems to happen on about 50% of the nodes!


On 13/04/18 14:51, Rodrigo Bersa wrote:
Hi Tim,

Yes, I've seen this error sometimes, mainly during the scaleup process.

What apparently solved the issue for me was to remove the /etc/cni directory and let the installation/scaleup process recreate it, but I don't know the root cause either.

As you said, it happens randomly and doesn't seem to follow a pattern. The first time I faced it, I was scaling a cluster and adding four new Nodes, and just one presented the error; the other three were added to the cluster with no errors.
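The workaround described above can be sketched as follows. This is only an illustration of the steps, assuming the standard openshift-ansible layout; a scratch directory stands in for the real /etc/cni so the removal can be shown safely.

```shell
# Stand-in for the node's real /etc/cni directory (hypothetical scratch
# path used here so the steps can be demonstrated without touching the
# host configuration).
CNI_DIR="$(mktemp -d)/etc/cni"
mkdir -p "$CNI_DIR/net.d"          # simulate the (possibly stale) dir

# Step 1: remove the CNI config directory entirely.
rm -rf "$CNI_DIR"
[ -d "$CNI_DIR" ] || echo "removed"

# Step 2: re-run the install/scaleup playbook so it recreates the
# directory and writes 80-openshift-sdn.conf, e.g.:
#   ansible-playbook -i <inventory> playbooks/byo/config.yml
```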


Best regards,


Rodrigo Bersa

Cloud Consultant, RHCVA, RHCE

Red Hat Brasil <https://www.redhat.com>

rbe...@redhat.com    M: +55-11-99557-5841

<https://red.ht/sig>      
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>

Red Hat is recognized among the best companies to work for in Brazil by *Great Place to Work*.

On Fri, Apr 13, 2018 at 10:10 AM, Tim Dudgeon <tdudgeon...@gmail.com <mailto:tdudgeon...@gmail.com>> wrote:

    We've long been encountering a seemingly random problem installing
    Origin 3.7 on CentOS nodes.
    It manifests as the /etc/cni/net.d/ directory on the node being
    empty (it should contain a single file named 80-openshift-sdn.conf),
    which prevents the origin-node service from starting. The key error
    in the logs (via journalctl) is something like this:

    Apr 13 12:23:44 ip-10-0-0-61.eu-central-1.compute.internal
    origin-node[26683]: W0413 12:23:44.933963   26683 cni.go:189]
    Unable to update cni config: No networks found in /etc/cni/net.d
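A quick way to check a node for this symptom is to test whether the expected SDN config file is present. A minimal sketch (the file name comes from the error above; `check_cni` is a hypothetical helper, and on a real node the directory argument would be /etc/cni/net.d):

```shell
# check_cni DIR: report whether DIR contains the OpenShift SDN CNI
# config file that the installer should have written.
check_cni() {
    if [ -f "$1/80-openshift-sdn.conf" ]; then
        echo "ok"
    else
        echo "missing"
    fi
}

check_cni "$(mktemp -d)"   # an empty dir prints "missing"
```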

    Something is preventing the ansible installer from creating this
    file on the nodes (though the real cause may be upstream of this).

    This seems to happen randomly, and with differing frequency in
    different environments. In one environment about 50% of the nodes
    fail this way; in others it's much less frequent. We thought it was
    a problem with our OpenStack environment, but we have now also seen
    it on AWS, so it looks like an OpenShift-specific problem.

    Has anyone else seen this or know what causes it?
    It's been a really big impediment to rolling out a cluster.

    Tim


    _______________________________________________
    users mailing list
    users@lists.openshift.redhat.com
    http://lists.openshift.redhat.com/openshiftmm/listinfo/users



