Hi,
We have temporarily stopped the gating chain on Orange Openlab.
We have been seeing strange behavior for the past few days.
Initially we believed it could be due to ONAP, but our latest
investigations seem to indicate that it is a Kubernetes issue.
Our gating chains run on a configuration of 3 controllers + 12 compute
nodes.
For reasons we do not yet understand, some compute nodes (usually 1 or 2
out of the 12) are sometimes badly configured: their IPVS rules (similar
to iptables, used to manage routing within the k8s cluster) are
incomplete.
Apparently it is not due to the CNI (we tried with both Weave and Flannel).
As a consequence, some ONAP components cannot contact other ONAP
components if they are running on a wrongly configured compute node.
Deleting the pod may restore connectivity if, by chance, the pod is
rescheduled on a healthy node.
It could explain the intermittent problems we reported - and why it was
hard to reproduce the issues.
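
A minimal sketch of how the incomplete nodes could be spotted, assuming
the output of `ipvsadm -Ln` has already been collected from every compute
node (how it is collected is left out; the function names and the sample
addresses are hypothetical). It compares the virtual services seen on
each node against the union seen across all nodes:

```python
import re

# Matches virtual-service lines of `ipvsadm -Ln`, e.g. "TCP  10.96.0.1:443 rr".
# Real-server lines start with "  -> " and headers do not start with a protocol.
_SERVICE_RE = re.compile(r"^(TCP|UDP|SCTP)\s+(\S+)\s+\S+")


def parse_virtual_services(ipvsadm_output):
    """Return the set of (protocol, address:port) virtual services in one dump."""
    services = set()
    for line in ipvsadm_output.splitlines():
        match = _SERVICE_RE.match(line)
        if match:
            services.add((match.group(1), match.group(2)))
    return services


def find_incomplete_nodes(dumps_by_node):
    """Map each node to the services it lacks relative to the union of all nodes.

    Assumption: every node is expected to carry the same virtual services.
    Nodes with nothing missing are left out of the result.
    """
    parsed = {node: parse_virtual_services(text)
              for node, text in dumps_by_node.items()}
    expected = set().union(*parsed.values())
    return {node: sorted(expected - found)
            for node, found in parsed.items()
            if expected - found}
```

Entries bound to node-local addresses (NodePort services, for example)
legitimately differ from node to node, so they would probably need to be
filtered out before comparing; this sketch only makes sense for cluster
IP entries.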
For gating and our daily chains, the consequence is an abnormal number
of failed pods: the pods themselves should be OK, but as they cannot
contact the pods they depend on, their init fails.
We need to understand the root cause of the problem and see what
changed over the last few days. By default we run only a simple
healthcheck test suite after the Kubernetes installation, which is
probably not enough.
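
One possible extension of the post-installation healthcheck, as a sketch
only: a small TCP probe that would be run from every compute node (for
instance from a pod scheduled on each node, which is an assumption) against
a list of service addresses. The function name and the example target are
hypothetical and not part of our current test suite.

```python
import socket


def check_targets(targets, timeout=2.0):
    """Try a TCP connection to each (host, port) and return those that failed."""
    failed = []
    for host, port in targets:
        try:
            # A successful connect is enough; the connection is closed at once.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts and name resolution errors.
            failed.append((host, port))
    return failed


# Example (hypothetical cluster IP of the Kubernetes API service):
# unreachable = check_targets([("10.96.0.1", 443)])
# A non-empty result on a node would mark that node as suspect.
```

Running the same target list from all 12 compute nodes and comparing the
results should point at the 1 or 2 nodes whose routing is incomplete.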
Meanwhile, to avoid any misleading reporting on gating, we have disabled
the listener on Gerrit.
Sorry for the inconvenience
Morgan & Sylvain