Hi

I recently started resiliency testing on Honolulu.
The test case can be summarized as follows:

1) run basic_vm test
2) stop a k8s worker (I shut down the VM hosting the worker in my case)
3) run basic_vm
4) restart the k8s worker (I restart the VM)
5) run basic_vm
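
The five steps above could be scripted roughly as follows. This is only a sketch: run_basic_vm, stop_worker and start_worker are hypothetical callables standing in for the real test launcher and the VM stop/start commands, which are not shown in this mail.

```python
from typing import Callable, List, Tuple


def run_resiliency_sequence(
    run_basic_vm: Callable[[], bool],
    stop_worker: Callable[[], None],
    start_worker: Callable[[], None],
) -> List[Tuple[str, bool]]:
    """Run basic_vm before, during and after a worker outage.

    All three callables are placeholders (assumptions), to be replaced
    by whatever launches the test and stops/starts the worker VM.
    """
    results: List[Tuple[str, bool]] = []
    results.append(("baseline", run_basic_vm()))              # step 1
    stop_worker()                                             # step 2
    try:
        results.append(("worker down", run_basic_vm()))       # step 3
    finally:
        start_worker()                                        # step 4, even if step 3 raised
    results.append(("worker restarted", run_basic_vm()))      # step 5
    return results
```

The try/finally is there so that the worker is brought back even when the test run in step 3 raises an exception.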

I gave more details and logs in https://jira.onap.org/browse/TEST-308

On my first attempt, I had the impression that the pods were properly restarted 
on the other healthy workers, but on the last attempt that was clearly not the 
case.
I can see the following in the events of a pod that was running on the stopped 
worker:

Events:
  Type     Reason        Age    From             Message
  ----     ------        ----   ----             -------
  Warning  NodeNotReady  2m45s  node-controller  Node is not ready

It seems that I just did not wait long enough in the second case: the default 
pod-eviction-timeout is 5m, see
https://github.com/kubernetes/kubernetes/issues/55713
I will redo the test before reinstalling a weekly, and reduce the 
pod-eviction-timeout.
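
To make the timing concrete, here is a small sketch that compares the age of the NodeNotReady event (2m45s) with the 5m default. It assumes, as a simplification, that eviction starts roughly when the event age reaches the timeout; the function names are mine and not part of any Kubernetes tool.

```python
import re

DEFAULT_EVICTION_TIMEOUT_S = 5 * 60  # the 5m default mentioned above


def age_to_seconds(age: str) -> int:
    """Convert a kubectl-style age such as '2m45s' or '1h3m' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([dhms])", age)
    # Reject strings that contain anything other than number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognised age: {age!r}")
    return sum(int(n) * units[u] for n, u in parts)


def seconds_until_eviction(
    not_ready_age: str, timeout_s: int = DEFAULT_EVICTION_TIMEOUT_S
) -> int:
    """Remaining wait before eviction is expected (simplified assumption)."""
    return max(0, timeout_s - age_to_seconds(not_ready_age))


print(seconds_until_eviction("2m45s"))  # 135
```

With the values from the event above, a bit more than two minutes were still missing before the pods could be expected to move.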

Anyway, in both cases the test was failing while the worker was down, 
apparently due to network issues...
More surprisingly, my test reported that I could no longer contact the 
SDC (first step of the basic_vm test), even though there were no SDC pods at 
all on the faulty worker. I did not change anything on the controller.
What could explain the timeout on requests to the SDC?
Note that the requests are launched from a Docker container outside the k8s 
cluster.

When I restart the worker, the pods are restarted properly on the other 
workers and the test passes again.

/Morgan



