gyfora commented on code in PR #439:
URL:
https://github.com/apache/flink-kubernetes-operator/pull/439#discussion_r1023676295
##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/StatusRecorder.java:
##########
@@ -97,20 +96,79 @@ public void patchAndCacheStatus(CR resource) {
Exception err = null;
for (int i = 0; i < 3; i++) {
- // In any case we retry the status update 3 times to avoid some
intermittent
- // connectivity errors if any
+ // We retry the status update 3 times to avoid some intermittent
connectivity errors
try {
- client.resource(resource).patchStatus();
- statusUpdateListener.accept(resource, prevStatus);
- metricManager.onUpdate(resource);
- return;
- } catch (Exception e) {
+ replaceStatus(resource, prevStatus);
+ err = null;
+ } catch (KubernetesClientException e) {
LOG.error("Error while patching status, retrying {}/3...", (i
+ 1), e);
Thread.sleep(1000);
err = e;
}
}
- throw err;
+
Review Comment:
This is not really a question of multithreading, the main problem with unit
tests is that the fabric8 mockservers cannot really handle the resource locking.
I have added a case to the Operator integration test which validates the
core conflict detection. Otherwise the e2e tests and manual testing should be
enough to cover the basic behaviour (we don't expect any errors in the logs
normally related to this part)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]