style95 commented on issue #5286:
URL: https://github.com/apache/openwhisk/issues/5286#issuecomment-1188500211

   Let me look into it.
   But I have some concerns and questions at first glance.
   
   I found the default lease timeout is 1 second.
   https://github.com/apache/openwhisk/blob/master/ansible/group_vars/all#L467
   An intermittent network disruption can happen at any time, so this should be larger than 1s.
   With a 1s timeout, even a short period of network unavailability could easily break the system. We are using 10s in our downstream.
   I think it would be better to update the default.
   
   Regarding the error `The activation has not been processed`, the queue manager is supposed to retry until it fetches the endpoint. It should retry up to 13 times with exponential backoff starting at 1ms, so the total wait time would be around 8 seconds (`1ms + 2ms + 4ms + ... + 4096ms`).
   Was there such a log?
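
   As a quick check of the arithmetic above, here is a minimal sketch in Python (not the actual Scala code in the queue manager). The function name and parameters are hypothetical; only the 13 retries, the 1ms starting delay, and the doubling come from the description above.

```python
def backoff_delays_ms(retries=13, initial_ms=1, factor=2):
    """Return the delay before each retry, in milliseconds.

    Hypothetical helper for illustration only; it just reproduces
    the series 1ms, 2ms, 4ms, ... described in the comment.
    """
    return [initial_ms * factor ** attempt for attempt in range(retries)]


delays = backoff_delays_ms()
total_ms = sum(delays)

# The last delay is 4096ms and the sum is 8191ms, i.e. roughly 8 seconds.
print(f"last delay: {delays[-1]}ms, total wait: {total_ms}ms")
```

   So if the retries were exhausted, the logs should show attempts spread over about 8 seconds before the error is returned.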
   
   Also, when an endpoint is removed while its queue actually still exists, the system is supposed to keep restoring the etcd data until deletion of that data is explicitly requested.
   
   Regarding the error `No scheduler endpoint available`, this is consistent with the existing behavior of the ShardingPoolBalancer, which performs no retry when there is an issue with Kafka.
   
https://github.com/apache/openwhisk/blob/master/core/controller/src/main/scala/org/apache/openwhisk/core/loadBalancer/CommonLoadBalancer.scala#L210
   But I feel it would be better to add a retry mechanism here too.
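
   To illustrate what such a retry could look like, here is a rough sketch in Python rather than the Scala used in the controller. Everything in it is an assumption for discussion: `retry_with_backoff`, its parameters, and the `flaky_send` stand-in are hypothetical and do not correspond to any existing OpenWhisk API.

```python
import time


def retry_with_backoff(operation, retries=3, initial_delay_s=0.001,
                       factor=2, sleep=time.sleep):
    """Call operation() until it succeeds or the retries are exhausted.

    Hypothetical helper: waits initial_delay_s before the first retry
    and multiplies the delay by `factor` after each failed attempt.
    The last failure is re-raised to the caller.
    """
    delay = initial_delay_s
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == retries:
                raise
            sleep(delay)
            delay *= factor


# Stand-in for a send that fails twice before an endpoint shows up.
attempts = {"count": 0}


def flaky_send():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("No scheduler endpoint available")
    return "sent"


result = retry_with_backoff(flaky_send)
```

   A real change would also need to decide which failures are retryable and how long a caller may be kept waiting.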
   
   In general, there was a bug in the code, and we recently fixed it with the following PR.
   https://github.com/apache/openwhisk/pull/5251
   We need to check whether it could have caused a regression.
   
   
   @ningyougang @jiangpengcheng 
   Do you have any idea?
   
   

