On 08/28/2018 06:56 PM, David Lang wrote:
On Tue, 28 Aug 2018, Rich Megginson wrote:

On 08/28/2018 06:20 PM, David Lang wrote:
On Tue, 28 Aug 2018, Rich Megginson via rsyslog wrote:

As part of the fix for https://github.com/rsyslog/rsyslog/pull/2962 I've added handling for HTTP error 429 (Too Many Requests, i.e. the server is busy).  I've seen this most commonly in large Kubernetes clusters: with hundreds of nodes, all holding open connections to the Kubernetes API service, the service eventually hits its maximum concurrent connections limit and starts returning error 429 for requests.

In this case, the mmkubernetes plugin does not cache whatever metadata it has, so that it will retry for the next record. However, this means that until the Kubernetes API becomes less busy, mmkubernetes will be hammering it with a request for every namespace/pod that it does not have cached.  If the plugin action returns an error in this case, rsyslog will suspend the plugin, meaning no mmkubernetes metadata handling at all.  But I also don't want the mmkubernetes plugin to sleep and retry, because that will hold up the entire pipeline.

How to handle this case?

have mmkubernetes cache the fact that it got a 429 and the time at which it got it, then only retry after X amount of time (say 1 second).

worst case, you will have a burst of multiple threads retrying every second and you lose a second's worth of metadata.

but this should reduce the load on the API server to manageable levels.
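
To make the suggestion concrete, here is a minimal sketch of the idea in Python rather than the C of the actual module. Every name in it (BusyBackoff, lookup_metadata, fetch, retry_interval) is hypothetical and not part of mmkubernetes or rsyslog. It remembers when the last 429 was seen and skips API lookups until the retry interval has passed, so records arriving in that window go through without metadata instead of blocking the pipeline.

```python
import time


class BusyBackoff:
    """Remember when the API last answered 429 and pause lookups for a while."""

    def __init__(self, retry_interval=1.0, clock=time.monotonic):
        self.retry_interval = retry_interval  # the "X amount of time", in seconds
        self.clock = clock                    # injectable so it can be tested
        self.busy_since = None                # time of the last 429, if any

    def may_query(self):
        """True if no 429 is cached or the retry interval has elapsed."""
        if self.busy_since is None:
            return True
        return self.clock() - self.busy_since >= self.retry_interval

    def record_status(self, status):
        """Cache the time of a 429; any other status clears the busy state."""
        if status == 429:
            self.busy_since = self.clock()
        else:
            self.busy_since = None


def lookup_metadata(key, cache, backoff, fetch):
    """Return metadata for key, or None if it is unavailable right now.

    fetch(key) is a hypothetical callable returning (http_status, metadata).
    """
    if key in cache:
        return cache[key]
    if not backoff.may_query():
        return None          # API was busy recently: do not hammer it
    status, metadata = fetch(key)
    backoff.record_status(status)
    if status == 200:
        cache[key] = metadata
        return metadata
    return None              # 429 or other failure: nothing cached, retry later
```

The sketch is deliberately not thread-safe. With several worker threads and no locking, each of them may send one request when the interval expires, which is the "burst of multiple threads" mentioned above.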

Are there any retry facilities like that in rsyslog, or would this have to be implemented in mmkubernetes?  I suppose the latter, if only because that's the only way to ensure the records get some metadata at all.

It would have to be done inside the module. rsyslog has back-off capabilities, but only when the entire pipeline is suspended.

Another option is to handle 429 in mmkubernetes similar to how it is handled in omelasticsearch - resubmit the record back to the beginning of the processing pipeline, for cases when you absolutely must have the metadata associated with the record, and you cannot accept records without the appropriate metadata.
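
A rough illustration of the resubmit idea, as a toy Python model rather than anything taken from omelasticsearch or mmkubernetes: a record that cannot be enriched is put back on the input queue and tried again later. The names process and enrich are hypothetical, and the max_attempts limit is my own assumption, added only so the toy loop terminates.

```python
from collections import deque


def process(records, enrich, max_attempts=3):
    """Enrich each record, resubmitting those that cannot be enriched yet.

    enrich(record) is a hypothetical callable that returns a metadata dict,
    or None when the API is busy. max_attempts is an assumed safety limit.
    """
    queue = deque((record, 1) for record in records)
    done, dropped = [], []
    while queue:
        record, attempt = queue.popleft()
        metadata = enrich(record)
        if metadata is not None:
            done.append((record, metadata))
        elif attempt < max_attempts:
            # Resubmit: the record re-enters the pipeline behind pending input.
            queue.append((record, attempt + 1))
        else:
            dropped.append(record)  # gave up; a real setup might spool it instead
    return done, dropped
```

Records that succeed on a later attempt come out after records that succeeded first time, so this approach trades ordering for completeness of metadata.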


David Lang


_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
