Issue #15420 has been updated by Ramon Alteren.

eric sorenson wrote:
> Some comments:
> 
> I don't think we'd add a new config option into 2.7.x at this point. 
> 
> Even in 3.x, it seems like adding timeout settings for the various http 
> connections is not a good choice either, because it adds a lot of complexity 
> that is really just used to cover up underlying problems.  If you have to 
> bump your catalog / file fetch timeout to high values because of slow puppet 
> master response, stacking up connections is probably adding to the death 
> spiral rather than helping it; conversely if you need to use really short 
> timeouts because of an underlying app server problem you should probably be 
> fixing that problem rather than working around it with http settings.

Fair enough. The death spiral of stacking up connections was exactly the 
problem we were trying to solve with this patch. I agree that the problem 
should ideally be solved at the Puppet Dashboard level, but I doubt I'll be 
able to avoid every error situation that could result in a (long) timeout. 
The original cause was a full MySQL disk that made Puppet Dashboard time out, 
but I can think of a number of network-related failures that would have 
roughly the same effect and can never be solved at the target (reporting) 
system.

We rely on our puppetmasters to keep functioning regardless of their ability to 
report status to any secondary services. In general that would be the preferred 
behavior, I think? It's really bad for us to have the primary function as a 
config management system fail due to its reporting facilities. We care less 
about the reporting and more about a correctly functioning config management 
system.

> If I understand correctly, Ramon/Erik can use the :configtimeout setting to 
> tune the timeout down today without any code changes upstream, it's just that 
> this will also affect catalog and file fetches. 

Sadly, that doesn't work for 2.7. Would it be possible to set the 
:configtimeout setting for reports only in 3.0?

> Relatedly, as Patrick discovered, the `configtimeout` option is erroneously 
> named and ought to be corrected and used for this purpose in 3.x, since it 
> affects all http connections. I've created #15611 to address this.

That would definitely help, thanks for looking into this.

----------------------------------------
Bug #15420: puppet master resource starvation on http report url timeouts 
instead of failures
https://projects.puppetlabs.com/issues/15420#change-67475

Author: Ramon Alteren
Status: Closed
Priority: Normal
Assignee: eric sorenson
Category: reports
Target version: 
Affected Puppet version: 2.7.13
Keywords: ntbf
Branch: 


We recently had an issue with Puppet Dashboard that caused it to time out 
instead of fail / succeed.
Since the default HTTP timeout in Ruby is set to 60 seconds, the puppetmaster 
process is busy for the entire catalog handling plus an additional 60 seconds 
waiting for the report HTTP POST to time out.

This causes severe resource starvation on the master, resulting in failed puppet 
runs for _all_ nodes.

It would make sense to add a timeout parameter for HTTP reports or, in the long 
run, to split off HTTP reporting into a separate thread/process on the master.
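
As a rough illustration of both ideas, here is a minimal sketch using only 
Ruby's standard Net::HTTP. It is not the code from the patch or from Puppet's 
report processor: the helper names `post_report` and `post_report_async`, the 
2 and 5 second defaults, and the content type are assumptions made for the 
example.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not Puppet's actual http report processor).
# POSTs a report body with explicit, short timeouts so that a hung
# reporting endpoint cannot hold the caller for Ruby's 60 second default.
# Returns true on a 2xx response, false on timeout or connection failure.
def post_report(url, body, open_timeout: 2, read_timeout: 5)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.open_timeout = open_timeout # seconds allowed to establish the connection
  http.read_timeout = read_timeout # seconds allowed to wait for response data

  request = Net::HTTP::Post.new(uri.request_uri)
  request['Content-Type'] = 'application/x-yaml' # assumed content type
  request.body = body

  http.request(request).is_a?(Net::HTTPSuccess)
rescue Net::OpenTimeout, Net::ReadTimeout, SystemCallError, SocketError => e
  # Reporting is secondary: note the failure and carry on.
  warn "report not delivered to #{url}: #{e.class}"
  false
end

# Hypothetical wrapper for the "separate thread" idea: the caller gets a
# Thread back immediately and can ignore it or collect the result later.
def post_report_async(url, body, **timeouts)
  Thread.new { post_report(url, body, **timeouts) }
end
```

With something like this, a dashboard that accepts the connection but never 
answers costs at most `read_timeout` seconds per report instead of 60, and 
with the threaded variant it does not block the calling code at all. Threads 
would still pile up if reports arrive faster than they time out, so a real 
implementation would probably want a bounded queue as well.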

