I've got a very, very flaky network and many remote hosts which phone home hourly to pick up Puppet updates. Some complete really quickly; others can take minutes for a do-nothing agent run. My server is 4.3 and clients are mostly 3.8.6, but some are 4.3 as well. They're a mix of CentOS (6 and 7) and Fedora (21+).
Every so often, I get the "Could not retrieve file metadata for ... :end of file reached" error on clients. It's usually random: some will run fine for days, then suddenly exhibit this once or twice, then be fine again.

To try to get to a state where my errors actually mean something, I started cranking up the http_keepalive_timeout value. I'll readily admit that I'm not sure I completely understand how to bound it. I started at 30s, went to 3m, and am now sitting on 30m on the server and 29m on agents.

How big should this be? Enough to encapsulate a complete successful run, or just the expected duration of a single file request? What's the downside of cranking this up? The affected file has changed since I raised the value, so I think it's having some effect, but I'm also seeing more failures (though that may be a red herring if our network is acting up today). What's a good guideline for properly setting this value?
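For reference, here is a minimal sketch of roughly what the agent side looks like in puppet.conf right now. The [agent] section placement is an assumption for illustration (the setting could equally sit in [main]), and the value is the 29m mentioned above, not a recommendation:

```ini
# puppet.conf on an agent (sketch; section placement is an assumption)
[agent]
# How long an idle persistent HTTP connection may sit in the agent's
# connection pool before it is closed. Currently 29m here, kept just
# under the 30m I set on the server side.
http_keepalive_timeout = 29m
```

The effective value on a given host can be checked with `puppet config print http_keepalive_timeout --section agent`.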
