Issue #12623 has been updated by Daniel Pittman.

Chris Price wrote:
> FWIW I spent a bit of time looking at the code in question, and the Ruby 
> Resolv::DNS library that we are using did not have any obvious hooks for 
> setting a timeout... so I didn't see any easy way of dealing with that, short 
> of launching a new thread that we could kill after our timeout window had 
> passed.  And it sounds like we've not been having much fun with code that 
> spawns new threads recently.
> 
> Perhaps I overlooked the timeout setting, or perhaps there is another library 
> that would provide this functionality...

You probably want the `timeout` module from the standard library, which I 
believe should handle this correctly.
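
A minimal sketch of what that could look like, assuming we simply wrap the 
lookup. The helper name `srv_records_with_timeout`, the 5-second default, and 
the injectable `resolver` argument are hypothetical and not taken from the 
Puppet code base:

```ruby
require 'resolv'
require 'timeout'

# Hypothetical helper, not Puppet's actual code: look up SRV records for
# `name`, but give up after `seconds` and return an empty list instead.
# The resolver is injectable so the helper can be exercised without DNS.
def srv_records_with_timeout(name, seconds = 5, resolver = Resolv::DNS.new)
  Timeout.timeout(seconds) do
    resolver.getresources(name, Resolv::DNS::Resource::IN::SRV)
  end
rescue Timeout::Error
  # Treat a timed-out lookup the same as "no SRV records found".
  []
end
```

Note that `Timeout.timeout` interrupts the block from the outside, so whether 
an empty result is the right fallback here is a judgment call.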
----------------------------------------
Bug #12623: Long timeout for SRV DNS resolution
https://projects.puppetlabs.com/issues/12623#change-54914

Author: Chris Price
Status: Accepted
Priority: Normal
Assignee: Daniel Pittman
Category: network
Target version: 
Affected Puppet version: development
Keywords: 
Branch: 


This issue has come up for me several times when setting up nodes for 
acceptance tests.  It is probably most easily reproducible with a fresh copy of 
our CentOS6 VM.

In /etc/sysconfig/network there is a line that looks like this:

<pre>
HOSTNAME=pe-centos6.localdomain
</pre>

This causes facter to return that same string for "fqdn".  Then, when you 
attempt to use this VM as an agent node for acceptance tests, it blocks for an 
*extremely* long time during the 03_ValidateSignCerts phase.  Running the 
offending commands outside of the acceptance test framework:

<pre>
puppet master --dns_alt_names="puppet,$(hostname -s),$(hostname -f)" --verbose \
  --no-daemonize --logdest=/var/lib/puppet/log/puppetmaster.log --debug --trace
</pre>

and

<pre>
puppet agent --test --debug
</pre>

will reproduce the slowness, and you'll see the following in the agent output:

<pre>
debug: Searching for SRV records for domain: localdomain
debug: Found 0 SRV records for: _x-puppet-report._tcp.localdomain
</pre>

However, there may be a delay of 5 minutes or more between those two lines 
being printed.
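
To check whether the delay comes from the DNS lookup itself rather than from 
Puppet, the query can be timed in isolation. This is only a sketch: the helper 
name `time_srv_lookup` is hypothetical, and the record name is the one from the 
debug output above.

```ruby
require 'resolv'
require 'benchmark'

# Hypothetical helper: run one SRV query and report how many records came
# back and how long the call took, in seconds.
def time_srv_lookup(name, resolver = Resolv::DNS.new)
  records = nil
  elapsed = Benchmark.realtime do
    records = resolver.getresources(name, Resolv::DNS::Resource::IN::SRV)
  end
  [records.size, elapsed]
end

# Example usage on an affected node (performs a real DNS query):
#   count, seconds = time_srv_lookup("_x-puppet-report._tcp.localdomain")
#   puts "Found #{count} SRV records in #{seconds.round(1)}s"
```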

The code where this occurs is in lib/puppet/network/resolver.rb, in the 
method each_srv_record.  However, that code simply calls Ruby's 
Resolv::DNS#getresources method.  Reviewing the documentation for this 
method, I don't see a way to specify a timeout interval.
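
For what it's worth, Ruby 2.0 and later add a `Resolv::DNS#timeouts=` setter, 
which is not available on the older Ruby versions this issue was reported 
against. A hedged sketch of using it when present (the helper name 
`resolver_with_timeouts` and the `[1, 2]` values are hypothetical):

```ruby
require 'resolv'

# Hypothetical helper: build a resolver with short per-attempt timeouts.
# Resolv::DNS#timeouts= only exists on Ruby 2.0+, so guard for older Rubies,
# where the resolver keeps its default behaviour.
def resolver_with_timeouts(timeouts = [1, 2])
  resolver = Resolv::DNS.new
  resolver.timeouts = timeouts if resolver.respond_to?(:timeouts=)
  resolver
end
```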

I'm also not entirely sure whether reducing the timeout is the correct 
solution, or if there is some other alternative.

As a workaround, simply fixing the "HOSTNAME" line in /etc/sysconfig/network so 
that it does not contain the domain name seems to dramatically improve the 
performance.


