Issue #12623 has been updated by Chris Price.

Confirmed that the use_srv_records setting also alleviates this problem.

In my acceptance testing environment, I logged into my CentOS6 node and changed 
the /etc/sysconfig/network line back to its original value:

<pre>
HOSTNAME=pe-centos6.localdomain
</pre>

Rebooted, and 'hostname' now returns a pseudo-fqdn again:

<pre>
[root@pe-centos6 ~]# hostname
<b>pe-centos6.localdomain</b>
[root@pe-centos6 ~]# 
</pre>

Facter interprets this as the fqdn again:

<pre>
[root@pe-centos6 ~]# facter fqdn
pe-centos6.localdomain
[root@pe-centos6 ~]# 
</pre>

And running this command:

<pre>
time puppet agent --test --debug
</pre>

regularly takes over three minutes to execute.  It very visibly hangs on this 
line:

<pre>
debug: Searching for SRV records for domain: localdomain
</pre>

at least twice during each execution of the agent.  If I then edit 
/etc/puppet/puppet.conf and add the line:

<pre>
use_srv_records=false
</pre>

and then run the agent, the execution time drops from over 3 minutes to 
less than 5 seconds.

----------------------------------------
Bug #12623: Long timeout for SRV DNS resolution
https://projects.puppetlabs.com/issues/12623#change-54876

Author: Chris Price
Status: Investigating
Priority: Normal
Assignee: Chris Price
Category: network
Target version: 
Affected Puppet version: development
Keywords: 
Branch: 


This issue has come up for me several times when setting up nodes for 
acceptance tests.  It is probably most easily reproducible with a fresh copy of 
our CentOS6 VM.

In /etc/sysconfig/network there is a line that looks like this:

<pre>
HOSTNAME=pe-centos6.localdomain
</pre>

This causes facter to return that same string for "fqdn".  Then, when you 
attempt to use this VM as an agent node for acceptance tests, it blocks for an 
*extremely* long time during the 03_ValidateSignCerts phase.  Running the 
offending commands outside of the acceptance test framework:

<pre>
puppet master --dns_alt_names="puppet,$(hostname -s),$(hostname -f)" --verbose \
  --no-daemonize --logdest=/var/lib/puppet/log/puppetmaster.log --debug --trace
</pre>

and

<pre>
puppet agent --test --debug
</pre>

will reproduce the slowness, and you'll see the following in the agent output:

<pre>
debug: Searching for SRV records for domain: localdomain
debug: Found 0 SRV records for: _x-puppet-report._tcp.localdomain
</pre>

There may be a delay of 5 minutes or more in between those two lines being 
printed, however.
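
To measure the delay outside of Puppet, a minimal Ruby sketch along these lines 
might help.  The helper name time_srv_lookup is hypothetical and is not part of 
Puppet; the record name in the commented usage line is taken from the debug 
output above.  The resolver argument is only there so the helper can be 
exercised with a stand-in object instead of real DNS.

```ruby
require 'resolv'

# Hypothetical helper (not part of Puppet): time a single SRV lookup.
# Returns [records, elapsed_seconds].  By default it uses the system
# resolver configuration via Resolv::DNS.new.
def time_srv_lookup(name, resolver = Resolv::DNS.new)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  records = resolver.getresources(name, Resolv::DNS::Resource::IN::SRV)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [records, elapsed]
end

# Example usage on the affected node (performs a real DNS query):
#   records, elapsed = time_srv_lookup('_x-puppet-report._tcp.localdomain')
#   puts format('Found %d SRV records in %.1f seconds', records.length, elapsed)
```

If the elapsed time here is also measured in minutes, that would suggest the 
delay comes from the DNS lookup itself rather than from anything else the 
agent is doing.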

The code where this occurs is in lib/puppet/network/resolver.rb, in the 
method each_srv_record.  However, this code simply calls Ruby's 
Resolv::DNS#getresources method.  Reviewing the documentation for this 
method, I don't see a way to specify a timeout interval.

I'm also not entirely sure whether reducing the timeout is the correct 
solution, or if there is some other alternative.
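
One possible approach, sketched here purely as an illustration and not as the 
actual Puppet code, is to wrap the lookup in Ruby's standard Timeout module.  
The helper name srv_records_with_timeout and its parameters are hypothetical.  
Ruby 2.0 and later also provide Resolv::DNS#timeouts=, which may be a cleaner 
option where it is available.

```ruby
require 'resolv'
require 'timeout'

# Hypothetical helper (not part of Puppet): look up SRV records for `name`,
# giving up after `seconds`.  On timeout it returns an empty array, which
# corresponds to the "Found 0 SRV records" outcome in the debug output.
def srv_records_with_timeout(name, seconds, resolver = Resolv::DNS.new)
  Timeout.timeout(seconds) do
    resolver.getresources(name, Resolv::DNS::Resource::IN::SRV)
  end
rescue Timeout::Error
  []
end

# Example usage (performs a real DNS query, bounded to 5 seconds):
#   srv_records_with_timeout('_x-puppet-report._tcp.localdomain', 5)
```

Treating a timeout the same as "no records found" is a design assumption of 
this sketch; whether that is the right behaviour for the agent is exactly the 
open question above.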

As a workaround, simply fixing the "HOSTNAME" line in /etc/sysconfig/network so 
that it does not contain the domain name seems to dramatically improve the 
performance.

