Issue #2691 has been updated by Marc Fournier.

Affected version changed from 0.25.1rc1 to 0.25.1rc2

I have more details about this issue.

As Brice suggested, I had a look at the performance of mysql but didn't
seem to be missing any indexes. Just to be sure, I dropped and recreated an
empty database, and let all the clients rebuild the content. I then tried
collecting all the resources on one node, but run in the same timeout/high
load issue again.

I also checked out 0.25.1rc2, but nothing magically happened in the
meantime and solved the issue...

I noticed that the SQL queries used to collect the resources were slightly
different between my 0.24.8 server, and the 0.25.1rc2 one.

On the 0.24.8 server, the query returns about 150 lines, one line per
collected resource. The query sent to mysql with 0.25.1rc2 generates 202879
lines. When dumped to a text file, this is about 75 MB of data.

I can imagine puppetmaster would have some trouble digesting all this,
which could cause the timeouts and high cpu usage.

This is the SQL generated on 0.24.8:
<pre>
SELECT `resources`.`id` AS t0_r0, `resources`.`title` AS t0_r1, 
`resources`.`restype` AS t0_r2, `resources`.`host_id` AS t0_r3, 
`resources`.`source_file_id` AS t0_r4, `resources`.`exported` AS t0_r5, 
`resources`.`line` AS t0_r6, `resources`.`updated_at` AS t0_r7, 
`resources`.`created_at` AS t0_r8, `param_values`.`id` AS t1_r0, 
`param_values`.`value` AS t1_r1, `param_values`.`param_name_id` AS t1_r2, 
`param_values`.`line` AS t1_r3, `param_values`.`resource_id` AS t1_r4, 
`param_values`.`updated_at` AS t1_r5, `param_values`.`created_at` AS t1_r6, 
`param_names`.`id` AS t2_r0, `param_names`.`name` AS t2_r1, 
`param_names`.`updated_at` AS t2_r2, `param_names`.`created_at` AS t2_r3 FROM 
`resources` LEFT OUTER JOIN `param_values` ON param_values.resource_id = 
resources.id LEFT OUTER JOIN `param_names` ON `param_names`.id = 
`param_values`.param_name_id WHERE (host_id != 281 AND (exported=1 AND 
restype='Nagios_service') AND (param_values.value = 'nagiosserver.domain.tld' 
and param_names.name = 'tag'))
</pre>

This is the SQL generated on 0.25.1rc2:
<pre>
SELECT `resources`.`id` AS t0_r0, `resources`.`title` AS t0_r1, 
`resources`.`restype` AS t0_r2, `resources`.`host_id` AS t0_r3, 
`resources`.`source_file_id` AS t0_r4, `resources`.`exported` AS t0_r5, 
`resources`.`line` AS t0_r6, `resources`.`updated_at` AS t0_r7, 
`resources`.`created_at` AS t0_r8, `param_values`.`id` AS t1_r0, 
`param_values`.`value` AS t1_r1, `param_values`.`param_name_id` AS t1_r2, 
`param_values`.`line` AS t1_r3, `param_values`.`resource_id` AS t1_r4, 
`param_values`.`updated_at` AS t1_r5, `param_values`.`created_at` AS t1_r6, 
`param_names`.`id` AS t2_r0, `param_names`.`name` AS t2_r1, 
`param_names`.`updated_at` AS t2_r2, `param_names`.`created_at` AS t2_r3, 
`puppet_tags`.`id` AS t3_r0, `puppet_tags`.`name` AS t3_r1, 
`puppet_tags`.`updated_at` AS t3_r2, `puppet_tags`.`created_at` AS t3_r3, 
`resource_tags_puppet_tags`.`id` AS t4_r0, 
`resource_tags_puppet_tags`.`resource_id` AS t4_r1, 
`resource_tags_puppet_tags`.`puppet_tag_id` AS t4_r2, 
`resource_tags_puppet_tags`.`updated_at` AS t4_r3, 
`resource_tags_puppet_tags`.`created_at` AS t4_r4 FROM `resources` LEFT OUTER 
JOIN `param_values` ON param_values.resource_id = resources.id LEFT OUTER JOIN 
`param_names` ON `param_names`.id = `param_values`.param_name_id LEFT OUTER 
JOIN `resource_tags` ON (`resources`.`id` = `resource_tags`.`resource_id`) LEFT 
OUTER JOIN `puppet_tags` ON (`puppet_tags`.`id` = 
`resource_tags`.`puppet_tag_id`) LEFT OUTER JOIN `resource_tags` 
resource_tags_puppet_tags ON resource_tags_puppet_tags.puppet_tag_id = 
puppet_tags.id WHERE (host_id != 6 AND (exported=1 AND 
restype='Nagios_service') AND (puppet_tags.name = 'nagiosserver.domain.tld'))
</pre>

I've tried running the first SQL query on 0.25.1rc2 server. It works fine
and returns 80 exported resources.

Running the 2nd query on the 0.24.8 server returns 2588158 lines/930MB.
I'll have to postpone migrating this server to 0.25 :-)

On the 0.25.1rc2 server, I have the following gems installed:
activerecord (2.3.2)
activesupport (2.3.2)
fastthread (1.0.7)
passenger (2.2.2)
rack (1.0.0)
rake (0.8.7)

On the 0.24.8 server, I have exactly the same gems, except activesupport
which isn't installed.

Please let me know if I can give more details !


----------------------------------------
Bug #2691: "Could not retrieve catalog: HTTP-Error: 500 Internal Server Error" 
with tagged exported resources
http://projects.reductivelabs.com/issues/2691

Author: Marc Fournier
Status: Needs more information
Priority: Normal
Assigned to: 
Category: 
Target version: 
Affected version: 0.25.1rc2
Keywords: 
Branch: 


The problem happens with this sort of manifest (in fact, this simple example 
works fine):
<pre>
node a {
    @@file { "/tmp/foo": content => "fjskfjs\n", tag => "foofile", }
}
node b {
    File <<| tag == 'foofile' |>>
}
</pre>

In my case, I have a about 20 nodes, each one having:
<pre>
@@nagios_host    { $fqdn: tag => "nagiosserver.domain.tld", ... }
@@nagios_service { "a":   tag => "nagiosserver.domain.tld", ... }
@@nagios_service { "b":   tag => "nagiosserver.domain.tld", ... }
@@nagios_service { "c":   tag => "nagiosserver.domain.tld", ... }
@@nagios_service { "d":   tag => "nagiosserver.domain.tld", ... }
@@nagios_service { "e":   tag => "nagiosserver.domain.tld", ... }
</pre>

I suspect the problem occurs when many exported resources are involved (if 20x5 
can be considered many).

The problem happens when running puppet on a node containing:
<pre>
Nagios_host <<| tag == "nagiosserver.domain.tld" |>>
Nagios_service <<| tag == nagiosserver.domain.tld" |>>
</pre>

On the puppetmaster, one ruby process consumes 100% CPU during a few minutes, 
then the client says:
<pre>
err: Could not call puppetmaster.getconfig: #<RuntimeError: HTTP-Error: 500 
Internal Server Error>
err: Could not retrieve catalog: HTTP-Error: 500 Internal Server Error
</pre>

the ruby process continues running at 100%, and after a while we get this 
backtrace:
<pre>
[ pid=3338 file=ext/apache2/Hooks.cpp:547 time=2009-10-01 11:16:46.189 ]:
  Backend process 3465 did not return a valid HTTP response. It returned no 
data.
*** Exception Errno::EPIPE in Passenger RequestHandler (Broken pipe) (process 
3465):
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/request_handler.rb:83:in
 `write'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/request_handler.rb:83:in
 `process_request'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/abstract_request_handler.rb:203:in
 `main_loop'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/application_spawner.rb:110:in
 `run'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/application_spawner.rb:67:in
 `spawn_application'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/utils.rb:181:in
 `safe_fork'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/application_spawner.rb:60:in
 `spawn_application'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/application_spawner.rb:45:in
 `spawn_application'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/spawn_manager.rb:158:in
 `spawn_application'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/spawn_manager.rb:282:in
 `handle_spawn_application'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/abstract_server.rb:337:in
 `__send__'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/abstract_server.rb:337:in
 `main_loop'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/abstract_server.rb:187:in
 `start_synchronously'
        from 
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/bin/passenger-spawn-server:61
</pre>

The problem can be produced with 0.24.8 and 0.25.0 clients, and 0.25.0 as well 
as 0.25.1rc1 server, with passenger and these gems installed:

activerecord (2.3.2)
activesupport (2.3.2)
fastthread (1.0.7)
passenger (2.2.2)
rack (1.0.0)
rake (0.8.7)

It should be noted that this works fine:
<pre>
Nagios_host <<| |>>
Nagios_service <<| |>>
</pre>

I've also noticed the following error from time to time on 0.25.0. But I'm 
unable to reproduce it, and never seen it with 0.25.1rc1, so I'm not sure it's 
related:
<pre>
/srv/puppet/lib/puppet/util/settings/file_setting.rb:19:in `group='
/srv/puppet/lib/puppet/util/settings/setting.rb:44:in `send'
/srv/puppet/lib/puppet/util/settings/setting.rb:44:in `initialize'
/srv/puppet/lib/puppet/util/settings/setting.rb:38:in `each'
/srv/puppet/lib/puppet/util/settings/setting.rb:38:in `initialize'
/srv/puppet/lib/puppet/util/settings.rb:398:in `new'
/srv/puppet/lib/puppet/util/settings.rb:398:in `newsetting'
/srv/puppet/lib/puppet/util/settings.rb:533:in `setdefaults'
/srv/puppet/lib/puppet/util/settings.rb:518:in `each'
/srv/puppet/lib/puppet/util/settings.rb:518:in `setdefaults'
/srv/puppet/lib/puppet/reports/store.rb:15:in `mkclientdir'
/srv/puppet/lib/puppet/reports/store.rb:35:in `process'
/srv/puppet/lib/puppet/network/handler/report.rb:66:in `process'
/srv/puppet/lib/puppet/network/handler/report.rb:59:in `each'
/srv/puppet/lib/puppet/network/handler/report.rb:59:in `process'
/srv/puppet/lib/puppet/network/handler/report.rb:33:in `report'
/usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `to_proc'
/srv/puppet/lib/puppet/network/xmlrpc/processor.rb:52:in `call'
/srv/puppet/lib/puppet/network/xmlrpc/processor.rb:52:in `protect_service'
/srv/puppet/lib/puppet/network/xmlrpc/processor.rb:85:in `setup_processor'
/usr/lib/ruby/1.8/xmlrpc/server.rb:336:in `call'
/usr/lib/ruby/1.8/xmlrpc/server.rb:336:in `dispatch'
/usr/lib/ruby/1.8/xmlrpc/server.rb:323:in `each'
/usr/lib/ruby/1.8/xmlrpc/server.rb:323:in `dispatch'
/usr/lib/ruby/1.8/xmlrpc/server.rb:366:in `call_method'
/usr/lib/ruby/1.8/xmlrpc/server.rb:378:in `handle'
/srv/puppet/lib/puppet/network/xmlrpc/processor.rb:44:in `process'
/srv/puppet/lib/puppet/network/http/rack/xmlrpc.rb:35:in `process'
/srv/puppet/lib/puppet/network/http/rack.rb:48:in `call'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/request_handler.rb:81:in
 `process_request'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/abstract_request_handler.rb:203:in
 `main_loop'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/application_spawner.rb:110:in
 `run'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/application_spawner.rb:67:in
 `spawn_application'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/utils.rb:181:in
 `safe_fork'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/application_spawner.rb:60:in
 `spawn_application'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/rack/application_spawner.rb:45:in
 `spawn_application'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/spawn_manager.rb:158:in
 `spawn_application'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/spawn_manager.rb:282:in
 `handle_spawn_application'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/abstract_server.rb:337:in
 `__send__'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/abstract_server.rb:337:in
 `main_loop'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_passenger/abstract_server.rb:187:in
 `start_synchronously'
/usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/bin/passenger-spawn-server:61
</pre>



-- 
You have received this notification because you have either subscribed to it, 
or are involved in it.
To change your notification preferences, please click here: 
http://reductivelabs.com/redmine/my/account

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Puppet Bugs" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/puppet-bugs?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to