Issue #4923 has been updated by Markus Roberts.

Status changed from Unreviewed to Accepted
Priority changed from Normal to High
Target version set to 2.6.x

I'd like to see this fixed in 2.6.x if possible, though I'm not seeing it as 
new (and thus our policy would be to fix it in 2.7).  I'm also raising it to 
high as we have to start flushing out these threading problems.

----------------------------------------
Bug #4923: Puppet file locking is not thread safe
http://projects.puppetlabs.com/issues/4923

Author: Brice Figureau
Status: Accepted
Priority: High
Assignee: 
Category: threading
Target version: 2.6.x
Affected version: 2.6.1
Keywords: 
Branch: 


Jason Wright discovered that running puppet-load with high concurrency (i.e. > 
10) randomly produced the following error on a multiprocess Passenger 
system:
<pre>
failed: Could not parse YAML data
for node thiscert-isss-forr-thee-healthchecks: syntax error on line
10, col 2: `  domain: thiscert-isss-forr-thee-healthchecks'
</pre>

With the following stacktrace:
<pre>
/usr/lib/ruby/1.8/puppet/indirector/yaml.rb:22:in `find'
/usr/lib/ruby/1.8/puppet/indirector/indirection.rb:208:in `find_in_cache'
/usr/lib/ruby/1.8/puppet/indirector/indirection.rb:184:in `find'
/usr/lib/ruby/1.8/puppet/indirector.rb:50:in `find'
/usr/lib/ruby/1.8/puppet/indirector/catalog/compiler.rb:90:in `find_node'
/usr/lib/ruby/1.8/puppet/indirector/catalog/compiler.rb:114:in `node_from_request'
/usr/lib/ruby/1.8/puppet/indirector/catalog/compiler.rb:32:in `find'
/usr/lib/ruby/1.8/puppet/indirector/indirection.rb:193:in `find'
/usr/lib/ruby/1.8/puppet/indirector.rb:50:in `find'
/usr/lib/ruby/1.8/puppet/network/http/handler.rb:101:in `do_find'
/usr/lib/ruby/1.8/puppet/network/http/handler.rb:68:in `send'
/usr/lib/ruby/1.8/puppet/network/http/handler.rb:68:in `process'
/usr/lib/ruby/1.8/puppet/network/http/rack.rb:51:in `call'
/usr/lib/ruby/1.8/phusion_passenger/rack/request_handler.rb:95:in `process_request'
/usr/lib/ruby/1.8/phusion_passenger/abstract_request_handler.rb:207:in `main_loop'
/usr/lib/ruby/1.8/phusion_passenger/rack/application_spawner.rb:118:in `run'
/usr/lib/ruby/1.8/phusion_passenger/rack/application_spawner.rb:69:in `spawn_application'
/usr/lib/ruby/1.8/phusion_passenger/utils.rb:184:in `safe_fork'
/usr/lib/ruby/1.8/phusion_passenger/rack/application_spawner.rb:62:in `spawn_application'
/usr/lib/ruby/1.8/phusion_passenger/rack/application_spawner.rb:45:in `spawn_application'
/usr/lib/ruby/1.8/phusion_passenger/spawn_manager.rb:159:in `spawn_application'
/usr/lib/ruby/1.8/phusion_passenger/spawn_manager.rb:287:in `handle_spawn_application'
/usr/lib/ruby/1.8/phusion_passenger/abstract_server.rb:352:in `__send__'
/usr/lib/ruby/1.8/phusion_passenger/abstract_server.rb:352:in `main_loop'
/usr/lib/ruby/1.8/phusion_passenger/abstract_server.rb:196:in `start_synchronously'
/usr/lib/phusion_passenger/passenger-spawn-server:61
</pre>

I was able to reproduce the issue locally with a two-process Mongrel setup, and 
found that the cause is corruption of the node facts cache on the master.

It happens frequently with puppet-load because that tool requests the catalog 
of a single node several times concurrently (which is unlikely in 
production).

I found that the Puppet::Util::FileLocking module is not properly thread-safe, 
almost certainly because the reader/writer Ruby Sync lock has a bug (at least 
on MRI 1.8.7).

Here is the failure scenario:
# process 1, thread 1 enters the thread write lock, calls flock in exclusive 
mode, and starts writing the yaml file.
# process 1, thread 2 enters the thread read lock (which shouldn't happen) and 
calls flock in shared mode. This *downgrades* the exclusive file lock to a 
shared lock.
# process 2, thread 1 enters the thread write lock, calls flock in exclusive 
mode _and_ succeeds. It starts writing the yaml file. *Corruption happens*
# process 1, thread 1 resumes and finishes writing the yaml. *The file is corrupted*
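
For illustration only, here is a minimal sketch of the downgrade in step 2. It is not Puppet code: the method name flock_downgrade_demo and the temporary file are made up for the example, and it uses a single handle to show the lock conversion plus a second handle as a probe. It assumes a platform with native flock semantics (such as Linux); where flock is emulated on top of fcntl the probe results may differ.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Puppet). Returns two probe results:
# the first taken while the holder has LOCK_EX, the second after the
# holder has called flock again with LOCK_SH on the same handle.
def flock_downgrade_demo
  Tempfile.create('flock-demo') do |tmp|
    holder = File.open(tmp.path, 'r+')
    probe  = File.open(tmp.path, 'r')
    begin
      # The "writer" takes the exclusive lock.
      holder.flock(File::LOCK_EX)
      # A non-blocking shared request from another handle is refused
      # (File#flock returns false when LOCK_NB would have blocked).
      while_exclusive = probe.flock(File::LOCK_SH | File::LOCK_NB)

      # A second flock call on the already-locked handle converts the
      # lock: the exclusive lock is replaced by a shared one.
      holder.flock(File::LOCK_SH)
      # The same shared request is now granted (File#flock returns 0).
      after_downgrade = probe.flock(File::LOCK_SH | File::LOCK_NB)

      [while_exclusive, after_downgrade]
    ensure
      holder.close
      probe.close
    end
  end
end

p flock_downgrade_demo
```

The point is only that a later shared flock silently replaces an earlier exclusive one, so the writer keeps writing without the protection it believes it has.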

I was able to fix the issue locally by using a mutually exclusive critical 
section (see the forthcoming patch).
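
As a rough idea of what such a critical section could look like (this is a sketch under my own assumptions, not the patch itself): the module name SerializedFileLock and the single process-wide GUARD mutex are hypothetical. Readers and writers in the same process are fully serialized by a plain Mutex, and flock is still taken inside it to coordinate with other processes.

```ruby
require 'thread'
require 'tempfile'

# Hypothetical module, not Puppet::Util::FileLocking.
module SerializedFileLock
  # One plain mutex for all threads of this process: no reader/writer
  # distinction, so a reader can never enter while a writer is active.
  GUARD = Mutex.new

  # Yields the file opened for reading, under a shared flock.
  def self.readlock(path)
    GUARD.synchronize do
      File.open(path, 'r') do |f|
        f.flock(File::LOCK_SH)
        yield f
      end # closing the file releases the flock
    end
  end

  # Yields the file opened for writing, under an exclusive flock.
  def self.writelock(path)
    GUARD.synchronize do
      File.open(path, File::CREAT | File::WRONLY) do |f|
        f.flock(File::LOCK_EX)
        # Truncate only once the exclusive lock is held.
        f.truncate(0)
        yield f
      end
    end
  end
end

# Usage with a throwaway file standing in for a cached node yaml.
Tempfile.create('node-cache') do |tmp|
  SerializedFileLock.writelock(tmp.path) { |f| f.write("--- \nname: example\n") }
  puts SerializedFileLock.readlock(tmp.path) { |f| f.read }
end
```

This trades read concurrency inside one process for simplicity; a per-file mutex would reduce contention, at the cost of managing a table of mutexes.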
