On Tue, Feb 16, 2010 at 12:23 PM, Nigel Kersten <nig...@google.com> wrote:
> On Sat, Feb 13, 2010 at 6:13 PM, Joshua Anderson
> <joshua_ander...@mac.com> wrote:
>> I'm afraid that I couldn't reproduce this on a Debian VM with Kai's example.
>
> Joshua, I was just having issues reproducing it as well on a 4 core system.
>
> As soon as I ran 3 instances of:
>
> while : ; do openssl speed; done
>
> to peg 3 of the cores, I could reproduce the same case as Kai initially posted.
>
>               exec {"TEST-EXEC" :
>                       cwd => "/tmp/",
>                       command =>"/usr/bin/touch /tmp/7777 >/tmp/123 2>&1",
>                       timeout => 5,
>                       logoutput=> on_failure
>               }
>
> puppet -v ~/test_exec.pp
> err: //Exec[TEST-EXEC]/returns: change from notrun to 0 failed:
> Command exceeded timeout at /root/test_exec.pp:6

Aha. Cc'ing puppet-dev, as they may have suggestions for the best way forward.

So this isn't a Puppet bug at all.

It looks to be a bug in the Ruby Timeout module, apparently triggered
when most of your cores are busy.

I can reliably reproduce it by starting (n-1) instances of openssl speed,
where n is the number of cores, and then running a command under the
Timeout module:

#!/usr/bin/ruby1.8
#

%x{/usr/bin/touch /tmp/7777}
puts "executed without timeout ok"

puts "executing with timeout"

require 'timeout'

status = Timeout::timeout(5) {
       %x{/usr/bin/touch /tmp/7777}
}

puts "executed with timeout ok"


which will produce something like:

r...@testhost:~# ps auxww|grep [o]penssl
root     22337 99.6  0.0  14616  2028 pts/6    R    15:04   2:52 openssl speed
root     22338 99.9  0.0  14616  2028 pts/6    R    15:04   2:49 openssl speed
root     22339  100  0.0  14616  2024 pts/6    R    15:04   2:49 openssl speed

r...@testhost:~# ~/tickle_ruby.rb
executed without timeout ok
executing with timeout
/usr/lib/ruby/1.8/timeout.rb:60: execution expired (Timeout::Error)
        from /root/tickle_ruby.rb:11

r...@testhost:~# killall openssl
[1]   Terminated              openssl speed &>/dev/null
[2]-  Terminated              openssl speed &>/dev/null
[3]+  Terminated              openssl speed &>/dev/null

r...@testhost:~# ~/tickle_ruby.rb
executed without timeout ok
executing with timeout
executed with timeout ok
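
For anyone who would rather script the load than start the openssl loops by hand, here is a rough sketch of the same reproduction in a single file. The helper names (start_burners, stop_burners, run_under_load) are made up for illustration, the burners are plain Ruby spin loops standing in for openssl speed, and the number of cores to peg is passed in rather than detected. It is written against a current Ruby and I have not checked it on the affected 1.8.7 builds.

```ruby
require 'timeout'

# Hypothetical helper: fork `count` children that spin forever,
# standing in for the `openssl speed` loops used in this thread.
def start_burners(count)
  (1..count).map do
    fork do
      loop { } # pure CPU spin, never returns
    end
  end
end

# Hypothetical helper: kill and reap the children started above.
def stop_burners(pids)
  pids.each do |pid|
    begin
      Process.kill('KILL', pid)
      Process.wait(pid)
    rescue Errno::ESRCH, Errno::ECHILD
      # child is already gone, nothing to clean up
    end
  end
end

# Run `command` under Timeout while `busy` cores are pegged.
# Returns :ok if the command finished in time, :timed_out otherwise.
def run_under_load(command, busy, seconds = 5)
  pids = start_burners(busy)
  begin
    Timeout.timeout(seconds) { system(command) }
    :ok
  rescue Timeout::Error
    :timed_out
  ensure
    stop_burners(pids)
  end
end
```

On a 4 core box the call would be something like run_under_load("/usr/bin/touch /tmp/7777", 3). If the parent is killed before the ensure block runs, the burners are left behind and need a manual kill.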

This affects every Ruby 1.8.7 p249 build I've tried except the MacPorts
one, which appears to carry a number of patches around these issues.
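
As one possible direction for a workaround, here is a minimal sketch of enforcing a deadline on a child process without going through the Timeout module at all: fork and exec the command, then poll for its exit with a non-blocking wait. run_with_deadline is a hypothetical name, not something in Puppet or Ruby, and whether this actually sidesteps the hang on the affected builds is an assumption I have not tested.

```ruby
# Hypothetical helper, not part of Puppet or the Timeout module.
# argv is the command as an array, e.g. ['/usr/bin/touch', '/tmp/7777'].
# Returns the Process::Status if the child exits before the deadline,
# or nil if it had to be killed.
def run_with_deadline(argv, seconds, poll_interval = 0.05)
  pid = fork { exec(*argv) }
  deadline = Time.now + seconds

  loop do
    # With WNOHANG, wait2 returns nil right away while the child is still running.
    reaped, status = Process.wait2(pid, Process::WNOHANG)
    return status if reaped

    if Time.now >= deadline
      Process.kill('KILL', pid)
      Process.wait(pid) # reap the killed child so it does not linger as a zombie
      return nil
    end

    sleep poll_interval
  end
end
```

Usage would look like status = run_with_deadline(['/usr/bin/touch', '/tmp/7777'], 5), with a nil result treated as "Command exceeded timeout". Polling costs a little latency compared to a blocking wait, which seemed an acceptable trade for not depending on a timer thread.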



>
>
>>
>> Here's my attempt:
>>
>> j...@debian:~$ uname -a
>> Linux debian 2.6.18.8-x86_64-linode10 #1 SMP Tue Nov 10 16:29:17 UTC 2009 x86_64 GNU/Linux
>> j...@debian:~$ ruby -v
>> ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
>> j...@debian:~$ puppet --version
>> 0.25.4
>> j...@debian:~$ puppet --debug --trace test.pp
>> Could not retrieve virtual: Permission denied - /proc/xen/capabilities
>> Could not retrieve virtual: Permission denied - /proc/xen/capabilities
>> Could not retrieve virtual: Permission denied - /proc/xen/capabilities
>> Could not retrieve virtual: Permission denied - /proc/xen/capabilities
>> debug: Creating default schedules
>> debug: Failed to load library 'selinux' for feature 'selinux'
>> debug: Failed to load library 'ldap' for feature 'ldap'
>> debug: /File[/home/josh/.puppet/ssl]: Autorequiring File[/home/josh/.puppet]
>> debug: /File[/home/josh/.puppet/var/client_yaml]: Autorequiring File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/ssl/certificate_requests]: Autorequiring File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/var/log]: Autorequiring File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/var/lib]: Autorequiring File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/var/state]: Autorequiring File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/var/clientbucket]: Autorequiring File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/ssl/private_keys]: Autorequiring File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/ssl/certs]: Autorequiring File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/var]: Autorequiring File[/home/josh/.puppet]
>> debug: /File[/home/josh/.puppet/ssl/private]: Autorequiring File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/ssl/public_keys]: Autorequiring File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/var/state/graphs]: Autorequiring File[/home/josh/.puppet/var/state]
>> debug: /File[/home/josh/.puppet/var/facts]: Autorequiring File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/var/run]: Autorequiring File[/home/josh/.puppet/var]
>> debug: Finishing transaction 23715921915640 with 0 changes
>> info: Applying configuration version '1266113402'
>> debug: //testmodule/Exec[TEST-EXEC]: Changing returns
>> debug: //testmodule/Exec[TEST-EXEC]: 1 change(s)
>> debug: //testmodule/Exec[TEST-EXEC]: Executing '/usr/bin/touch /tmp/7777 >/tmp/123 2>&1'
>> debug: Executing '/usr/bin/touch /tmp/7777 >/tmp/123 2>&1'
>> notice: //testmodule/Exec[TEST-EXEC]/returns: executed successfully
>> debug: Finishing transaction 23715922698720 with 1 changes
>> j...@debian:~$
>>
>> -Josh
>>
>>
>> On Feb 13, 2010, at 9:49 AM, Nigel Kersten wrote:
>>
>>> Note too that the same bug should be affecting Debian testing and
>>> unstable if the Ruby 1.8.7 p249 package is the problem.
>>>
>>> Surely we have some people running Debian testing on the list? Seeing
>>> any weird timeouts with execs?
>>>
>>>
>>>
>>> On Fri, Feb 12, 2010 at 11:57 AM, Joel Ebel <jbe...@google.com> wrote:
>>>> Kai, and anyone else experiencing this problem, please go vote, and
>>>> optionally chime in with any details you can provide on:
>>>> https://bugs.launchpad.net/ubuntu/+source/ruby1.8/+bug/520715
>>>>
>>>> Thanks,
>>>> Joel
>>>>
>>>> On Feb 11, 3:06 pm, Joel Ebel <jbe...@google.com> wrote:
>>>>> I've reported this bug to Ubuntu.  The solution is to rebuild ruby1.8
>>>>> without pthreads, unless ruby fixes the bug upstream which causes the
>>>>> hang.
>>>>>
>>>>> https://bugs.launchpad.net/ubuntu/+source/ruby1.8/+bug/520715
>>>>>
>>>>> Joel
>>>>>
>>>>> On Feb 10, 2:42 pm, Nigel Kersten <nig...@google.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> On Wed, Feb 10, 2010 at 11:48 AM, Nigel Kersten <nig...@google.com> wrote:
>>>>>>> On Tue, Feb 9, 2010 at 5:06 AM, kai.steverding
>>>>>>> <kai.steverd...@googlemail.com> wrote:
>>>>>>>> I installed ruby on the above server and tried with a simple exec-test:
>>>>>
>>>>>>>> class testmodule {
>>>>>>>>                exec {"TEST-EXEC" :
>>>>>>>>                        cwd => "/tmp/",
>>>>>>>>                        command =>"/usr/bin/touch /tmp/7777 >/tmp/123 2>&1",
>>>>>>>>                        timeout => 5,
>>>>>>>>                        logoutput=> on_failure
>>>>>>>>                }
>>>>>>>> }
>>>>>
>>>>>>>> This simple thing gets the following output from "puppet --debug --test":
>>>>>
>>>>>>>> debug: Loaded state in 0.00 seconds
>>>>>>>> info: Applying configuration version '1265719507'
>>>>>>>> debug: //testmodule/Exec[TEST-EXEC]: Changing returns
>>>>>>>> debug: //testmodule/Exec[TEST-EXEC]: 1 change(s)
>>>>>>>> debug: //testmodule/Exec[TEST-EXEC]: Executing '/usr/bin/touch /tmp/7777'
>>>>>>>> debug: Executing '/usr/bin/touch /tmp/7777'
>>>>>>>> err: //testmodule/Exec[TEST-EXEC]/returns: change from notrun to 0 failed: Command exceeded timeout at /etc/puppet/modules/testmodule/manifests/init.pp:6
>>>>>>>> debug: Finishing transaction 69914685668640 with 1 changes
>>>>>>>> debug: Storing state
>>>>>>>> debug: Stored state in 0.01 seconds
>>>>>>>> debug: Format pson not supported for Puppet::Transaction::Report; has not implemented method 'from_pson'
>>>>>>>> debug: Format s not supported for Puppet::Transaction::Report; has not implemented method 'from_s'
>>>>>
>>>>>>>> What can I do? Did I make a mistake, or is exec broken?
>>>>>
>>>>>>> Kai, something is definitely broken in Lucid.
>>>>>
>>>>>>> We're seeing all sorts of process exec issues.
>>>>>
>>>>>>> Have you nailed this down at all?
>>>>>
>>>>>> So Kai, we've been doing some experimenting here today, and have
>>>>>> reproduced these hangs in all the Debian Ruby1.8 packages back to
>>>>>> 1.8.7.174-2.
>>>>>
>>>>>> We've been unable to reproduce it on 1.8.7.174-1, though.
>>>>>
>>>>>> From the changelog I'm wondering if the first entry under 174-2 is
>>>>>> responsible. Note this was later removed after upstream integrated it.
>>>>>
>>>>>> ruby1.8 (1.8.7.174-2) unstable; urgency=medium
>>>>>
>>>>>>    [ akira yamada ]
>>>>>>    * Added debian/patches/090811_thread_and_select.dpatch: threads may
>>>>>>      hangup when IO.select called from two or more threads.
>>>>>>    * Added debian/patches/090812_finalizer_at_exit.dpatch: finalizers
>>>>>>      should be run at exit (Closes: #534241)
>>>>>>    * Added debian/patches/090812_class_clone_segv.dpatch: avoid segv
>>>>>>      when an object cloned.  (Closes: #533329)
>>>>>>    * Added debian/patches/090812_eval_long_exp_segv.dpatch: fix segv
>>>>>>      when eval a long expression.  (Closes: #510561)
>>>>>>    * Added debian/patches/090812_openssl_x509_warning.dpatch: suppress
>>>>>>      warning from OpenSSL::X509::ExtensionFactory.  (Closes: #489443)
>>>>>
>>>>>>    [ Lucas Nussbaum ]
>>>>>>    * Removed Fumitoshi UKAI <u...@debian.or.jp> from Uploaders. Thanks a
>>>>>>      lot for the past help! Closes: #541037
>>>>>
>>>>>>    [ Daigo Moriwaki ]
>>>>>>    * debian/fixshebang.sh: skip non-text files, which works around
>>>>>>      hanging of sed on scanning gif images.
>>>>>>    * Bumped up Standards-Version to 3.8.2.
>>>>>
>>>>>> --
>>>>>> nigel
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "Puppet Users" group.
>>>> To post to this group, send email to puppet-us...@googlegroups.com.
>>>> To unsubscribe from this group, send email to 
>>>> puppet-users+unsubscr...@googlegroups.com.
>>>> For more options, visit this group at 
>>>> http://groups.google.com/group/puppet-users?hl=en.
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> nigel
>>>
>>
>
>
>
> --
> nigel
>



-- 
nigel

