Hello John,
Thanks for your answer. I have opened an issue with my hardware manufacturer
and I will do the same with my OS vendor.
Anyway, I am pasting the strace listings so maybe someone can shed light on it:

server1:

BIOS: American Megatrends Inc. 1.2
SYS: Supermicro X8SIE
CPU: Intel(R) Core(TM) i3 CPU 550 @ 3.20GHz [4 cores]
MEM:
  SLOT0  2048 MB
  SLOT1  2048 MB


open("/usr/lib/ruby/1.8/facter/osfamily.rb", O_RDONLY|O_LARGEFILE) = 3
close(3) = 0
open("/usr/lib/ruby/1.8/facter/osfamily.rb", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=800, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7297000
read(3, "# Fact: osfamily\n#\n# Purpose: Re"..., 4096) = 800
......CRASH


server2:

BIOS: American Megatrends Inc. 1.2
SYS: Supermicro X8SIE
CPU: Intel(R) Core(TM) i3 CPU 560 @ 3.33GHz [4 cores]
MEM:
  SLOT0  2048 MB
  SLOT1  2048 MB



stat64("/usr/sbin/dmidecode", {st_mode=S_IFREG|0755, st_size=48408, ...}) = 0
pipe([3, 4]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb74e5ba8) = 8709
close(4) = 0
fcntl64(3, F_GETFL) = 0 (flags O_RDONLY)
fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb725e000
_llseek(3, 0, 0xbf900930, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
read(3, "# dmidecode 2.9\nSMBIOS 2.6 prese"..., 1024) = 1024
read(3, "oot is supported\n\t\tBIOS boot spe"..., 1024) = 1024
read(3, "tate: Safe\n\tThermal State: Safe\n"..., 1024) = 1024
read(3, "Maximum Size: 128 KB\n\tSupported "..., 1024) = 1024
read(3, "e 5, 28 bytes\nMemory Controller "..., 1024) = 1024
read(3, " Installed\n\tError Status: OK\n\nHa"..., 1024) = 1024
read(3, " type 8, 9 bytes\nPort Connector "..., 1024) = 1024
read(3, "ternal Reference Designator: LPT"..., 1024) = 1024
read(3, "nal Reference Designator: Not Sp"..., 1024) = 1024
read(3, "nator: Not Specified\n\tExternal C"..., 1024) = 1024
read(3, "or Type: None\n\tPort Type: Other\n"..., 1024) = 1024
read(3, "ector Information\n\tInternal Refe"..., 1024) = 1024
read(3, "\tLength: Short\n\tID: 1\n\tCharacter"..., 1024) = 1024
read(3, "escriptor 5: POST error\n\tData Fo"..., 1024) = 1024
read(3, "ype 19, 15 bytes\nMemory Array Ma"..., 1024) = 1024
read(3, " Width: Unknown\n\tSize: No Module"..., 1024) = 1024
read(3, "ry Device Mapped Address\n\tStarti"..., 1024) = 1024
read(3, "on Handle: Not Provided\n\tTotal W"..., 1024) = 1024
--- SIGCHLD (Child exited) @ 0 (0) ---
read(3, "\n\nHandle 0x0039, DMI type 20, 19"..., 1024) = 1024
read(3, "on-recoverable Threshold: 6\n\nHan"..., 1024) = 1024
read(3, "UT OF SPEC>\n\tCooling Unit Group:"..., 1024) = 1024
read(3, "ed: Yes\n\tHot Replaceable: No\n\tCo"..., 1024) = 669
read(3, "", 1024) = 0
close(3) = 0
munmap(0xb725e000, 4096) = 0
rt_sigaction(SIGHUP, {SIG_IGN}, {0xb77388f0, [HUP], SA_RESTART}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_IGN}, {0xb77388f0, [QUIT], SA_RESTART}, 8) = 0
rt_sigaction(SIGINT, {SIG_IGN}, {0xb77388f0, [INT], SA_RESTART}, 8) = 0
waitpid(8709, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 8709
rt_sigaction(SIGHUP, {0xb77388f0, [HUP], SA_RESTART}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGQUIT, {0xb77388f0, [QUIT], SA_RESTART}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGINT, {0xb77388f0, [INT], SA_RESTART}, {SIG_IGN}, 8) = 0
............
sigprocmask(SIG_SETMASK, [], NULL) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_SETMASK, [], NULL) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
.............
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_SETMASK, [], NULL) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
.........
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
.......CRASH
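
In case it helps anyone comparing several of these listings, here is a minimal Python sketch that pulls the last file paths out of a pasted strace log. It is only an illustration: the helper names join_wrapped and last_paths are my own, it assumes the simple "syscall(args) = result" layout shown above, and it re-joins lines that the mail client wrapped. The last path in a trace is only a hint, since the crash may have cut the log short.

```python
import re

# Start of a syscall line, e.g.: open("/path", O_RDONLY) = 3
SYSCALL_RE = re.compile(r'^(\w+)\((.*)$')
# An absolute path given as the first (quoted) argument of the call
PATH_RE = re.compile(r'^"(/[^"]*)"')


def join_wrapped(lines):
    """Re-join lines that were wrapped in transit (e.g. a lone '0xb7297000')."""
    joined = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # A new record is a syscall, a signal marker ("--- SIG... ---"),
        # or one of the dotted elision/CRASH markers used in the listings.
        starts_new = (
            SYSCALL_RE.match(line)
            or line.startswith("---")
            or line.startswith(".")
        )
        if joined and not starts_new:
            joined[-1] += " " + line.strip()  # continuation of previous record
        else:
            joined.append(line)
    return joined


def last_paths(lines, count=3):
    """Return the last `count` paths seen as a first argument,
    collapsing consecutive repeats of the same path."""
    paths = []
    for line in join_wrapped(lines):
        call = SYSCALL_RE.match(line)
        if not call:
            continue
        path = PATH_RE.match(call.group(2))
        if path and (not paths or paths[-1] != path.group(1)):
            paths.append(path.group(1))
    return paths[-count:]
```

For example, with a trace saved to a file (the name facter.strace is hypothetical), last_paths(open("facter.strace")) returns the most recent paths that facter touched before the log ends.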


2012/11/26 jcbollinger <john.bollin...@stjude.org>

>
>
> On Thursday, November 22, 2012 6:23:06 AM UTC-6, Mon wrote:
>>
>>
>>
>>
>> Hello all,
>>>
>>> We have a problem with puppet on a certain kind of machine in our farm
>>> (300+): those with a Supermicro X8SIE motherboard. Sometimes when running
>>> puppet the machine crashes: we lose access to it, logging in through IPMI
>>> shows nothing on the console, and the only thing we can do is a cold
>>> reboot. If we then run puppet again, nothing happens. If we run puppet
>>> several days later, it may or may not crash again; it is random.
>>> I debugged the problem and concluded that the cause is running "facter":
>>> running it in an mpssh session caused 7 or 8 crashes on different
>>> machines.
>>>
>>> Software versions:
>>> OS: Ubuntu 8.04
>>> facter                         1.5.4-1ubuntu1
>>> puppet                         0.25.1-2
>>>
>>> After upgrading to facter 1.6.11-1 (the latest .deb from puppetlabs for
>>> hardy), the crashes continued.
>>>
>>>
>> Sorry, I sent that before finishing.
>>
>> I managed to capture some traces with "strace" that I can paste if you
>> think it would help.
>>
>> Has anyone experienced something like this?
>>
>>
>>
>
>
>
> For what it's worth, Facter itself is unlikely to be crashing your system,
> but it runs a variety of commands that probe system details, and it's
> possible that one or a combination of those sometimes crashes them.  It
> should be possible to crash the systems by running the same commands from
> the shell.
>
> If you have straces of facter sessions that resulted in crashes then they
> might be illuminating.  The key thing I would be looking for is what
> commands Facter is trying to run when the crashes occurred.  Unfortunately,
> the nature of the problem precludes being certain that the last thing in
> the captured trace is actually the thing Facter was trying to do when the
> crash happened.
>
> If there is a software bug then it is probably in a separate tool or in
> the OS kernel.  It might also be that you have a firmware (i.e. BIOS) bug
> on the affected systems, or even that the particular motherboard model that
> is affected has a design or fabrication flaw.
>
>
> John
>
>  --
> You received this message because you are subscribed to the Google Groups
> "Puppet Users" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/puppet-users/-/uRikgvYaJN8J.
>
> To post to this group, send email to puppet-users@googlegroups.com.
> To unsubscribe from this group, send email to
> puppet-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/puppet-users?hl=en.
>
