Re: Udev-165 (apparently) System Crash

2011-01-11 Thread alupu
On Thu Jan 6 2011 09:24 PM, William Immendorf wrote:
 Intriguing.  This is something that you should report to both
 LKML and linux-hotplug (the Udev list).
 Also, be sure to provide the kdump of that kernel, the log,
 and the hardware that you think is causing the Udev issue.

Hi William,

First, apologies for getting back to you so late.
I deduce you are familiar with Kdump _and_ (B)LFS so maybe
you can help me further, beyond sending me to LKML and Udev list
(anybody can obviously chime in here with their thoughts):

In a nutshell, the basic functionality of Kdump is, as we all know,
for the second (dump-capture) kernel to boot as soon as the first
(system) kernel crashes, and then to capture the preserved memory of
the first (crashed) system and write it somewhere for later analysis.
So far so good.

On a (B)LFS system I see two problems (issues, as they say):

1. On boot, the dump-capture kernel goes through the same steps
 as the original (system) kernel (mountkernfs, modules, UDEV, swap ...),
 finds a corrupt file system, fsck's it and then reboots into the
 first (bad) system, and all is lost.

2. In a particular situation like mine (as described in the intro to this
 thread), the crash happens early in the boot (at UDEV activation) which
 makes the possible interaction between the crashing system and
 dump-capture one even more intriguing.

As an aside, I suspect these two complications may have something to do
with the (B)LFS procedures seemingly not covering system crashes.
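
One possible way around issue 1 is to have a very early bootscript check for /proc/vmcore, which exists only when the running kernel is the dump-capture kernel, and save it before fsck or any reboot can happen. A minimal sketch, assuming a hypothetical helper named save_vmcore and a destination directory that is already on a writable filesystem (neither is part of the stock LFS bootscripts):

```shell
#!/bin/sh
# Hypothetical helper, not part of the LFS bootscripts.
# usage: save_vmcore [VMCORE_PATH] [DEST_DIR]
save_vmcore() {
    vmcore=${1:-/proc/vmcore}
    dest=${2:-/var/crash}

    # No vmcore means this is not the dump-capture kernel:
    # return 1 so the normal boot sequence can continue.
    [ -e "$vmcore" ] || return 1

    mkdir -p "$dest" || return 2
    # Copy the preserved memory image and flush it to disk right away.
    cp "$vmcore" "$dest/vmcore" || return 2
    sync
    return 0
}
```

A bootscript could call it as `save_vmcore /proc/vmcore /mnt/dump && exec sh`, so that the capture kernel stops at a shell instead of carrying on into the broken boot. The mount point /mnt/dump is an assumption.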

Thanks,
-- Alex
-- 
http://linuxfromscratch.org/mailman/listinfo/lfs-support
FAQ: http://www.linuxfromscratch.org/lfs/faq.html
Unsubscribe: See the above information page


Re: Udev-165 (apparently) System Crash

2011-01-08 Thread alupu
On Thu, 2011-01-06 at 02:08 AM, Simon Geard wrote:

 ... udevd supports a couple of options that might help, --debug and 
 --debug-trace.

1.  FWIW, udevd-165 does not (no longer?) have a --debug-trace option.
 Maybe it's undocumented now (shades of Undocumented DOS of yore :)
 Please see 'man udevd'.

2.  For argument's sake, udevd writes to sys.log and daemon.log (identically).

 There are a few things that intrigue me (which might help in my
 non-kdump troubleshooting, if answered positively):

2.1. Udevd is active way before the main partition becomes read-WRITE
 (stemming from a chicken and egg situation, I suppose).
 Be that as it may, where's the mechanism that writes the accumulated
 udevd log  from memory(?) to disk (/var/log/) ultimately (i.e. delayed)?

2.2. Can udevd (syslogd?) output be originally redirected to a different
 partition where it can go instantaneously, so I can find it postmortem
 and deduce the hardware element that bothers Udev?
 syslogd started early for this test, etc.

2.3. More simplistically, can I temporarily mount the main partition writable as
 soon as Udev discovers it and somehow start syslogd?
 As I mentioned, the crash occurs a few lines before the end of Udev run,
 i.e. long after Udev sees the partitions.
 Would syslogd start writing udevd output to disk immediately?
 That would involve creating an appropriate rule with some action in it, etc.
 Maybe splitting the udevd logic in two phases..., etc.

I'd appreciate your thoughts.
If 2.1-2.3 above are too crazy, please disregard with my apologies.

Thanks,
-- Alex


Re: Udev-165 (apparently) System Crash

2011-01-08 Thread Neal Murphy
On Saturday 08 January 2011 13:44:23 al...@verizon.net wrote:
 On Thu, 2011-01-06 at 02:08 AM, Simon Geard wrote:
  ... udevd supports a couple of options that might help, --debug and
  --debug-trace.
 
 1.  FWIW, udevd-165 does not (no longer?) have a --debug-trace option.
  Maybe it's undocumented now (shades of Undocumented DOS of yore :)
  Please see 'man udevd'.
 
 2.  For argument's sake, udevd writes to sys.log and daemon.log
 (identically).
 
  There are a few things that intrigue me (which might help in my
  non-kdump troubleshooting, if answered positively):
 
 2.1. Udevd is active way before the main partition becomes read-WRITE
  (stemming from a chicken and egg situation, I suppose).
  Be that as it may, where's the mechanism that writes the accumulated
  udevd log  from memory(?) to disk (/var/log/) ultimately (i.e. delayed)?
 
 2.2. Can udevd (syslogd?) output be originally redirected to a different
  partition where it can go instantaneously, so I can find it postmortem
  and deduce the hardware element that bothers Udev?
  syslogd started early for this test, etc.
 
 2.3. More simplistically, can I temporarily mount the main partition
 writable as soon as Udev discovers it and somehow start syslogd?
  As I mentioned, the crash occurs a few lines before the end of Udev run,
  i.e. long after Udev sees the partitions.
  Would syslogd start writing udevd output to disk immediately?
  That would involve creating an appropriate rule with some action in it,
 etc. Maybe splitting the udevd logic in two phases..., etc.
 
 I'd appreciate your thoughts.
 If 2.1-2.3 above are too crazy, please disregard with my apologies.

While I was integrating udev into my test/dev version of Smoothwall, I had to
go through some 'contortions' when the system was in early boot (running in
the initramfs). At this point, syslogd is not running and there is no hard
drive available; logging is pretty much WYSIWYG. What follows is 'generic'
Linux boot processes and may not mesh well with how LFS is actually booted.

You may have to get your hands *really* dirty. You can modify the initramfs
image (unpack it, modify it, repack it, reboot) by changing its init script to
direct udev's STDOUT and STDERR to a file (in the tmpfs, of course). After udev
runs (in the initramfs' init script), you can run a shell to examine the file.
The bootup process will be suspended until you exit the shell.
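
The unpack/modify/repack cycle described above can be sketched as follows. This assumes the initramfs is a gzip-compressed "newc" cpio archive, which is common but not universal; the function names and every path are illustrative only.

```shell
#!/bin/sh
# Sketch only. Assumes a gzip-compressed "newc" cpio initramfs;
# the function names and all paths are illustrative.

# Turn a possibly relative path into an absolute one.
abs_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s\n' "$PWD/$1" ;;
    esac
}

# usage: unpack_initramfs IMAGE WORKDIR
unpack_initramfs() {
    image=$(abs_path "$1")
    mkdir -p "$2" || return 1
    # Subshell, so the caller's working directory is left alone.
    ( cd "$2" && gzip -dc "$image" | cpio -idm )
}

# usage: repack_initramfs WORKDIR NEW_IMAGE
repack_initramfs() {
    out=$(abs_path "$2")
    # NEW_IMAGE should live outside WORKDIR, or it would be packed into itself.
    ( cd "$1" && find . | cpio -o -H newc | gzip -9 > "$out" )
}
```

Typical use, run as root so that device nodes and ownership survive: `unpack_initramfs /boot/initramfs.img /tmp/irfs`, edit /tmp/irfs/init so that udev's output goes to a file and a shell is started afterwards, then `repack_initramfs /tmp/irfs /boot/initramfs-debug.img` and boot the new image. Writing to a new file name keeps the original image available as a fallback.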

But you say 'the crash'. If the system is crashing (I, red-facedly, haven't
paid attention), then you might try populating the initramfs' /dev with device
nodes for your hard drive and modifying the initramfs' init script to mount
the hard drive. Then run udev as verbosely as possible and redirect its stdout
and stderr to a file on the hard drive.
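
A minimal sketch of that approach for the initramfs' init script follows. The function name, the device numbers in the usage note and the log file name are assumptions for illustration; only `udevd --debug`, `udevadm trigger` and `udevadm settle` are taken from udev itself.

```shell
#!/bin/sh
# Sketch for an initramfs init script; names and numbers are illustrative.
# usage: log_udev_to_disk DEVICE MAJOR MINOR MOUNTPOINT
log_udev_to_disk() {
    dev=$1; major=$2; minor=$3; mnt=$4

    # Create the block device node by hand, since udev has not run yet.
    [ -b "$dev" ] || mknod "$dev" b "$major" "$minor" || return 1

    mkdir -p "$mnt" || return 1
    # "sync" pushes every write to the disk at once, so the log should
    # survive a hard freeze up to the last message written.
    mount -o sync "$dev" "$mnt" || return 1

    # udevd --debug prints debug messages to stderr; capture both streams.
    udevd --debug > "$mnt/udev-debug.log" 2>&1 &

    # Replay the kernel's device events and wait until udevd has handled them.
    udevadm trigger --action=add
    udevadm settle
}
```

For a first SCSI/SATA partition the call might look like `log_udev_to_disk /dev/sda1 8 1 /mnt/log`. After the freeze, the last lines of udev-debug.log on that partition should point at the event being processed when the system died.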

Be warned that playing in initramfs (early boot) is almost a black art. Things
we take for granted, like standard TTY features, standard device nodes, and
ordinary debugging tools may not exist. It's not as limiting as having only 18
toggles on the front panel, but it's still a bother.

If you are truly desperate, you can always put the entire LFS / tree into the
initramfs image. It'll be slow to unpack, but you'll have all the tools at your
fingertips. You *will* need a lot of RAM to do this, though.


Re: Udev-165 (apparently) System Crash

2011-01-08 Thread alupu
On Saturday 08 January 2011 02:48 PM Neal Murphy wrote:
 While I was integrating udev into my test/dev version of Smoothwall ...

Hi Neal,

This is only to acknowledge and thank you for your detailed comments.

I haven't had time to go into any depth at all, what with the NFL playoffs and
all, but judging strictly by the size of your comments, I may be on the brink
of going kdump after all.  I am a bit traumatized by kdump, I remember finding
little support for the heavy procedure, + the actual submission which is
another put off.
If desperate, I'll go the whole 9 yards (maybe even longer).  My only hope is
if I stall long enough I won't need to do anything any more:

online.wsj.com/article/SB10001424052970204527804576043803826627110.html?mod=WSJ_hp_mostpop_read

I'll try to absorb and follow your friendly and valuable advice soon.

Thanks again,
-- Alex


Re: Udev-165 (apparently) System Crash

2011-01-06 Thread alupu

On Thu, 2011-01-06 at 02:08:45 AM, Simon Geard wrote:
 ... if the system is completely frozen, it may be hardware related ...

Hi Simon,

Very good points, overall.
When it crashes it's frozen all right (i.e. a crash crash).
No hardware changes as of late.
Seems some old hardware the latest Udev iteration started disagreeing with.
I have a software-"mirror" system, more modern, SATA based, which still boots
very smoothly.
If I correctly read the frozen screen, and taking into account the Udev step
where the crash occurs, seems an IRQ conflict of some sort.
I'll dig into it.

The important thing for me is you appear to support me, even if indirectly,
in my attempt to avoid kdump in my troubleshoot, if at all possible :)

Thank you very much,
Best Wishes,
-- Alex


Re: Udev-165 (apparently) System Crash

2011-01-06 Thread William Immendorf
On Thu, Jan 6, 2011 at 6:36 PM,  al...@verizon.net wrote:
 If I correctly read the frozen screen, and taking into account the Udev step
 where the crash occurs, seems an IRQ conflict of some sort.
 I'll dig into it.
Intriguing. This is something that you should report to both LKML and
linux-hotplug (the Udev list). Also, be sure to provide the kdump of
that kernel, the log, and the hardware that you think is causing the
Udev issue.


-- 
William Immendorf
The ultimate in free computing.
Messages in plain text, please, no HTML.
GPG key ID: 1697BE98
If it's not signed, it's not from me.

--

Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Re: Udev-165 (apparently) System Crash

2011-01-06 Thread Simon Geard
On Thu, 2011-01-06 at 18:36 -0600, al...@verizon.net wrote:
 The important thing for me is you appear to support me, even if
 indirectly, in my attempt to avoid kdump in my troubleshoot, if at all
 possible :)

Happy to help.

Simon.




Re: Udev-165 (apparently) System Crash

2011-01-05 Thread Simon Geard
On Wed, 2011-01-05 at 20:12 -0600, al...@verizon.net wrote:
 QUESTION
 Is there a simple way of analyzing the crash, maybe with help of
 the specialists here, as opposed to formally reporting it?

Never tried, myself, but udevd supports a couple of options that might
help, --debug and --debug-trace. You might try adding those to the
command in /etc/rc.d/init.d/udev.

Also, if you mount the partition from another system, you can look at
the logs in /var/log, and see if anything looks relevant.
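
For the log inspection just mentioned, a small sketch. It assumes the crashed system's root partition has been mounted (ideally read-only) on a second machine and that the logs are named sys.log and daemon.log, as reported elsewhere in this thread; the function name and mount point are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: show the last udev-related lines from the logs of a
# root filesystem mounted somewhere other than /.
# usage: show_udev_log MOUNTED_ROOT [LINES]
show_udev_log() {
    root=$1; lines=${2:-40}
    for log in "$root/var/log/sys.log" "$root/var/log/daemon.log"; do
        # Skip logs that do not exist on this system.
        [ -r "$log" ] || continue
        echo "== $log =="
        # Keep only udev messages, then the most recent ones.
        grep 'udev' "$log" | tail -n "$lines"
    done
}
```

For example, after `mount -o ro /dev/sdb2 /mnt/lfs` (the device name is an assumption), `show_udev_log /mnt/lfs 20` prints the last 20 udev lines from each log. If the freeze happens before the root partition is remounted read-write, these files may hold nothing from the failed boot.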

Finally, if the system is completely frozen, it may be hardware related
- either an actual hardware problem, or broken software talking to
hardware. So, you could try removing/disconnecting any unnecessary
devices, and see if the problem goes away.

Simon.

