Re: [OpenAFS-devel] OpenAFS on 2.4.26 ? OpenMosix ?

Jeffrey Hutzelman Wed, 15 Dec 2004 11:50:09 -0800

On Wednesday, December 15, 2004 14:02:26 -0500 Terry Gliedt <[EMAIL PROTECTED]> wrote:

####### from /var/log/messages   Watch for line wraps

Unable to handle kernel NULL pointer dereference at virtual address 00000004
 printing eip:
f8b73af8
*pde = 2bcc0001
*pte = 00000000
Oops: 0000
CPU:    2
EIP:    0010:[<f8b73af8>]    Tainted: PF
EFLAGS: 00010282
eax: 20003312   ebx: f8c4be14   ecx: ec6b5dfc   edx: 00000000
esi: f8c4c038   edi: ec6b5da0   ebp: ec6b5da0   esp: ecbbfe40
ds: 0018   es: 0018   ss: 0018
Process cp (pid: 3288, stackpage=ecbbf000)
Stack: f9417000 ecbbe000 00000000 f8c4be14 f8c4c038 ecbbfe90 ec6b5da0 f8b776b2
       ec6b5da0 ec6b5dfc 00000002 ecbbfe90 c0360a00 ec71ad20 00000001 f9417000
       ec6b5dfc f8c4c038 ec6b5dfc 0000ffff 0001e194 00000040 f8ba22c0 f8b78a00
Call Trace:    [<f8b776b2>] [<f8ba22c0>] [<f8b78a00>] [<c01611ed>] [<c0161a22>]
  [<c01620c9>] [<c0162429>] [<c0153443>] [<c016c8d1>] [<c0155f88>] [<c01befd5>]
  [<c01bf0df>] [<c010b8bc>]

Code: 39 42 04 0f 84 c7 00 00 00 e8 3a e7 ff ff 89 c5 50 8d 44 24

That's not surprising. In all of the cases you described where a process randomly seg faults, you should see output like that in /var/log/messages or in dmesg output. There are a wide variety of bad things that, if user code does them, cause the program to exit on a signal like SIGSEGV or SIGBUS, and drop a core file. In Linux, if one of these things happens in kernel code, the process exits on SIGSEGV (no core), and you get an "oops" message which contains information about the state of the kernel at the time of the failure. That's what the message you quoted is.

Unfortunately, the oops message is not useful in its raw form. All of the numbers you see in [<>] are actually addresses inside the kernel. In order for the backtrace to be useful, these need to be converted to symbolic form. This is usually done automatically by the logging software, if it can find the kernel symbol table, which is usually available in a file called "System.map". Since the conversion did not happen automatically, you will need to either find and use ksymoops, or reconfigure the kernel logging software to do the translation, and then reproduce the problem again.
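
As a rough illustration of what that conversion involves, here is a minimal Python sketch that pulls the [<...>] addresses out of an oops and looks each one up in a System.map-style table ("address type name" per line), reporting the nearest symbol at or below the address. This is not how klogd or ksymoops are implemented; the function names and the sample symbol table are made up for the example. Note that System.map only covers the kernel proper, so module addresses such as the f8b7xxxx ones above come back unresolved here, which is why the module symbol information matters.

```python
import bisect
import re

# Hypothetical System.map excerpt; real symbol names and addresses will differ.
SAMPLE_MAP = """\
c0100000 T _stext
c0161000 T hypothetical_func_a
c0161a00 T hypothetical_func_b
c0400000 A _end
"""


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (addr, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return 'name+0xoffset' for the nearest symbol at or below addr.

    Returns None when addr lies outside the table (for example, an address
    inside a loadable module, which System.map does not describe).
    """
    if not symbols or addr > symbols[-1][0]:
        return None
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return f"{name}+0x{addr - base:x}"


def trace_addresses(oops_text):
    """Extract every [<hex>] address from oops text, in order of appearance."""
    return [int(m, 16) for m in re.findall(r"\[<([0-9a-fA-F]{8})>\]", oops_text)]


if __name__ == "__main__":
    table = load_system_map(SAMPLE_MAP.splitlines())
    for addr in trace_addresses("[<f8b776b2>] [<c01611ed>] [<c0161a22>]"):
        print(f"{addr:08x}", resolve(table, addr) or "(not in System.map)")
```

To try it against a real system, you would read the table from the System.map that matches the running kernel instead of SAMPLE_MAP.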

The simplest thing to do is to make sure that klogd can find the System.map file, and that it is not invoked with -x, which turns off address translation. You will probably get the best results by running klogd with -p, so that it reloads module symbol information whenever it sees an oops (otherwise it may not have a complete set of symbols for openafs).

FWIW, I have not heard of anyone getting OpenAFS and OpenMosix to work together, even to the extent that you've reported so far. We have had several reports of failures in the past, though...

-- Jeffrey T. Hutzelman (N3NHS) <[EMAIL PROTECTED]>
  Sr. Research Systems Programmer
  School of Computer Science - Research Computing Facility
  Carnegie Mellon University - Pittsburgh, PA

