Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-18 Thread Américo Wang
On Wed, Nov 18, 2009 at 8:53 AM, David VomLehn dvoml...@cisco.com wrote:
 On Tue, Nov 17, 2009 at 04:28:22PM -0800, Eric W. Biederman wrote:
 David VomLehn dvoml...@cisco.com writes:

  On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
  ...
  Why not use the kdump hook?  If you handle a kernel panic that way
  you get enhanced reliability and full user space support.  All in a hook
  that is already present and already works.
 ...
  1. In what ways would this be better than, say, a panic_notifier?

 A couple of ways.

 - You are doing the work in a known good kernel instead of the kernel
   that just paniced for some unknown reason.
 - All of the control logic is in user space (not the kernel) so you can
   potentially do something as simple as date  logfile to get the
   date.

 I think I see better what you're suggesting--passing the info to a kdump
 kernel and having it do whatever it wants. I don't think I want to do this,
 but I haven't used any of the kexec() stuff, so I may be missing the point.
 Some more context:

 My application is an embedded one, and one of the big things I need to do
 after a failure is to bring up a fully functional kernel ASAP. Once I have
 that kernel, I process all of the crash data in user space concurrently with
 running my main application. Because I'm embedded, I'm very limited in how
 much crash data I can save over a reboot, how much I can store, and how
 much I can send to a central collection point. This is good, since it doesn't
 take up a lot of resources, but core dumps are out of the question.


I think the problem of kdump is that it uses much memory to hold
the core, i.e. /proc/vmcore, and no way to free it unless using another
reboot. This is why Fedora only does some data-collection in the second
reboot after crash, and then reboots.

I got an idea many days ago, that is providing a way to delete /proc/vmcore
in the second reboot, so that we can have enough memory to continue without
another reboot. I am not sure if Eric likes this? Eric?


 As I understand kdump, I would also need to have a second kernel in memory
 to do the kdump work. It wouldn't need to be as big is the kernel that
 failed, but it would still require a significant amount of memory. On an
 embedded system, the idle memory may be a luxury we can't afford.


You can use only one kernel, as long as it is relocatable.


 I think this makes a kdump-based solution difficult, but if it can meet
 my requirements, I'd much rather use it (I've been following kdump since
 it's inception quite a few years ago, but it hasn't seemed a good match
 for embedded Linux). Does this still sound like a good match?

What do you think about my idea above? If we had that, would kdump
meet your requirements?



  2. Where would you suggest tying in? (Particularly since not all 
  architectures
     currently support kdump)

 No changes are needed kernel side.  You just need an appropriate kernel and
 initrd for your purpose.

 I think I must still be missing something. I have dynamic data that I want
 to preserve as I reboot from a failed kernel to a fresh new kernel. By
 the time the fresh kernel starts init, the dynamic data (IP addresses, MAC
 addresses) has been re-written with new values. This is why I'm trying to
 preserve, but I may be running without disk or flash. This patch doesn't
 preserve the data, but it gets it into the kernel so that it can be
 preserved. At present, I'm preserving the data in a panic_notifier function,
 but I am not wedded to that. At present, the data will be copied to a
 section of memory retained across boots, but I know others will want to
 write to flash.



I believe you can get everything from /proc/vmcore, if you use kexec,
after crash, with some tools like 'crash'.

 All of the interesting architectures support kexec, and if an
 architecture doesn't it isn't hard to add.  The architecture specific
 part is very simple.  A pain to debug initially but very simple.

 I use MIPS processors, and it looks like it is supported. So long as it's
 stable, I'm happy to use it.

MIPS seems to have some kexec() support, but after looking at
arch/mips/kernel/machine_kexec.c, maybe the support is still
broken? But anyway, you are welcome to work on it. :)
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-18 Thread Tim Bird
Eric W. Biederman wrote:
 Matt Mackall m...@selenic.com writes:
 
 As much as I like kexec, it loses on memory footprint by about 100x.
 It's not appropriate for all use cases, especially things like
 consumer-grade wireless access points and phones.
 
 In general I agree.  The cost of a second kernel and initrd can be
 prohibitive in the smallest systems, and if you do a crash capture
 with using a standalone app that is reinventing the wheel.
 
 That said.  I can happily run kdump with only 16M-20M reserved.
 So on many systems the cost is affordable.

Understood.  On some of my systems, the memory budget for the
entire system is 10M.  On most systems I work with, it is a
struggle to reserve even 64K for this feature.
  -- Tim

=
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-18 Thread Eric W. Biederman
Tim Bird tim.b...@am.sony.com writes:

 Eric W. Biederman wrote:
 Matt Mackall m...@selenic.com writes:
 
 As much as I like kexec, it loses on memory footprint by about 100x.
 It's not appropriate for all use cases, especially things like
 consumer-grade wireless access points and phones.
 
 In general I agree.  The cost of a second kernel and initrd can be
 prohibitive in the smallest systems, and if you do a crash capture
 with using a standalone app that is reinventing the wheel.
 
 That said.  I can happily run kdump with only 16M-20M reserved.
 So on many systems the cost is affordable.

 Understood.  On some of my systems, the memory budget for the
 entire system is 10M.  On most systems I work with, it is a
 struggle to reserve even 64K for this feature.

crash_kexec is really a glorified jump.  It is possible to do a lot in
64K with a standalone application.  If reliable capture of kernel
crashes is desirable to an embedded NAND device I expect a semi-general
purpose dedicated application for capturing at least dmesg from the
crashed kernel and write it to a file on a NAND filesystem could be
worth someones time.

On general purpose hardware we use a kernel and an initrd simply to
reduce the development work of supporting everything and the kitchen
sink.  My impression is that embedded systems can afford a little more
setup time, and a custom compilation, and that the hardware you would like
to store things too is much more common.

Eric

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-17 Thread Artem Bityutskiy
On Tue, 2009-11-17 at 13:45 +0100, Marco Stornelli wrote:
 2009/11/17 Artem Bityutskiy dedeki...@gmail.com:
  Take a look at my mails where I describe different complications we have
  in our system. We really want to have an OOPS/panic + our environment
  stuff to go together, at once. This makes things so much simpler.
 
  Really, what is the problem providing this trivial panic-note
  capability, where user-space can give the kernel a small buffer, and ask
  the kernel to print this buffer at the oops/panic time. Very simple and
  elegant, and just solves the problem.
 
  Why perversions with time-stamps, separate storages are needed?
 
  IOW, you suggest a complicated approach, and demand explaining why we do
  not go for it. Simply because it is unnecessarily complex.
 
 I don't think it's a complicated approach we are talking of a system
 log like syslog with a temporal information, nothing more.

We need to store this information of NAND flash. Implementing logs on
NAND flash is about handling bad blocks, choosing format of records, and
may be even handling wear-levelling. This is not that simple.

And then I have match oops to the userspace environment prints, using I
guess timestamps, which is also about complications in userspace.

  This patch solves the problem gracefully, and I'd rather demand you to 
  point what
  is the technical problem with the patches.
 
 
 Simply because I think that we should avoid to include in the kernel
 things we can do in a simply way at user space level.

If it is much easier to have in the kernel, then this argument does not
work, IMHO.

  I think this
 patch is well done but it's one of the patches that are solutions for
 embedded only, but it's only my opinion.

Also IMHO, but having embedded-only things is not bad at all.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-17 Thread Eric W. Biederman
Artem Bityutskiy dedeki...@gmail.com writes:

 On Tue, 2009-11-17 at 13:45 +0100, Marco Stornelli wrote:
 2009/11/17 Artem Bityutskiy dedeki...@gmail.com:
  Take a look at my mails where I describe different complications we have
  in our system. We really want to have an OOPS/panic + our environment
  stuff to go together, at once. This makes things so much simpler.
 
  Really, what is the problem providing this trivial panic-note
  capability, where user-space can give the kernel a small buffer, and ask
  the kernel to print this buffer at the oops/panic time. Very simple and
  elegant, and just solves the problem.
 
  Why perversions with time-stamps, separate storages are needed?
 
  IOW, you suggest a complicated approach, and demand explaining why we do
  not go for it. Simply because it is unnecessarily complex.
 
 I don't think it's a complicated approach we are talking of a system
 log like syslog with a temporal information, nothing more.

 We need to store this information of NAND flash. Implementing logs on
 NAND flash is about handling bad blocks, choosing format of records, and
 may be even handling wear-levelling. This is not that simple.

 And then I have match oops to the userspace environment prints, using I
 guess timestamps, which is also about complications in userspace.

  This patch solves the problem gracefully, and I'd rather demand you to 
  point what
  is the technical problem with the patches.
 
 
 Simply because I think that we should avoid to include in the kernel
 things we can do in a simply way at user space level.

 If it is much easier to have in the kernel, then this argument does not
 work, IMHO.

  I think this
 patch is well done but it's one of the patches that are solutions for
 embedded only, but it's only my opinion.

 Also IMHO, but having embedded-only things is not bad at all.

Why not use the kdump hook?  If you handle a kernel panic that way
you get enhanced reliability and full user space support.  All in a hook
that is already present and already works.

Eric

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-17 Thread Marco Stornelli
Artem Bityutskiy wrote:
 On Tue, 2009-11-17 at 13:45 +0100, Marco Stornelli wrote:
 2009/11/17 Artem Bityutskiy dedeki...@gmail.com:

 We need to store this information of NAND flash. Implementing logs on
 NAND flash is about handling bad blocks, choosing format of records, and
 may be even handling wear-levelling. This is not that simple.
 
 And then I have match oops to the userspace environment prints, using I
 guess timestamps, which is also about complications in userspace.
 

Indeed my suggestion was to use a persistent ram, not difficult to use.

 This patch solves the problem gracefully, and I'd rather demand you to 
 point what
 is the technical problem with the patches.

 Simply because I think that we should avoid to include in the kernel
 things we can do in a simply way at user space level.
 
 If it is much easier to have in the kernel, then this argument does not
 work, IMHO.
 
  I think this
 patch is well done but it's one of the patches that are solutions for
 embedded only, but it's only my opinion.
 
 Also IMHO, but having embedded-only things is not bad at all.
 

In the past other patches are not accepted in main line for this, maybe
you'll be luckier.

Marco
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-17 Thread David VomLehn
On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
...
 Why not use the kdump hook?  If you handle a kernel panic that way
 you get enhanced reliability and full user space support.  All in a hook
 that is already present and already works.

I'm a big fan of avoiding reinvention of the wheel--if I can use something
already present, I will. However, I'm not clear about how much of the problem
I'm addressing will be solved by using a kdump hook. If I understand
correctly, you'd still need a pseudo-file somewhere to actually get the data
from user space to kernel space. *Then* you could use a kdump hook to
transfer the data to flash or some memory area that will be retained across
boots. Is this the approach to which you were referring? If so, I have a
couple more questions:

1. In what ways would this be better than, say, a panic_notifier?
2. Where would you suggest tying in? (Particularly since not all architectures
   currently support kdump)

 Eric

David VL
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-17 Thread Eric W. Biederman
David VomLehn dvoml...@cisco.com writes:

 On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
 ...
 Why not use the kdump hook?  If you handle a kernel panic that way
 you get enhanced reliability and full user space support.  All in a hook
 that is already present and already works.

 I'm a big fan of avoiding reinvention of the wheel--if I can use something
 already present, I will. However, I'm not clear about how much of the problem
 I'm addressing will be solved by using a kdump hook. If I understand
 correctly, you'd still need a pseudo-file somewhere to actually get the data
 from user space to kernel space. *Then* you could use a kdump hook to
 transfer the data to flash or some memory area that will be retained across
 boots. Is this the approach to which you were referring? If so, I have a
 couple more questions:

 1. In what ways would this be better than, say, a panic_notifier?

A couple of ways. 

- You are doing the work in a known good kernel instead of the kernel
  that just paniced for some unknown reason.
- All of the control logic is in user space (not the kernel) so you can
  potentially do something as simple as date  logfile to get the
  date.

 2. Where would you suggest tying in? (Particularly since not all architectures
currently support kdump)

No changes are needed kernel side.  You just need an appropriate kernel and
initrd for your purpose.

All of the interesting architectures support kexec, and if an
architecture doesn't it isn't hard to add.  The architecture specific
part is very simple.  A pain to debug initially but very simple.

Eric

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-17 Thread David VomLehn
On Tue, Nov 17, 2009 at 04:28:22PM -0800, Eric W. Biederman wrote:
 David VomLehn dvoml...@cisco.com writes:
 
  On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
  ...
  Why not use the kdump hook?  If you handle a kernel panic that way
  you get enhanced reliability and full user space support.  All in a hook
  that is already present and already works.
...
  1. In what ways would this be better than, say, a panic_notifier?
 
 A couple of ways. 
 
 - You are doing the work in a known good kernel instead of the kernel
   that just paniced for some unknown reason.
 - All of the control logic is in user space (not the kernel) so you can
   potentially do something as simple as date  logfile to get the
   date.

I think I see better what you're suggesting--passing the info to a kdump
kernel and having it do whatever it wants. I don't think I want to do this,
but I haven't used any of the kexec() stuff, so I may be missing the point.
Some more context:

My application is an embedded one, and one of the big things I need to do
after a failure is to bring up a fully functional kernel ASAP. Once I have
that kernel, I process all of the crash data in user space concurrently with
running my main application. Because I'm embedded, I'm very limited in how
much crash data I can save over a reboot, how much I can store, and how
much I can send to a central collection point. This is good, since it doesn't
take up a lot of resources, but core dumps are out of the question.

As I understand kdump, I would also need to have a second kernel in memory
to do the kdump work. It wouldn't need to be as big is the kernel that
failed, but it would still require a significant amount of memory. On an
embedded system, the idle memory may be a luxury we can't afford.

I think this makes a kdump-based solution difficult, but if it can meet
my requirements, I'd much rather use it (I've been following kdump since
it's inception quite a few years ago, but it hasn't seemed a good match
for embedded Linux). Does this still sound like a good match?

  2. Where would you suggest tying in? (Particularly since not all 
  architectures
 currently support kdump)
 
 No changes are needed kernel side.  You just need an appropriate kernel and
 initrd for your purpose.

I think I must still be missing something. I have dynamic data that I want
to preserve as I reboot from a failed kernel to a fresh new kernel. By
the time the fresh kernel starts init, the dynamic data (IP addresses, MAC
addresses) has been re-written with new values. This is why I'm trying to
preserve, but I may be running without disk or flash. This patch doesn't
preserve the data, but it gets it into the kernel so that it can be
preserved. At present, I'm preserving the data in a panic_notifier function,
but I am not wedded to that. At present, the data will be copied to a
section of memory retained across boots, but I know others will want to
write to flash.

 All of the interesting architectures support kexec, and if an
 architecture doesn't it isn't hard to add.  The architecture specific
 part is very simple.  A pain to debug initially but very simple.

I use MIPS processors, and it looks like it is supported. So long as it's
stable, I'm happy to use it.

 Eric

David VL
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-17 Thread Matt Mackall
On Tue, 2009-11-17 at 16:28 -0800, Eric W. Biederman wrote:
 David VomLehn dvoml...@cisco.com writes:
 
  On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
  ...
  Why not use the kdump hook?  If you handle a kernel panic that way
  you get enhanced reliability and full user space support.  All in a hook
  that is already present and already works.
 
  I'm a big fan of avoiding reinvention of the wheel--if I can use something
  already present, I will. However, I'm not clear about how much of the 
  problem
  I'm addressing will be solved by using a kdump hook. If I understand
  correctly, you'd still need a pseudo-file somewhere to actually get the data
  from user space to kernel space. *Then* you could use a kdump hook to
  transfer the data to flash or some memory area that will be retained across
  boots. Is this the approach to which you were referring? If so, I have a
  couple more questions:
 
  1. In what ways would this be better than, say, a panic_notifier?
 
 A couple of ways. 
 
 - You are doing the work in a known good kernel instead of the kernel
   that just paniced for some unknown reason.
 - All of the control logic is in user space (not the kernel) so you can
   potentially do something as simple as date  logfile to get the
   date.
 
  2. Where would you suggest tying in? (Particularly since not all 
  architectures
 currently support kdump)
 
 No changes are needed kernel side.  You just need an appropriate kernel and
 initrd for your purpose.
 
 All of the interesting architectures support kexec, and if an
 architecture doesn't it isn't hard to add.  The architecture specific
 part is very simple.  A pain to debug initially but very simple.

As much as I like kexec, it loses on memory footprint by about 100x.
It's not appropriate for all use cases, especially things like
consumer-grade wireless access points and phones.

-- 
http://selenic.com : development and support for Mercurial and Linux


--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-14 Thread Marco Stornelli
I think in general the procedure should be: at startup or event (for
example acquired IP address from DHCP) user applications write in flash
(better in persistent ram) a log with a tag or a timestamp or something
like this, when there is a kernel panic, it is captured in a file stored
together the log and when possible the system should send all via
network for example. Are there problems that I can't see to follow this
approach? When David says ...so this looks much more like a real file
than a sysctl file I quite agree, it seems a normal application/system
log indeed.

Marco

Artem Bityutskiy wrote:
 On Fri, 2009-11-13 at 09:10 +0100, Simon Kagstrom wrote:
 On Thu, 12 Nov 2009 16:56:49 -0500
 David VomLehn dvoml...@cisco.com wrote:

 Good question. Some more detail on our application might help. In some
 situations, we may have no disk and only enough flash for the bootloader.
 The kernel is downloaded over the network. When we get to user space, we
 initialize a number of things dynamically. For example, we dynamically
 compute some MAC address, and most of the IP addresses are obtained with
 DHCP. This are very useful to have for panic analysis.

 Since there is neither flash nor disk, user space has no place to store
 this information, should the kernel panic. When we come back up, we will get
 different MAC and IP addresses. Storing them in memory is our only hope.

 Fortunately, there is a section of RAM that the bootloader promises not
 to overwrite. On a panic, we capture the messages written on the console
 and store them in the protected area. If the information from the
 /proc file is written as part of the panic, we will capture it, too.
 Can't you solve this completely from userspace using phram and mtdoops
 instead? I.e., setup two phram areas

  modprobe phram 4...@start-of-your-area,4...@start-of-your-area+4k# 
 Can't remember the exact syntax!

 you'll then get /dev/mtdX and /dev/mtdX+1 for these two. You can then do

  modprobe mtdoops mtddev=/dev/mtdX+1 dump_oops=0

 to load mtdoops to catch the panic in the second area, and just write
 your userspace messages to /dev/mtdX.
 
 This might work for them, not sure, but not for us. We store panics on
 flash, and later they are automatically sent to the panic collection
 system via the network. And the complications are:
 
 1. There may be many panics before the device has network access and has
 a chance to send the panics.
 2. User can re-flash the device with different SW inbetween.
 
 So we really need to print some user-space supplied information during
 the panic, and then we store it on flash with mtdoops, and the later,
 when the device has network access we send whole bunch of oopses via the
 network.
 
 One thing probably have to be fixed though: I don't think phram has a
 panic_write, which will be needed by mtdoops to catch the panic - this
 should be trivial to add though since it's plain RAM.
 
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-13 Thread Simon Kagstrom
On Thu, 12 Nov 2009 16:56:49 -0500
David VomLehn dvoml...@cisco.com wrote:

 Good question. Some more detail on our application might help. In some
 situations, we may have no disk and only enough flash for the bootloader.
 The kernel is downloaded over the network. When we get to user space, we
 initialize a number of things dynamically. For example, we dynamically
 compute some MAC address, and most of the IP addresses are obtained with
 DHCP. This are very useful to have for panic analysis.
 
 Since there is neither flash nor disk, user space has no place to store
 this information, should the kernel panic. When we come back up, we will get
 different MAC and IP addresses. Storing them in memory is our only hope.
 
 Fortunately, there is a section of RAM that the bootloader promises not
 to overwrite. On a panic, we capture the messages written on the console
 and store them in the protected area. If the information from the
 /proc file is written as part of the panic, we will capture it, too.

Can't you solve this completely from userspace using phram and mtdoops
instead? I.e., setup two phram areas

modprobe phram 4...@start-of-your-area,4...@start-of-your-area+4k# 
Can't remember the exact syntax!

you'll then get /dev/mtdX and /dev/mtdX+1 for these two. You can then do

modprobe mtdoops mtddev=/dev/mtdX+1 dump_oops=0

to load mtdoops to catch the panic in the second area, and just write
your userspace messages to /dev/mtdX.


One thing probably have to be fixed though: I don't think phram has a
panic_write, which will be needed by mtdoops to catch the panic - this
should be trivial to add though since it's plain RAM.

// Simon
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-13 Thread Artem Bityutskiy
On Thu, 2009-11-12 at 12:06 -0600, Matt Mackall wrote:
 On Wed, 2009-11-11 at 21:13 -0500, David VomLehn wrote:
  Allows annotation of panics to include platform information. It's no big
  deal to collect information, but way helpful when you are collecting
  failure reports from a eventual base of millions of systems deployed in
  other people's homes.
 
 I'd like to hear a bit more use case motivation on this feature. Also,
 why do you want more than a page?

We also need this kind of functionality. The use case is very simple.
Every time the kernel oopeses, we save the oops information on the flash
using mtdoops module. There is even core support, which should be merged
to 2.6.33, see this patch:

http://git.infradead.org/users/dedekind/l2-mtd-2.6.git/commit/832c3d00e82f267316a2b53634631a1821eebae8

(and there was a corresponding discussion on lkml).

And what we want is to dump information about the user-space environment
at the same time to the oops. Specifically, we want to dump information
about what was the SW build number.

And we want this information to be printed at the same time, because we
cannot run any user-space at the panic time. This information is later
read from the flash and sent via the network to the central place. And
by the time it is sent, the user may have already re-flashed his device
with something else.

So I very much appreciate this patch, although I think it should use the
panic notifiers instead of calling a function directly from the panic.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-13 Thread Shargorodsky Atal (EXT-Teleca/Helsinki)
On Thu, 2009-11-12 at 23:09 +0100, ext David VomLehn wrote:
 On Thu, Nov 12, 2009 at 02:50:41PM -0500, Paul Gortmaker wrote:
  David VomLehn wrote:
  Allows annotation of panics to include platform information. It's no big
  deal to collect information, but way helpful when you are collecting
  failure reports from a eventual base of millions of systems deployed in
  other people's homes.
 ...
  Why hook into panic() directly like this, vs. using the panic
  notifier list? If you use that, and then put the data handling
  magic that you need into your own kernel module that knows how
  to interface with the reporting apps that you have, you can
  do the whole thing without having to alter existing code, I think.
 
 I agree--a panic notifier list is probably a better approach.
 

That's what we currently use:

http://marc.info/?l=linux-kernelm=125655380512117w=2

Very simple and does the job.
Requires to rearrange the panic code a bit, though.
Namely, to move notifiers invocation before the call to kmsg_dump().
Dumpers were introduced by Simon Kagstrom here:

http://marc.info/?l=linux-kernelm=125569530109871w=2


Regards,
Atal

 David VL
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-13 Thread Artem Bityutskiy
On Fri, 2009-11-13 at 12:59 +0100, Simon Kagstrom wrote:
 (Also fix David Woodhouses address and add Atal)
 
 On Fri, 13 Nov 2009 13:45:48 +0200
 Artem Bityutskiy dedeki...@gmail.com wrote:
 
  So we really need to print some user-space supplied information during
  the panic, and then we store it on flash with mtdoops, and the later,
  when the device has network access we send whole bunch of oopses via the
  network.
 
 Yes, I see that your case would have to be handled differently. A
 complication (which I believe was discussed before) is that kmsg_dump()
 is done before the panic notifiers are called.
 
 The reason I put it there is to have it before crash_kexec(), so I
 guess we'll have to take up the discussion on what to do with it. For
 me it now seems like it would be OK to move kmsg_dump() down below the
 panic notifiers.
 
 If you have a kdump kernel to load, then you will most likely not need
 the kmsg dumped data anyway.

Yeah, I think this is a separate issue which can be fixed separately.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-12 Thread Marco Stornelli
Sincerely, I don't understand why we should involve the kernel to gather
this kind of information when we can use other (user-space) tools, only
to have all in a single report maybe? I think it's a bit weak reason
to include this additional behavior in the kernel.

David VomLehn ha scritto:
 Allows annotation of panics to include platform information. It's no big
 deal to collect information, but way helpful when you are collecting
 failure reports from a eventual base of millions of systems deployed in
 other people's homes.
 

Marco
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-12 Thread Matt Mackall
On Wed, 2009-11-11 at 21:13 -0500, David VomLehn wrote:
 Allows annotation of panics to include platform information. It's no big
 deal to collect information, but way helpful when you are collecting
 failure reports from a eventual base of millions of systems deployed in
 other people's homes.

I'd like to hear a bit more use case motivation on this feature. Also,
why do you want more than a page?

-- 
http://selenic.com : development and support for Mercurial and Linux


--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-12 Thread Paul Gortmaker

David VomLehn wrote:

Allows annotation of panics to include platform information. It's no big
deal to collect information, but way helpful when you are collecting
failure reports from a eventual base of millions of systems deployed in
other people's homes.

One of the biggest reasons this is an RFC is that I'm uncomfortable with
putting the pseudo-file that holds the annotation information in /proc.
Different layers of the software stack may drop dynamic information, such
as DHCP-supplied IP addresses, in here as they come up. This means it's
necessary to be able to append to the end of the annotation, so this looks
much more like a real file than a sysctl file.  It also has multiple lines,
which doesn't look a sysctl file. Annotation can be viewed as a debug thing,
so maybe it belongs in debugfs, but people seem to be doing somewhat different
things with that filesystem.

So, suggestions on this issue, and any others are most welcome. If there a
better way to do this, I'll be happy to use it.

Signed-off-by: David VomLehn dvoml...@cisco.com
---



--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -70,6 +70,7 @@ NORET_TYPE void panic(const char * fmt, ...)
vsnprintf(buf, sizeof(buf), fmt, args);
va_end(args);
printk(KERN_EMERG Kernel panic - not syncing: %s\n,buf);
+   panic_note_print();
 #ifdef CONFIG_DEBUG_BUGVERBOSE
dump_stack();
 #endif



Why hook into panic() directly like this, vs. using the panic
notifier list? If you use that, and then put the data handling
magic that you need into your own kernel module that knows how
to interface with the reporting apps that you have, you can
do the whole thing without having to alter existing code, I think.

Paul.


diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 30df586..bade7a1 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1045,6 +1045,14 @@ config DMA_API_DEBUG
  This option causes a performance degredation.  Use only if you want
  to debug device drivers. If unsure, say N.
 
+config PANIC_NOTE

+   bool Create file for user space data to be reported at panic time
+   default n
+   help
+ This creates a pseudo-file, named /proc/panic_note, into which
+ user space data can be written. If a panic occurs, the contents
+ of the file will be included in the failure report.
+
 source samples/Kconfig
 
 source lib/Kconfig.kgdb


--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-12 Thread David VomLehn
On Thu, Nov 12, 2009 at 01:06:51PM -0500, Matt Mackall wrote:
 On Wed, 2009-11-11 at 21:13 -0500, David VomLehn wrote:
  Allows annotation of panics to include platform information. It's no big
  deal to collect information, but way helpful when you are collecting
  failure reports from a eventual base of millions of systems deployed in
  other people's homes.
 
 I'd like to hear a bit more use case motivation on this feature. Also,
 why do you want more than a page?

Hopefully, I have addressed the first question in my previous email. As
for the second, I doubt there is a need for more than a page. I just picked
a value to start developing with. This is still a work in progress...

 http://selenic.com : development and support for Mercurial and Linux

David VL
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH, RFC] panic-note: Annotation from user space for panics

2009-11-12 Thread David VomLehn
On Thu, Nov 12, 2009 at 02:50:41PM -0500, Paul Gortmaker wrote:
 David VomLehn wrote:
 Allows annotation of panics to include platform information. It's no big
 deal to collect information, but way helpful when you are collecting
 failure reports from a eventual base of millions of systems deployed in
 other people's homes.
...
 Why hook into panic() directly like this, vs. using the panic
 notifier list? If you use that, and then put the data handling
 magic that you need into your own kernel module that knows how
 to interface with the reporting apps that you have, you can
 do the whole thing without having to alter existing code, I think.

I agree--a panic notifier list is probably a better approach.

David VL
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html