Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Wed, Nov 18, 2009 at 8:53 AM, David VomLehn dvoml...@cisco.com wrote:
> On Tue, Nov 17, 2009 at 04:28:22PM -0800, Eric W. Biederman wrote:
>> David VomLehn dvoml...@cisco.com writes:
>>> On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
>>>> ...
>>>> Why not use the kdump hook? If you handle a kernel panic that way you get enhanced reliability and full user space support. All in a hook that is already present and already works.
>>> ...
>>> 1. In what ways would this be better than, say, a panic_notifier?
>>
>> A couple of ways.
>> - You are doing the work in a known good kernel instead of the kernel that just panicked for some unknown reason.
>> - All of the control logic is in user space (not the kernel) so you can potentially do something as simple as date logfile to get the date.
>
> I think I see better what you're suggesting--passing the info to a kdump kernel and having it do whatever it wants. I don't think I want to do this, but I haven't used any of the kexec() stuff, so I may be missing the point.
>
> Some more context: My application is an embedded one, and one of the big things I need to do after a failure is to bring up a fully functional kernel ASAP. Once I have that kernel, I process all of the crash data in user space concurrently with running my main application. Because I'm embedded, I'm very limited in how much crash data I can save over a reboot, how much I can store, and how much I can send to a central collection point. This is good, since it doesn't take up a lot of resources, but core dumps are out of the question.

I think the problem with kdump is that it uses a lot of memory to hold the core, i.e. /proc/vmcore, and there is no way to free it short of another reboot. This is why Fedora only does some data collection in the second reboot after a crash, and then reboots.

I had an idea many days ago: provide a way to delete /proc/vmcore in the second reboot, so that we have enough memory to continue without another reboot. I am not sure whether Eric likes this. Eric?

> As I understand kdump, I would also need to have a second kernel in memory to do the kdump work. It wouldn't need to be as big as the kernel that failed, but it would still require a significant amount of memory. On an embedded system, the idle memory may be a luxury we can't afford.

You can use only one kernel, as long as it is relocatable.

> I think this makes a kdump-based solution difficult, but if it can meet my requirements, I'd much rather use it (I've been following kdump since its inception quite a few years ago, but it hasn't seemed a good match for embedded Linux).
>
> Does this still sound like a good match?

What do you think about my idea above? If we had that, would kdump meet your requirements?

>>> 2. Where would you suggest tying in? (Particularly since not all architectures currently support kdump)
>>
>> No changes are needed kernel side. You just need an appropriate kernel and initrd for your purpose.
>
> I think I must still be missing something. I have dynamic data that I want to preserve as I reboot from a failed kernel to a fresh new kernel. By the time the fresh kernel starts init, the dynamic data (IP addresses, MAC addresses) has been re-written with new values. This is why I'm trying to preserve, but I may be running without disk or flash.
>
> This patch doesn't preserve the data, but it gets it into the kernel so that it can be preserved. At present, I'm preserving the data in a panic_notifier function, but I am not wedded to that. At present, the data will be copied to a section of memory retained across boots, but I know others will want to write to flash.

I believe you can get everything from /proc/vmcore after a crash, if you use kexec, with tools like 'crash'.

>> All of the interesting architectures support kexec, and if an architecture doesn't it isn't hard to add. The architecture specific part is very simple. A pain to debug initially but very simple.
>
> I use MIPS processors, and it looks like it is supported. So long as it's stable, I'm happy to use it.

MIPS seems to have some kexec() support, but after looking at arch/mips/kernel/machine_kexec.c, maybe the support is still broken? But anyway, you are welcome to work on it. :)

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
Eric W. Biederman wrote:
> Matt Mackall m...@selenic.com writes:
>> As much as I like kexec, it loses on memory footprint by about 100x. It's not appropriate for all use cases, especially things like consumer-grade wireless access points and phones.
>
> In general I agree. The cost of a second kernel and initrd can be prohibitive in the smallest systems, and if you do a crash capture using a standalone app, that is reinventing the wheel.
>
> That said. I can happily run kdump with only 16M-20M reserved. So on many systems the cost is affordable.

Understood. On some of my systems, the memory budget for the entire system is 10M. On most systems I work with, it is a struggle to reserve even 64K for this feature.

-- Tim

Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
Tim Bird tim.b...@am.sony.com writes:
> Eric W. Biederman wrote:
>> Matt Mackall m...@selenic.com writes:
>>> As much as I like kexec, it loses on memory footprint by about 100x. It's not appropriate for all use cases, especially things like consumer-grade wireless access points and phones.
>>
>> In general I agree. The cost of a second kernel and initrd can be prohibitive in the smallest systems, and if you do a crash capture using a standalone app, that is reinventing the wheel.
>>
>> That said. I can happily run kdump with only 16M-20M reserved. So on many systems the cost is affordable.
>
> Understood. On some of my systems, the memory budget for the entire system is 10M. On most systems I work with, it is a struggle to reserve even 64K for this feature.

crash_kexec is really a glorified jump. It is possible to do a lot in 64K with a standalone application.

If reliable capture of kernel crashes is desirable on an embedded NAND device, I expect that a semi-general-purpose dedicated application that captures at least dmesg from the crashed kernel and writes it to a file on a NAND filesystem could be worth someone's time.

On general purpose hardware we use a kernel and an initrd simply to reduce the development work of supporting everything and the kitchen sink. My impression is that embedded systems can afford a little more setup time, and a custom compilation, and that the hardware you would like to store things to is much more common.

Eric
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Tue, 2009-11-17 at 13:45 +0100, Marco Stornelli wrote:
> 2009/11/17 Artem Bityutskiy dedeki...@gmail.com:
>> Take a look at my mails where I describe different complications we have in our system. We really want to have an OOPS/panic + our environment stuff to go together, at once. This makes things so much simpler.
>>
>> Really, what is the problem with providing this trivial panic-note capability, where user-space can give the kernel a small buffer, and ask the kernel to print this buffer at oops/panic time? Very simple and elegant, and it just solves the problem.
>>
>> Why are perversions with time-stamps and separate storage needed? IOW, you suggest a complicated approach, and demand an explanation of why we do not go for it. Simply because it is unnecessarily complex.
>
> I don't think it's a complicated approach. We are talking about a system log like syslog with temporal information, nothing more.

We need to store this information on NAND flash. Implementing logs on NAND flash is about handling bad blocks, choosing the format of records, and maybe even handling wear-levelling. This is not that simple.

And then I have to match the oops to the userspace environment prints, using I guess timestamps, which also means complications in userspace.

>> This patch solves the problem gracefully, and I'd rather ask you to point out what the technical problem with the patches is.
>
> Simply because I think that we should avoid including in the kernel things we can do in a simple way at user space level.

If it is much easier to have in the kernel, then this argument does not work, IMHO.

> I think this patch is well done but it's one of the patches that are solutions for embedded only, but it's only my opinion.

Also IMHO, but having embedded-only things is not bad at all.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
Artem Bityutskiy dedeki...@gmail.com writes:
> On Tue, 2009-11-17 at 13:45 +0100, Marco Stornelli wrote:
>> 2009/11/17 Artem Bityutskiy dedeki...@gmail.com:
>>> Take a look at my mails where I describe different complications we have in our system. We really want to have an OOPS/panic + our environment stuff to go together, at once. This makes things so much simpler.
>>>
>>> Really, what is the problem with providing this trivial panic-note capability, where user-space can give the kernel a small buffer, and ask the kernel to print this buffer at oops/panic time? Very simple and elegant, and it just solves the problem.
>>>
>>> Why are perversions with time-stamps and separate storage needed? IOW, you suggest a complicated approach, and demand an explanation of why we do not go for it. Simply because it is unnecessarily complex.
>>
>> I don't think it's a complicated approach. We are talking about a system log like syslog with temporal information, nothing more.
>
> We need to store this information on NAND flash. Implementing logs on NAND flash is about handling bad blocks, choosing the format of records, and maybe even handling wear-levelling. This is not that simple.
>
> And then I have to match the oops to the userspace environment prints, using I guess timestamps, which also means complications in userspace.
>
>>> This patch solves the problem gracefully, and I'd rather ask you to point out what the technical problem with the patches is.
>>
>> Simply because I think that we should avoid including in the kernel things we can do in a simple way at user space level.
>
> If it is much easier to have in the kernel, then this argument does not work, IMHO.
>
>> I think this patch is well done but it's one of the patches that are solutions for embedded only, but it's only my opinion.
>
> Also IMHO, but having embedded-only things is not bad at all.

Why not use the kdump hook? If you handle a kernel panic that way you get enhanced reliability and full user space support. All in a hook that is already present and already works.
Eric
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
Artem Bityutskiy wrote:
> On Tue, 2009-11-17 at 13:45 +0100, Marco Stornelli wrote:
>> 2009/11/17 Artem Bityutskiy dedeki...@gmail.com:
>
> We need to store this information on NAND flash. Implementing logs on NAND flash is about handling bad blocks, choosing the format of records, and maybe even handling wear-levelling. This is not that simple.
>
> And then I have to match the oops to the userspace environment prints, using I guess timestamps, which also means complications in userspace.

Indeed, my suggestion was to use persistent RAM, which is not difficult to use.

>>> This patch solves the problem gracefully, and I'd rather ask you to point out what the technical problem with the patches is.
>>
>> Simply because I think that we should avoid including in the kernel things we can do in a simple way at user space level.
>
> If it is much easier to have in the kernel, then this argument does not work, IMHO.
>
>> I think this patch is well done but it's one of the patches that are solutions for embedded only, but it's only my opinion.
>
> Also IMHO, but having embedded-only things is not bad at all.

In the past, other patches were not accepted into mainline for this reason; maybe you'll be luckier.

Marco
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
> ...
> Why not use the kdump hook? If you handle a kernel panic that way you get enhanced reliability and full user space support. All in a hook that is already present and already works.

I'm a big fan of avoiding reinvention of the wheel--if I can use something already present, I will. However, I'm not clear about how much of the problem I'm addressing will be solved by using a kdump hook. If I understand correctly, you'd still need a pseudo-file somewhere to actually get the data from user space to kernel space. *Then* you could use a kdump hook to transfer the data to flash or some memory area that will be retained across boots.

Is this the approach to which you were referring? If so, I have a couple more questions:

1. In what ways would this be better than, say, a panic_notifier?
2. Where would you suggest tying in? (Particularly since not all architectures currently support kdump)

> Eric

David VL
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
David VomLehn dvoml...@cisco.com writes:
> On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
>> ...
>> Why not use the kdump hook? If you handle a kernel panic that way you get enhanced reliability and full user space support. All in a hook that is already present and already works.
>
> I'm a big fan of avoiding reinvention of the wheel--if I can use something already present, I will. However, I'm not clear about how much of the problem I'm addressing will be solved by using a kdump hook. If I understand correctly, you'd still need a pseudo-file somewhere to actually get the data from user space to kernel space. *Then* you could use a kdump hook to transfer the data to flash or some memory area that will be retained across boots.
>
> Is this the approach to which you were referring? If so, I have a couple more questions:
>
> 1. In what ways would this be better than, say, a panic_notifier?

A couple of ways.
- You are doing the work in a known good kernel instead of the kernel that just panicked for some unknown reason.
- All of the control logic is in user space (not the kernel) so you can potentially do something as simple as date logfile to get the date.

> 2. Where would you suggest tying in? (Particularly since not all architectures currently support kdump)

No changes are needed kernel side. You just need an appropriate kernel and initrd for your purpose.

All of the interesting architectures support kexec, and if an architecture doesn't it isn't hard to add. The architecture specific part is very simple. A pain to debug initially but very simple.

Eric
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Tue, Nov 17, 2009 at 04:28:22PM -0800, Eric W. Biederman wrote:
> David VomLehn dvoml...@cisco.com writes:
>> On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
>>> ...
>>> Why not use the kdump hook? If you handle a kernel panic that way you get enhanced reliability and full user space support. All in a hook that is already present and already works.
>> ...
>> 1. In what ways would this be better than, say, a panic_notifier?
>
> A couple of ways.
> - You are doing the work in a known good kernel instead of the kernel that just panicked for some unknown reason.
> - All of the control logic is in user space (not the kernel) so you can potentially do something as simple as date logfile to get the date.

I think I see better what you're suggesting--passing the info to a kdump kernel and having it do whatever it wants. I don't think I want to do this, but I haven't used any of the kexec() stuff, so I may be missing the point.

Some more context: My application is an embedded one, and one of the big things I need to do after a failure is to bring up a fully functional kernel ASAP. Once I have that kernel, I process all of the crash data in user space concurrently with running my main application. Because I'm embedded, I'm very limited in how much crash data I can save over a reboot, how much I can store, and how much I can send to a central collection point. This is good, since it doesn't take up a lot of resources, but core dumps are out of the question.

As I understand kdump, I would also need to have a second kernel in memory to do the kdump work. It wouldn't need to be as big as the kernel that failed, but it would still require a significant amount of memory. On an embedded system, the idle memory may be a luxury we can't afford.

I think this makes a kdump-based solution difficult, but if it can meet my requirements, I'd much rather use it (I've been following kdump since its inception quite a few years ago, but it hasn't seemed a good match for embedded Linux).

Does this still sound like a good match?

>> 2. Where would you suggest tying in? (Particularly since not all architectures currently support kdump)
>
> No changes are needed kernel side. You just need an appropriate kernel and initrd for your purpose.

I think I must still be missing something. I have dynamic data that I want to preserve as I reboot from a failed kernel to a fresh new kernel. By the time the fresh kernel starts init, the dynamic data (IP addresses, MAC addresses) has been re-written with new values. This is why I'm trying to preserve, but I may be running without disk or flash.

This patch doesn't preserve the data, but it gets it into the kernel so that it can be preserved. At present, I'm preserving the data in a panic_notifier function, but I am not wedded to that. At present, the data will be copied to a section of memory retained across boots, but I know others will want to write to flash.

> All of the interesting architectures support kexec, and if an architecture doesn't it isn't hard to add. The architecture specific part is very simple. A pain to debug initially but very simple.

I use MIPS processors, and it looks like it is supported. So long as it's stable, I'm happy to use it.

> Eric

David VL
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Tue, 2009-11-17 at 16:28 -0800, Eric W. Biederman wrote:
> David VomLehn dvoml...@cisco.com writes:
>> On Tue, Nov 17, 2009 at 10:45:43AM -0500, Eric W. Biederman wrote:
>>> ...
>>> Why not use the kdump hook? If you handle a kernel panic that way you get enhanced reliability and full user space support. All in a hook that is already present and already works.
>>
>> I'm a big fan of avoiding reinvention of the wheel--if I can use something already present, I will. However, I'm not clear about how much of the problem I'm addressing will be solved by using a kdump hook. If I understand correctly, you'd still need a pseudo-file somewhere to actually get the data from user space to kernel space. *Then* you could use a kdump hook to transfer the data to flash or some memory area that will be retained across boots.
>>
>> Is this the approach to which you were referring? If so, I have a couple more questions:
>>
>> 1. In what ways would this be better than, say, a panic_notifier?
>
> A couple of ways.
> - You are doing the work in a known good kernel instead of the kernel that just panicked for some unknown reason.
> - All of the control logic is in user space (not the kernel) so you can potentially do something as simple as date logfile to get the date.
>
>> 2. Where would you suggest tying in? (Particularly since not all architectures currently support kdump)
>
> No changes are needed kernel side. You just need an appropriate kernel and initrd for your purpose.
>
> All of the interesting architectures support kexec, and if an architecture doesn't it isn't hard to add. The architecture specific part is very simple. A pain to debug initially but very simple.

As much as I like kexec, it loses on memory footprint by about 100x. It's not appropriate for all use cases, especially things like consumer-grade wireless access points and phones.
-- http://selenic.com : development and support for Mercurial and Linux
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
I think in general the procedure should be: at startup or on an event (for example, an IP address acquired from DHCP), user applications write a log to flash (or better, persistent RAM) with a tag, a timestamp, or something similar. When there is a kernel panic, it is captured in a file stored together with the log, and when possible the system sends everything, for example via the network. Are there problems with this approach that I can't see? When David says "...so this looks much more like a real file than a sysctl file" I quite agree; it seems like a normal application/system log indeed.

Marco

Artem Bityutskiy wrote:
> On Fri, 2009-11-13 at 09:10 +0100, Simon Kagstrom wrote:
>> On Thu, 12 Nov 2009 16:56:49 -0500 David VomLehn dvoml...@cisco.com wrote:
>>> Good question. Some more detail on our application might help. In some situations, we may have no disk and only enough flash for the bootloader. The kernel is downloaded over the network. When we get to user space, we initialize a number of things dynamically. For example, we dynamically compute some MAC address, and most of the IP addresses are obtained with DHCP. These are very useful to have for panic analysis.
>>>
>>> Since there is neither flash nor disk, user space has no place to store this information, should the kernel panic. When we come back up, we will get different MAC and IP addresses. Storing them in memory is our only hope.
>>>
>>> Fortunately, there is a section of RAM that the bootloader promises not to overwrite. On a panic, we capture the messages written on the console and store them in the protected area. If the information from the /proc file is written as part of the panic, we will capture it, too.
>>
>> Can't you solve this completely from userspace using phram and mtdoops instead? I.e., setup two phram areas
>>
>>   modprobe phram 4...@start-of-your-area,4...@start-of-your-area+4k # Can't remember the exact syntax!
>>
>> you'll then get /dev/mtdX and /dev/mtdX+1 for these two. You can then do
>>
>>   modprobe mtdoops mtddev=/dev/mtdX+1 dump_oops=0
>>
>> to load mtdoops to catch the panic in the second area, and just write your userspace messages to /dev/mtdX.
>
> This might work for them, not sure, but not for us.
>
> We store panics on flash, and later they are automatically sent to the panic collection system via the network. And the complications are:
>
> 1. There may be many panics before the device has network access and has a chance to send the panics.
> 2. The user can re-flash the device with different SW in between.
>
> So we really need to print some user-space supplied information during the panic, and then we store it on flash with mtdoops, and later, when the device has network access, we send the whole bunch of oopses via the network.
>
>> One thing probably has to be fixed though: I don't think phram has a panic_write, which will be needed by mtdoops to catch the panic - this should be trivial to add though since it's plain RAM.
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Thu, 12 Nov 2009 16:56:49 -0500 David VomLehn dvoml...@cisco.com wrote:
> Good question. Some more detail on our application might help. In some situations, we may have no disk and only enough flash for the bootloader. The kernel is downloaded over the network. When we get to user space, we initialize a number of things dynamically. For example, we dynamically compute some MAC address, and most of the IP addresses are obtained with DHCP. These are very useful to have for panic analysis.
>
> Since there is neither flash nor disk, user space has no place to store this information, should the kernel panic. When we come back up, we will get different MAC and IP addresses. Storing them in memory is our only hope.
>
> Fortunately, there is a section of RAM that the bootloader promises not to overwrite. On a panic, we capture the messages written on the console and store them in the protected area. If the information from the /proc file is written as part of the panic, we will capture it, too.

Can't you solve this completely from userspace using phram and mtdoops instead? I.e., setup two phram areas

  modprobe phram 4...@start-of-your-area,4...@start-of-your-area+4k # Can't remember the exact syntax!

you'll then get /dev/mtdX and /dev/mtdX+1 for these two. You can then do

  modprobe mtdoops mtddev=/dev/mtdX+1 dump_oops=0

to load mtdoops to catch the panic in the second area, and just write your userspace messages to /dev/mtdX.

One thing probably has to be fixed though: I don't think phram has a panic_write, which will be needed by mtdoops to catch the panic - this should be trivial to add though since it's plain RAM.

// Simon
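As a rough illustration of the user-space half of the phram suggestion above, the following C sketch writes one annotation record to a device node. The function name `write_note` and the record format are hypothetical and do not come from the thread. It assumes the phram-backed MTD character device accepts plain writes at offset 0 because it is ordinary RAM; real NAND would additionally need erase and bad-block handling, which this sketch does not attempt.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write `note` at the start of
 * the node at `path`, e.g. a phram-backed /dev/mtdX.
 * Returns 0 on success, -1 on any error.
 */
int write_note(const char *path, const char *note)
{
    assert(path != NULL && note != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    size_t len = strlen(note);
    size_t done = 0;

    /* write() may be short or interrupted, so loop until all bytes are out. */
    while (done < len) {
        ssize_t n = write(fd, note + done, len - done);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* Best effort only: some device nodes do not support fsync(). */
    (void)fsync(fd);

    return close(fd);
}
```

A caller might run `write_note("/dev/mtd0", "build=1234\n")` once the dynamic values are known; the device number and the `build=` format are assumptions made for the example.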
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Thu, 2009-11-12 at 12:06 -0600, Matt Mackall wrote:
> On Wed, 2009-11-11 at 21:13 -0500, David VomLehn wrote:
>> Allows annotation of panics to include platform information. It's no big deal to collect information, but way helpful when you are collecting failure reports from an eventual base of millions of systems deployed in other people's homes.
>
> I'd like to hear a bit more use case motivation on this feature. Also, why do you want more than a page?

We also need this kind of functionality. The use case is very simple. Every time the kernel oopses, we save the oops information on the flash using the mtdoops module. There is even core support, which should be merged into 2.6.33, see this patch: http://git.infradead.org/users/dedekind/l2-mtd-2.6.git/commit/832c3d00e82f267316a2b53634631a1821eebae8 (and there was a corresponding discussion on lkml).

What we want is to dump information about the user-space environment into the oops at the same time. Specifically, we want to dump the SW build number. And we want this information to be printed at the same time, because we cannot run any user-space code at panic time.

This information is later read from the flash and sent via the network to the central place. And by the time it is sent, the user may have already re-flashed his device with something else.

So I very much appreciate this patch, although I think it should use the panic notifiers instead of calling a function directly from the panic.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Thu, 2009-11-12 at 23:09 +0100, ext David VomLehn wrote:
> On Thu, Nov 12, 2009 at 02:50:41PM -0500, Paul Gortmaker wrote:
>> David VomLehn wrote:
>>> Allows annotation of panics to include platform information. It's no big deal to collect information, but way helpful when you are collecting failure reports from an eventual base of millions of systems deployed in other people's homes.
>> ...
>> Why hook into panic() directly like this, vs. using the panic notifier list? If you use that, and then put the data handling magic that you need into your own kernel module that knows how to interface with the reporting apps that you have, you can do the whole thing without having to alter existing code, I think.
>
> I agree--a panic notifier list is probably a better approach.

That's what we currently use:

http://marc.info/?l=linux-kernel&m=125655380512117&w=2

Very simple and does the job. It requires rearranging the panic code a bit, though: namely, moving the notifier invocation before the call to kmsg_dump(). Dumpers were introduced by Simon Kagstrom here:

http://marc.info/?l=linux-kernel&m=125569530109871&w=2

Regards,
Atal

> David VL
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Fri, 2009-11-13 at 12:59 +0100, Simon Kagstrom wrote:
> (Also fix David Woodhouses address and add Atal)
>
> On Fri, 13 Nov 2009 13:45:48 +0200 Artem Bityutskiy dedeki...@gmail.com wrote:
>> So we really need to print some user-space supplied information during the panic, and then we store it on flash with mtdoops, and later, when the device has network access, we send the whole bunch of oopses via the network.
>
> Yes, I see that your case would have to be handled differently. A complication (which I believe was discussed before) is that kmsg_dump() is done before the panic notifiers are called. The reason I put it there is to have it before crash_kexec(), so I guess we'll have to take up the discussion on what to do with it.
>
> For me it now seems like it would be OK to move kmsg_dump() down below the panic notifiers. If you have a kdump kernel to load, then you will most likely not need the kmsg dumped data anyway.

Yeah, I think this is a separate issue which can be fixed separately.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
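The ordering problem discussed above can be modelled in plain user-space C. This is only a toy model under stated assumptions, not kernel code: `struct model`, `model_panic` and `dump_has_note` are hypothetical names, `dump_log` merely stands in for kmsg_dump(), and `note_notifier` stands in for a panic notifier that prints the user-space note into the log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LOG_SIZE 256

/* Toy model: `log` stands in for the kernel log, `dump` for what was saved. */
struct model {
    char log[LOG_SIZE];
    char dump[LOG_SIZE];
};

/* Append text to the model log, truncating if it would overflow. */
static void log_append(struct model *m, const char *s)
{
    size_t used = strlen(m->log);
    snprintf(m->log + used, LOG_SIZE - used, "%s", s);
}

/* Stand-in for a panic notifier that prints the user-space note. */
static void note_notifier(struct model *m, const char *note)
{
    log_append(m, note);
}

/* Stand-in for kmsg_dump(): snapshot the log exactly as it is right now. */
static void dump_log(struct model *m)
{
    memcpy(m->dump, m->log, LOG_SIZE);
}

/* Run a modelled panic; dump_first selects which of the two steps runs first. */
void model_panic(struct model *m, const char *note, int dump_first)
{
    assert(m != NULL && note != NULL);
    memset(m, 0, sizeof(*m));
    log_append(m, "Kernel panic - not syncing: test\n");
    if (dump_first) {
        dump_log(m);            /* snapshot taken before the note exists */
        note_notifier(m, note);
    } else {
        note_notifier(m, note);
        dump_log(m);            /* snapshot includes the note */
    }
}

/* Returns 1 if the saved dump contains the note, 0 otherwise. */
int dump_has_note(const struct model *m, const char *note)
{
    return strstr(m->dump, note) != NULL;
}
```

In this model the saved dump contains the note only when the notifier runs before the snapshot, which is the reason given in the thread for moving kmsg_dump() below the panic notifiers.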
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
Honestly, I don't understand why we should involve the kernel in gathering this kind of information when we can use other (user-space) tools. Is it only to have everything in a single report? I think that is a rather weak reason to include this additional behavior in the kernel.

David VomLehn wrote:
> Allows annotation of panics to include platform information. It's no big deal to collect information, but way helpful when you are collecting failure reports from an eventual base of millions of systems deployed in other people's homes.

Marco
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Wed, 2009-11-11 at 21:13 -0500, David VomLehn wrote:

Allows annotation of panics to include platform information. It's no big deal to collect information, but way helpful when you are collecting failure reports from an eventual base of millions of systems deployed in other people's homes.

I'd like to hear a bit more use case motivation on this feature. Also, why do you want more than a page?

--
http://selenic.com : development and support for Mercurial and Linux
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
David VomLehn wrote:

Allows annotation of panics to include platform information. It's no big deal to collect information, but way helpful when you are collecting failure reports from an eventual base of millions of systems deployed in other people's homes.

One of the biggest reasons this is an RFC is that I'm uncomfortable with putting the pseudo-file that holds the annotation information in /proc. Different layers of the software stack may drop dynamic information, such as DHCP-supplied IP addresses, in here as they come up. This means it's necessary to be able to append to the end of the annotation, so this looks much more like a real file than a sysctl file. It also has multiple lines, which doesn't look like a sysctl file.

Annotation can be viewed as a debug thing, so maybe it belongs in debugfs, but people seem to be doing somewhat different things with that filesystem.

So, suggestions on this issue, and any others are most welcome. If there is a better way to do this, I'll be happy to use it.

Signed-off-by: David VomLehn dvoml...@cisco.com
---
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -70,6 +70,7 @@ NORET_TYPE void panic(const char * fmt, ...)
 	vsnprintf(buf, sizeof(buf), fmt, args);
 	va_end(args);
 	printk(KERN_EMERG "Kernel panic - not syncing: %s\n",buf);
+	panic_note_print();
 #ifdef CONFIG_DEBUG_BUGVERBOSE
 	dump_stack();
 #endif

Why hook into panic() directly like this, vs. using the panic notifier list? If you use that, and then put the data handling magic that you need into your own kernel module that knows how to interface with the reporting apps that you have, you can do the whole thing without having to alter existing code, I think.

Paul.

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 30df586..bade7a1 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1045,6 +1045,14 @@ config DMA_API_DEBUG
 	  This option causes a performance degredation. Use only if you want
 	  to debug device drivers. If unsure, say N.
 
+config PANIC_NOTE
+	bool "Create file for user space data to be reported at panic time"
+	default n
+	help
+	  This creates a pseudo-file, named /proc/panic_note, into which
+	  user space data can be written. If a panic occurs, the contents
+	  of the file will be included in the failure report.
+
 source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
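For readers following the thread, here is a rough user-space sketch of how one layer of the software stack might append a line to the annotation file as it comes up. The only name taken from the patch is the path /proc/panic_note; the helper panic_note_append(), the NOTE_LINE_MAX limit and the key=value line format are made up for illustration, and whether the pseudo-file actually honours O_APPEND depends on the parts of the patch that are not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-line limit; the patch does not define one. */
#define NOTE_LINE_MAX 256

/*
 * Append one line of annotation text to the note file at 'path'
 * (for example /proc/panic_note, assuming the patch is applied).
 * Returns 0 on success, -1 on any error.
 */
int panic_note_append(const char *path, const char *text)
{
	char line[NOTE_LINE_MAX];
	ssize_t written;
	int len, fd;

	assert(path != NULL && text != NULL);

	/* Build the whole line first so it goes out in a single write(). */
	len = snprintf(line, sizeof(line), "%s\n", text);
	if (len < 0 || (size_t)len >= sizeof(line))
		return -1;	/* text does not fit in one note line */

	/* No O_CREAT: the file is expected to exist already. */
	fd = open(path, O_WRONLY | O_APPEND);
	if (fd < 0)
		return -1;

	written = write(fd, line, (size_t)len);
	close(fd);

	return written == (ssize_t)len ? 0 : -1;
}
```

A DHCP client hook could then call something like panic_note_append("/proc/panic_note", "ip=192.0.2.10") once it has a lease, and later layers would add their own lines in the same way.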
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Thu, Nov 12, 2009 at 01:06:51PM -0500, Matt Mackall wrote:

On Wed, 2009-11-11 at 21:13 -0500, David VomLehn wrote:

Allows annotation of panics to include platform information. It's no big deal to collect information, but way helpful when you are collecting failure reports from an eventual base of millions of systems deployed in other people's homes.

I'd like to hear a bit more use case motivation on this feature. Also, why do you want more than a page?

Hopefully, I have addressed the first question in my previous email. As for the second, I doubt there is a need for more than a page. I just picked a value to start developing with. This is still a work in progress...

http://selenic.com : development and support for Mercurial and Linux

David VL
Re: [PATCH, RFC] panic-note: Annotation from user space for panics
On Thu, Nov 12, 2009 at 02:50:41PM -0500, Paul Gortmaker wrote:

David VomLehn wrote:

Allows annotation of panics to include platform information. It's no big deal to collect information, but way helpful when you are collecting failure reports from an eventual base of millions of systems deployed in other people's homes.

...

Why hook into panic() directly like this, vs. using the panic notifier list? If you use that, and then put the data handling magic that you need into your own kernel module that knows how to interface with the reporting apps that you have, you can do the whole thing without having to alter existing code, I think.

I agree--a panic notifier list is probably a better approach.

David VL
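To make the notifier suggestion concrete, below is a small user-space model of the idea, not kernel code. It imitates a priority-ordered notifier chain so that the note gets added to the failure report by a registered callback and the panic path itself never mentions the note. In the kernel, the corresponding pieces would be a struct notifier_block registered on panic_notifier_list with atomic_notifier_chain_register(); every name in the sketch (struct notifier, chain_register(), model_panic(), panic_note_nb and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct notifier_block. */
struct notifier {
	void (*call)(struct notifier *self, char *report, size_t size);
	struct notifier *next;
	int priority;			/* higher priority runs first */
};

static struct notifier *chain_head;

/* Insert 'n' so the chain stays sorted by descending priority. */
void chain_register(struct notifier *n)
{
	struct notifier **pos = &chain_head;

	assert(n != NULL && n->call != NULL);
	while (*pos && (*pos)->priority >= n->priority)
		pos = &(*pos)->next;
	n->next = *pos;
	*pos = n;
}

/* Append 's' to the NUL-terminated report, truncating if needed. */
static void report_append(char *report, size_t size, const char *s)
{
	size_t used = strlen(report);

	if (used < size)
		snprintf(report + used, size - used, "%s", s);
}

/* Annotation previously supplied by user space (example value). */
static char panic_note[128] = "ip=192.0.2.10\n";

/* Callback that adds the stored note to the failure report. */
void panic_note_notify(struct notifier *self, char *report, size_t size)
{
	(void)self;
	report_append(report, size, "panic note: ");
	report_append(report, size, panic_note);
}

struct notifier panic_note_nb = {
	.call = panic_note_notify,
	.priority = 0,
};

/* Model of the panic path: print the message, then run the chain. */
void model_panic(const char *msg, char *report, size_t size)
{
	struct notifier *n;

	snprintf(report, size, "Kernel panic - not syncing: %s\n", msg);
	for (n = chain_head; n; n = n->next)
		n->call(n, report, size);
}
```

The point of the design is that model_panic() knows nothing about panic notes: registering panic_note_nb from a separate module is enough, which is what makes the change possible without altering existing code.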