RE: [PATCH v2] Add the values related to buddy system for filtering free pages.
Thanks, that's good news, and thanks for the commit ID; that was the thing I was having trouble finding.

-----Original Message-----
From: Atsushi Kumagai [mailto:kumagai-atsu...@mxc.nes.nec.co.jp]
Sent: Thursday, February 07, 2013 7:45 PM
To: Mitchell, Lisa (MCLinux in Fort Collins)
Cc: vgo...@redhat.com; ke...@lists.infradead.org; linux-kernel@vger.kernel.org; linux...@kvack.org; d.hatay...@jp.fujitsu.com; ebied...@xmission.com; a...@linux-foundation.org; c...@sgi.com
Subject: Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

Hello Lisa,

On Thu, 07 Feb 2013 05:29:11 -0700
Lisa Mitchell wrote:

> > > > Also, I have one question. Can we always think of 1st and 2nd
> > > > kernels are same?
> > >
> > > Not at all.  Distros frequently implement it with the same kernel
> > > in both role but it should be possible to use an old crusty stable
> > > kernel as the 2nd kernel.
> > >
> > > > If I understand correctly, kexec/kdump can use the 2nd kernel
> > > > different from the 1st's. So, different kernels need to do the
> > > > same thing as makedumpfile does. If assuming two are same, problem is
> > > > much simplified.
> > >
> > > As a developer it becomes attractive to use a known stable kernel
> > > to capture the crash dump even as I experiment with a brand new kernel.
> >
> > To allow to use the 2nd kernel different from the 1st's, I think we
> > have to take care of each kernel version with the logic included in
> > makedumpfile for them. That's to say, makedumpfile goes on as before.
> >
> >
> > Thanks
> > Atsushi Kumagai
>
>
> Atsushi and Vivek:
>
> I'm trying to get the status of whether the patch submitted in
> https://lkml.org/lkml/2012/11/21/90 is going to be accepted upstream
> and get in some version of the Linux 3.8 kernel.  I'm replying to the
> last email thread above on kexec_lists and lkml.org that I could find
> about this patch.
>
> I was counting on this kernel patch to improve performance of
> makedumpfile v1.5.1, so at least it wouldn't be a regression in
> performance over makedumpfile v1.4.  It was listed as recommended in
> the makedumpfile v1.5.1 release posting:
> http://lists.infradead.org/pipermail/kexec/2012-December/007460.html
>
>
> All the conversations in the thread since this patch was committed
> seem to voice some reservations now, and reference other fixes being
> tried to improve performance.
>
> Does that mean you are abandoning getting this patch accepted
> upstream, in favor of pursuing other alternatives?

No, this patch has been merged into -next, we should just wait for it to be merged into linus tree.

  http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=0c63e90dd1c7b35ae2ea9475ba67cf68d8801a26

What interests us now is improvement for interfaces of /proc/vmcore, it's not alternative but another idea which can be consistent with this patch.


Thanks
Atsushi Kumagai

>
> I had hoped this patch would be okay to get accepted upstream, and
> then other improvements could be built on top of it.
>
> Is that not the case?
>
> Or has further review concluded now that this change is a bad idea due
> to adding dependence of this new makedumpfile feature on some deep
> kernel memory internals?
>
> Thanks,
>
> Lisa Mitchell
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
Hello Lisa,

On Thu, 07 Feb 2013 05:29:11 -0700
Lisa Mitchell wrote:

> > > > Also, I have one question. Can we always think of 1st and 2nd kernels
> > > > are same?
> > >
> > > Not at all.  Distros frequently implement it with the same kernel in
> > > both role but it should be possible to use an old crusty stable kernel
> > > as the 2nd kernel.
> > >
> > > > If I understand correctly, kexec/kdump can use the 2nd kernel different
> > > > from the 1st's. So, different kernels need to do the same thing as
> > > > makedumpfile
> > > > does. If assuming two are same, problem is much simplified.
> > >
> > > As a developer it becomes attractive to use a known stable kernel to
> > > capture the crash dump even as I experiment with a brand new kernel.
> >
> > To allow to use the 2nd kernel different from the 1st's, I think we have
> > to take care of each kernel version with the logic included in makedumpfile
> > for them. That's to say, makedumpfile goes on as before.
> >
> >
> > Thanks
> > Atsushi Kumagai
>
>
> Atsushi and Vivek:
>
> I'm trying to get the status of whether the patch submitted in
> https://lkml.org/lkml/2012/11/21/90 is going to be accepted upstream
> and get in some version of the Linux 3.8 kernel.  I'm replying to the
> last email thread above on kexec_lists and lkml.org that I could find
> about this patch.
>
> I was counting on this kernel patch to improve performance of
> makedumpfile v1.5.1, so at least it wouldn't be a regression in
> performance over makedumpfile v1.4.  It was listed as recommended in
> the makedumpfile v1.5.1 release posting:
> http://lists.infradead.org/pipermail/kexec/2012-December/007460.html
>
>
> All the conversations in the thread since this patch was committed seem
> to voice some reservations now, and reference other fixes being tried to
> improve performance.
>
> Does that mean you are abandoning getting this patch accepted upstream,
> in favor of pursuing other alternatives?

No, this patch has been merged into -next, we should just wait for it to be merged into linus tree.

  http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=0c63e90dd1c7b35ae2ea9475ba67cf68d8801a26

What interests us now is improvement for interfaces of /proc/vmcore, it's not alternative but another idea which can be consistent with this patch.


Thanks
Atsushi Kumagai

>
> I had hoped this patch would be okay to get accepted upstream, and then
> other improvements could be built on top of it.
>
> Is that not the case?
>
> Or has further review concluded now that this change is a bad idea due
> to adding dependence of this new makedumpfile feature on some deep
> kernel memory internals?
>
> Thanks,
>
> Lisa Mitchell
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
On Thu, 2012-12-27 at 08:35 +, Atsushi Kumagai wrote:
> Hello,
>
> On Thu, 20 Dec 2012 18:00:11 -0800
> ebied...@xmission.com (Eric W. Biederman) wrote:
>
> > "Hatayama, Daisuke" writes:
> >
> > >> From: kexec-boun...@lists.infradead.org
> > >> [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
> > >> Sent: Thursday, December 20, 2012 11:21 AM
> > >
> > >> On Wed, 19 Dec 2012 16:18:56 -0800
> > >> Andrew Morton wrote:
> > >>
> > >> > On Mon, 10 Dec 2012 10:39:13 +0900
> > >> > Atsushi Kumagai wrote:
> > >> >
> > >> > We might change the PageBuddy() implementation at any time, and
> > >> > makedumpfile will break.  Or in this case, become less efficient.
> > >> >
> > >> > Is there any way in which we can move some of this logic into the
> > >> > kernel?  In this case, add some kernel code which uses PageBuddy() on
> > >> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
> > >> > in userspace?
> > >>
> > >> In last month, Cliff Wickman proposed such idea:
> > >>
> > >>   [PATCH v2] makedumpfile: request the kernel do page scans
> > >>   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
> > >>
> > >>   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
> > >>   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
> > >>
> > >> In his idea, the kernel does page scans to distinguish unnecessary pages
> > >> (free pages and others) and returns the list of PFN's which should be
> > >> excluded for makedumpfile.
> > >> As a result, makedumpfile doesn't need to consider internal kernel
> > >> behavior.
> > >>
> > >> I think it's a good idea from the viewpoint of maintainability and
> > >> performance.
> > >
> > > I also think wide part of his code can be reused in this work. But the bad
> > > performance is caused by a lot of ioremap, not a lot of copying. See my
> > > profiling result I posted some days ago. Two issues, ioremap one and
> > > filtering
> > > maintainability, should be considered separately. Even on ioremap issue,
> > > there is secondary one to consider in memory consumption on the 2nd
> > > kernel.
> >
> > Thanks.  I was wondering why moving the code into /proc/vmcore would
> > make things faster.
>
> Thanks HATAYAMA-san, I've understood the issues correctly.
> We should continue improving the ioremap issue as Cliff and HATAYAMA-san
> are doing now.
>
> > > Also, I have one question. Can we always think of 1st and 2nd kernels
> > > are same?
> >
> > Not at all.  Distros frequently implement it with the same kernel in
> > both role but it should be possible to use an old crusty stable kernel
> > as the 2nd kernel.
> >
> > > If I understand correctly, kexec/kdump can use the 2nd kernel different
> > > from the 1st's. So, different kernels need to do the same thing as
> > > makedumpfile
> > > does. If assuming two are same, problem is much simplified.
> >
> > As a developer it becomes attractive to use a known stable kernel to
> > capture the crash dump even as I experiment with a brand new kernel.
>
> To allow to use the 2nd kernel different from the 1st's, I think we have
> to take care of each kernel version with the logic included in makedumpfile
> for them. That's to say, makedumpfile goes on as before.
>
>
> Thanks
> Atsushi Kumagai


Atsushi and Vivek:

I'm trying to get the status of whether the patch submitted in
https://lkml.org/lkml/2012/11/21/90 is going to be accepted upstream
and get in some version of the Linux 3.8 kernel.  I'm replying to the
last email thread above on kexec_lists and lkml.org that I could find
about this patch.

I was counting on this kernel patch to improve performance of
makedumpfile v1.5.1, so at least it wouldn't be a regression in
performance over makedumpfile v1.4.  It was listed as recommended in
the makedumpfile v1.5.1 release posting:
http://lists.infradead.org/pipermail/kexec/2012-December/007460.html

All the conversations in the thread since this patch was committed seem
to voice some reservations now, and reference other fixes being tried to
improve performance.

Does that mean you are abandoning getting this patch accepted upstream,
in favor of pursuing other alternatives?

I had hoped this patch would be okay to get accepted upstream, and then
other improvements could be built on top of it.

Is that not the case?

Or has further review concluded now that this change is a bad idea due
to adding dependence of this new makedumpfile feature on some deep
kernel memory internals?

Thanks,

Lisa Mitchell
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
Hello,

On Thu, 20 Dec 2012 18:00:11 -0800
ebied...@xmission.com (Eric W. Biederman) wrote:

> "Hatayama, Daisuke" writes:
>
> >> From: kexec-boun...@lists.infradead.org
> >> [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
> >> Sent: Thursday, December 20, 2012 11:21 AM
> >
> >> On Wed, 19 Dec 2012 16:18:56 -0800
> >> Andrew Morton wrote:
> >>
> >> > On Mon, 10 Dec 2012 10:39:13 +0900
> >> > Atsushi Kumagai wrote:
> >> >
> >> > We might change the PageBuddy() implementation at any time, and
> >> > makedumpfile will break.  Or in this case, become less efficient.
> >> >
> >> > Is there any way in which we can move some of this logic into the
> >> > kernel?  In this case, add some kernel code which uses PageBuddy() on
> >> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
> >> > in userspace?
> >>
> >> In last month, Cliff Wickman proposed such idea:
> >>
> >>   [PATCH v2] makedumpfile: request the kernel do page scans
> >>   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
> >>
> >>   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
> >>   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
> >>
> >> In his idea, the kernel does page scans to distinguish unnecessary pages
> >> (free pages and others) and returns the list of PFN's which should be
> >> excluded for makedumpfile.
> >> As a result, makedumpfile doesn't need to consider internal kernel
> >> behavior.
> >>
> >> I think it's a good idea from the viewpoint of maintainability and
> >> performance.
> >
> > I also think wide part of his code can be reused in this work. But the bad
> > performance is caused by a lot of ioremap, not a lot of copying. See my
> > profiling result I posted some days ago. Two issues, ioremap one and
> > filtering
> > maintainability, should be considered separately. Even on ioremap issue,
> > there is secondary one to consider in memory consumption on the 2nd
> > kernel.
>
> Thanks.  I was wondering why moving the code into /proc/vmcore would
> make things faster.

Thanks HATAYAMA-san, I've understood the issues correctly.
We should continue improving the ioremap issue as Cliff and HATAYAMA-san
are doing now.

> > Also, I have one question. Can we always think of 1st and 2nd kernels
> > are same?
>
> Not at all.  Distros frequently implement it with the same kernel in
> both role but it should be possible to use an old crusty stable kernel
> as the 2nd kernel.
>
> > If I understand correctly, kexec/kdump can use the 2nd kernel different
> > from the 1st's. So, different kernels need to do the same thing as
> > makedumpfile
> > does. If assuming two are same, problem is much simplified.
>
> As a developer it becomes attractive to use a known stable kernel to
> capture the crash dump even as I experiment with a brand new kernel.

To allow to use the 2nd kernel different from the 1st's, I think we have
to take care of each kernel version with the logic included in makedumpfile
for them. That's to say, makedumpfile goes on as before.


Thanks
Atsushi Kumagai
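[Editor's illustration] The idea attributed to Cliff Wickman above, where the kernel scans pages and hands back the PFNs to exclude, can be sketched from the consumer's side. This is only a sketch in Python for brevity: the (start, end) half-open range format, the function name apply_exclusions, and the one-byte-per-page bitmap are all assumptions made for illustration, not the interface proposed in his patches.

```python
from typing import Iterable, Tuple


def apply_exclusions(total_pfns: int,
                     excluded_ranges: Iterable[Tuple[int, int]]) -> bytearray:
    """Return a per-PFN map: 1 means "write this page", 0 means "skip it".

    excluded_ranges holds hypothetical half-open (start_pfn, end_pfn) pairs,
    standing in for whatever list the kernel side would return.
    """
    dumpable = bytearray([1]) * total_pfns
    for start, end in excluded_ranges:
        # Clamp to the valid PFN space so a stale or oversized range
        # cannot index outside the map.
        first = max(start, 0)
        last = min(end, total_pfns)
        for pfn in range(first, last):
            dumpable[pfn] = 0
    return dumpable


if __name__ == "__main__":
    page_map = apply_exclusions(8, [(2, 4), (6, 10)])
    print(list(page_map))
```

With this split the dump tool never inspects struct page itself, which is the maintainability argument made in the thread; it does nothing by itself about the ioremap cost that HATAYAMA-san identifies as the performance problem.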
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
"Hatayama, Daisuke" writes:

>> From: kexec-boun...@lists.infradead.org
>> [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
>> Sent: Thursday, December 20, 2012 11:21 AM
>
>> On Wed, 19 Dec 2012 16:18:56 -0800
>> Andrew Morton wrote:
>>
>> > On Mon, 10 Dec 2012 10:39:13 +0900
>> > Atsushi Kumagai wrote:
>> >
>> > We might change the PageBuddy() implementation at any time, and
>> > makedumpfile will break.  Or in this case, become less efficient.
>> >
>> > Is there any way in which we can move some of this logic into the
>> > kernel?  In this case, add some kernel code which uses PageBuddy() on
>> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
>> > in userspace?
>>
>> In last month, Cliff Wickman proposed such idea:
>>
>>   [PATCH v2] makedumpfile: request the kernel do page scans
>>   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
>>
>>   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
>>   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
>>
>> In his idea, the kernel does page scans to distinguish unnecessary pages
>> (free pages and others) and returns the list of PFN's which should be
>> excluded for makedumpfile.
>> As a result, makedumpfile doesn't need to consider internal kernel
>> behavior.
>>
>> I think it's a good idea from the viewpoint of maintainability and
>> performance.
>
> I also think wide part of his code can be reused in this work. But the bad
> performance is caused by a lot of ioremap, not a lot of copying. See my
> profiling result I posted some days ago. Two issues, ioremap one and filtering
> maintainability, should be considered separately. Even on ioremap issue,
> there is secondary one to consider in memory consumption on the 2nd
> kernel.

Thanks.  I was wondering why moving the code into /proc/vmcore would
make things faster.

> Also, I have one question. Can we always think of 1st and 2nd kernels
> are same?

Not at all.  Distros frequently implement it with the same kernel in
both role but it should be possible to use an old crusty stable kernel
as the 2nd kernel.

> If I understand correctly, kexec/kdump can use the 2nd kernel different
> from the 1st's. So, different kernels need to do the same thing as
> makedumpfile
> does. If assuming two are same, problem is much simplified.

As a developer it becomes attractive to use a known stable kernel to
capture the crash dump even as I experiment with a brand new kernel.

Eric
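[Editor's illustration] The userspace replication of PageBuddy() that Andrew Morton is concerned about can be sketched roughly as follows, in Python for brevity. The sketch assumes the page descriptors have already been decoded from the dump into (pfn, _mapcount, private) records, that the first page of a free buddy block is marked by _mapcount being equal to an exported PAGE_BUDDY_MAPCOUNT_VALUE, and that private then holds the block order. The PageDesc type, the function name free_pfns, and the value -128 in the usage example are illustrative assumptions, not makedumpfile's actual code.

```python
from typing import Iterable, NamedTuple, Set


class PageDesc(NamedTuple):
    """Hypothetical decoded view of one struct page from the crashed kernel."""
    pfn: int
    mapcount: int   # value of page._mapcount
    private: int    # value of page.private (buddy order on a free block head)


def free_pfns(pages: Iterable[PageDesc], buddy_mapcount_value: int) -> Set[int]:
    """Collect the PFNs covered by free buddy blocks.

    buddy_mapcount_value is assumed to come from the crashed kernel's
    exported values rather than being hard-coded in the tool.
    """
    excluded: Set[int] = set()
    for page in pages:
        if page.mapcount == buddy_mapcount_value:
            # A free block of order n spans 2**n consecutive pages,
            # starting at the page that carries the marker.
            excluded.update(range(page.pfn, page.pfn + (1 << page.private)))
    return excluded


if __name__ == "__main__":
    sample = [PageDesc(0, -128, 2), PageDesc(4, 0, 0), PageDesc(8, -128, 0)]
    print(sorted(free_pfns(sample, buddy_mapcount_value=-128)))
```

The comparison against a value supplied by the crashed kernel is the part that breaks, or silently stops matching, if the kernel changes how PageBuddy() is implemented, which is the fragility raised in the quoted discussion.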
RE: [PATCH v2] Add the values related to buddy system for filtering free pages.
> From: kexec-boun...@lists.infradead.org
> [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
> Sent: Thursday, December 20, 2012 11:21 AM

> On Wed, 19 Dec 2012 16:18:56 -0800
> Andrew Morton wrote:
>
> > On Mon, 10 Dec 2012 10:39:13 +0900
> > Atsushi Kumagai wrote:
> >
> > We might change the PageBuddy() implementation at any time, and
> > makedumpfile will break. Or in this case, become less efficient.
> >
> > Is there any way in which we can move some of this logic into the
> > kernel? In this case, add some kernel code which uses PageBuddy() on
> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
> > in userspace?
>
> Last month, Cliff Wickman proposed such an idea:
>
>   [PATCH v2] makedumpfile: request the kernel do page scans
>   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
>
>   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
>   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
>
> In his idea, the kernel does page scans to distinguish unnecessary pages
> (free pages and others) and returns the list of PFN's which should be
> excluded for makedumpfile.
> As a result, makedumpfile doesn't need to consider internal kernel
> behavior.
>
> I think it's a good idea from the viewpoint of maintainability and
> performance.

I also think a large part of his code can be reused in this work. But the
bad performance is caused by a lot of ioremap calls, not by a lot of
copying. See the profiling result I posted some days ago. The two issues,
ioremap and filtering maintainability, should be considered separately.
Even on the ioremap issue, there is a secondary one to consider: memory
consumption on the 2nd kernel.

Also, I have one question. Can we always assume that the 1st and 2nd
kernels are the same?

If I understand correctly, kexec/kdump can use a 2nd kernel different
from the 1st. So, different kernels need to do the same thing as
makedumpfile does. If we assume the two are the same, the problem is much
simplified.

Thanks.
HATAYAMA, Daisuke
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
Hello Andrew,

On Wed, 19 Dec 2012 16:18:56 -0800
Andrew Morton wrote:

> On Mon, 10 Dec 2012 10:39:13 +0900
> Atsushi Kumagai wrote:
>
> > This patch adds the values related to buddy system to vmcoreinfo data
> > so that makedumpfile (dump filtering command) can filter out all free
> > pages with the new logic.
> > It's faster than the current logic because it can distinguish free page
> > by analyzing page structure at the same time as filtering for other
> > unnecessary pages (e.g. anonymous page).
> > OTOH, the current logic has to trace free_list to distinguish free
> > pages while analyzing page structure to filter out other unnecessary
> > pages.
> >
> > The new logic uses the fact that buddy page is marked by _mapcount ==
> > PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
> > fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
> > is set or not before looking up _mapcount value.
> > And we can get the order of buddy system from private field.
> > To sum it up, the values below are required for this logic.
> >
> > Required values:
> >   - OFFSET(page._mapcount)
> >   - OFFSET(page.private)
> >   - NUMBER(PG_slab)
> >   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> >
> > Changelog from v1 to v2:
> >   1. remove SIZE(pageflags)
> >     The new logic was changed after I sent v1 patch.
> >     Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.
> >
> > What's makedumpfile:
> >   makedumpfile creates a small dumpfile by excluding unnecessary pages
> >   for the analysis. To distinguish unnecessary pages, makedumpfile gets
> >   the vmcoreinfo data which has the minimum debugging information only
> >   for dump filtering.
>
> Gee, this info is getting highly dependent upon deep internal kernel
> behaviour.

Yes. makedumpfile has to be changed depending on the kernel version, and
we have been doing that.

> > index 5e4bd78..b27efe4 100644
> > --- a/kernel/kexec.c
> > +++ b/kernel/kexec.c
> > @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
> >  	VMCOREINFO_OFFSET(page, _count);
> >  	VMCOREINFO_OFFSET(page, mapping);
> >  	VMCOREINFO_OFFSET(page, lru);
> > +	VMCOREINFO_OFFSET(page, _mapcount);
> > +	VMCOREINFO_OFFSET(page, private);
> >  	VMCOREINFO_OFFSET(pglist_data, node_zones);
> >  	VMCOREINFO_OFFSET(pglist_data, nr_zones);
> >  #ifdef CONFIG_FLAT_NODE_MEM_MAP
> > @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
> >  	VMCOREINFO_NUMBER(PG_lru);
> >  	VMCOREINFO_NUMBER(PG_private);
> >  	VMCOREINFO_NUMBER(PG_swapcache);
> > +	VMCOREINFO_NUMBER(PG_slab);
> > +	VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
>
> We might change the PageBuddy() implementation at any time, and
> makedumpfile will break. Or in this case, become less efficient.
>
> Is there any way in which we can move some of this logic into the
> kernel? In this case, add some kernel code which uses PageBuddy() on
> behalf of makedumpfile, rather than replicating the PageBuddy() logic
> in userspace?

Last month, Cliff Wickman proposed such an idea:

  [PATCH v2] makedumpfile: request the kernel do page scans
  http://lists.infradead.org/pipermail/kexec/2012-November/007318.html

  [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
  http://lists.infradead.org/pipermail/kexec/2012-November/007319.html

In his idea, the kernel does page scans to distinguish unnecessary pages
(free pages and others) and returns the list of PFNs which should be
excluded for makedumpfile.
As a result, makedumpfile doesn't need to consider internal kernel
behavior.

I think it's a good idea from the viewpoint of maintainability and
performance.

Thanks
Atsushi Kumagai
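For readers following the thread, the check described in the patch (test
PG_slab first, then compare _mapcount, then read the order from private)
can be sketched in userspace C roughly as below. This is only a minimal
illustration under stated assumptions, not makedumpfile's actual code:
struct buddy_info, its field names and free_buddy_order() are hypothetical,
OFFSET(page.flags) is assumed to be available alongside the four values
listed in the patch, and the dump is assumed to share the host's word size
and endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values a filter would read from vmcoreinfo (hypothetical container). */
struct buddy_info {
    size_t   flags_offset;     /* OFFSET(page.flags), assumed available */
    size_t   mapcount_offset;  /* OFFSET(page._mapcount) */
    size_t   private_offset;   /* OFFSET(page.private) */
    unsigned pg_slab_bit;      /* NUMBER(PG_slab) */
    int32_t  buddy_mapcount;   /* NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) */
};

/*
 * Given the raw bytes of one struct page copied out of the dump, return
 * the buddy order (>= 0) if the page is marked as a free buddy page, or
 * -1 if it is not.
 */
static long free_buddy_order(const unsigned char *page, size_t page_size,
                             const struct buddy_info *info)
{
    unsigned long flags, private_val;
    int32_t mapcount;

    assert(info->flags_offset + sizeof(flags) <= page_size);
    assert(info->mapcount_offset + sizeof(mapcount) <= page_size);
    assert(info->private_offset + sizeof(private_val) <= page_size);

    /* _mapcount overlaps SLAB/SLUB fields, so skip slab pages first. */
    memcpy(&flags, page + info->flags_offset, sizeof(flags));
    if (flags & (1UL << info->pg_slab_bit))
        return -1;

    memcpy(&mapcount, page + info->mapcount_offset, sizeof(mapcount));
    if (mapcount != info->buddy_mapcount)
        return -1;

    /* For a buddy page, the private field holds the order of the block. */
    memcpy(&private_val, page + info->private_offset, sizeof(private_val));
    return (long)private_val;
}
```

A caller would fill struct buddy_info once from vmcoreinfo and then call
free_buddy_order() for each struct page while scanning the memory map, so
free pages are found in the same pass as the other excludable pages.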
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
Andrew Morton writes:

> On Wed, 19 Dec 2012 16:57:03 -0800
> ebied...@xmission.com (Eric W. Biederman) wrote:
>
>> Andrew Morton writes:
>>
>> > Is there any way in which we can move some of this logic into the
>> > kernel? In this case, add some kernel code which uses PageBuddy() on
>> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
>> > in userspace?
>>
>> All that exists when makedumpfile runs is a core file. So it would have
>> to be something like a shared library that builds with the kernel and
>> then makedumpfile loads.
>
> Can we omit free pages from that core file?
>
> And/or add a section to that core file which flags free pages?

Omitting pages is what makedumpfile does.

Very loosely: shortly after boot, when things are running fine,
/sbin/kexec runs. /sbin/kexec constructs a set of ELF headers that
describe where the memory is, and loads the crashdump kernel, an initrd,
and those ELF headers into memory.

Years later the running kernel calls panic. panic calls machine_kexec,
and machine_kexec jumps to the preloaded crashdump kernel.

I think it is /proc/vmcore that reads the ELF headers out of memory and
presents them to userspace.

Then we have options. vmcore-to-dmesg will just read the dmesg ring
buffer, so we have that. makedumpfile reads the kernel data structures
and filters out the free pages for people who don't want to write
everything to disk.

So the basic interface is strongly kernel version agnostic. The challenge
is how to filter out undesirable pages from the core dump quickly and
reliably. Right now what we have are a set of ELF notes that describe
struct page.

For my uses I have either had enough disk space that saving everything
didn't matter, or so little disk space that all I could afford was getting
out the dmesg ring buffer. So I don't know how robust the solution adopted
by makedumpfile is.

Eric
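As a rough illustration of the "ELF notes that describe struct page"
mentioned above: the vmcoreinfo note carries plain "KEY=value" lines, and a
filter looks up the entries it needs by name. The sketch below assumes that
newline-separated text layout; vmcoreinfo_lookup() is a hypothetical helper
written for this example, not a function from makedumpfile or the kernel.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "key=value" in a newline-separated vmcoreinfo-style buffer.
 * Returns 1 and stores the decimal value on success, 0 if the key is
 * not present.
 */
static int vmcoreinfo_lookup(const char *text, const char *key, long *value)
{
    size_t key_len;
    const char *line = text;

    assert(text != NULL && key != NULL && value != NULL);
    key_len = strlen(key);

    while (line != NULL && *line != '\0') {
        /* Match only at the start of a line, followed directly by '='. */
        if (strncmp(line, key, key_len) == 0 && line[key_len] == '=') {
            *value = strtol(line + key_len + 1, NULL, 10);
            return 1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;  /* step past the newline to the next entry */
    }
    return 0;
}
```

A missing key is what an older 1st kernel would look like to the tool: if
NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) is absent, the filter has to fall back to
some other way of finding free pages.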
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
On Wed, 19 Dec 2012 16:57:03 -0800
ebied...@xmission.com (Eric W. Biederman) wrote:

> Andrew Morton writes:
>
> > Is there any way in which we can move some of this logic into the
> > kernel? In this case, add some kernel code which uses PageBuddy() on
> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
> > in userspace?
>
> All that exists when makedumpfile runs is a core file. So it would have
> to be something like a shared library that builds with the kernel and
> then makedumpfile loads.

Can we omit free pages from that core file?

And/or add a section to that core file which flags free pages?
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
Andrew Morton writes:

> On Mon, 10 Dec 2012 10:39:13 +0900
> Atsushi Kumagai wrote:
>
>> This patch adds the values related to buddy system to vmcoreinfo data
>> so that makedumpfile (dump filtering command) can filter out all free
>> pages with the new logic.
>> It's faster than the current logic because it can distinguish free page
>> by analyzing page structure at the same time as filtering for other
>> unnecessary pages (e.g. anonymous page).
>> OTOH, the current logic has to trace free_list to distinguish free
>> pages while analyzing page structure to filter out other unnecessary
>> pages.
>>
>> The new logic uses the fact that buddy page is marked by _mapcount ==
>> PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
>> fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
>> is set or not before looking up _mapcount value.
>> And we can get the order of buddy system from private field.
>> To sum it up, the values below are required for this logic.
>>
>> Required values:
>>   - OFFSET(page._mapcount)
>>   - OFFSET(page.private)
>>   - NUMBER(PG_slab)
>>   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
>>
>> Changelog from v1 to v2:
>>   1. remove SIZE(pageflags)
>>     The new logic was changed after I sent v1 patch.
>>     Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.
>>
>> What's makedumpfile:
>>   makedumpfile creates a small dumpfile by excluding unnecessary pages
>>   for the analysis. To distinguish unnecessary pages, makedumpfile gets
>>   the vmcoreinfo data which has the minimum debugging information only
>>   for dump filtering.
>
> Gee, this info is getting highly dependent upon deep internal kernel
> behaviour.
>
>> index 5e4bd78..b27efe4 100644
>> --- a/kernel/kexec.c
>> +++ b/kernel/kexec.c
>> @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>>  	VMCOREINFO_OFFSET(page, _count);
>>  	VMCOREINFO_OFFSET(page, mapping);
>>  	VMCOREINFO_OFFSET(page, lru);
>> +	VMCOREINFO_OFFSET(page, _mapcount);
>> +	VMCOREINFO_OFFSET(page, private);
>>  	VMCOREINFO_OFFSET(pglist_data, node_zones);
>>  	VMCOREINFO_OFFSET(pglist_data, nr_zones);
>>  #ifdef CONFIG_FLAT_NODE_MEM_MAP
>> @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>>  	VMCOREINFO_NUMBER(PG_lru);
>>  	VMCOREINFO_NUMBER(PG_private);
>>  	VMCOREINFO_NUMBER(PG_swapcache);
>> +	VMCOREINFO_NUMBER(PG_slab);
>> +	VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
>
> We might change the PageBuddy() implementation at any time, and
> makedumpfile will break. Or in this case, become less efficient.
>
> Is there any way in which we can move some of this logic into the
> kernel? In this case, add some kernel code which uses PageBuddy() on
> behalf of makedumpfile, rather than replicating the PageBuddy() logic
> in userspace?

All that exists when makedumpfile runs is a core file. So it would have
to be something like a shared library that builds with the kernel and
then makedumpfile loads.

Eric
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
On Mon, 10 Dec 2012 10:39:13 +0900
Atsushi Kumagai wrote:

> This patch adds the values related to buddy system to vmcoreinfo data
> so that makedumpfile (dump filtering command) can filter out all free
> pages with the new logic.
> It's faster than the current logic because it can distinguish free page
> by analyzing page structure at the same time as filtering for other
> unnecessary pages (e.g. anonymous page).
> OTOH, the current logic has to trace free_list to distinguish free
> pages while analyzing page structure to filter out other unnecessary
> pages.
>
> The new logic uses the fact that buddy page is marked by _mapcount ==
> PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
> fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
> is set or not before looking up _mapcount value.
> And we can get the order of buddy system from private field.
> To sum it up, the values below are required for this logic.
>
> Required values:
>   - OFFSET(page._mapcount)
>   - OFFSET(page.private)
>   - NUMBER(PG_slab)
>   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
>
> Changelog from v1 to v2:
>   1. remove SIZE(pageflags)
>     The new logic was changed after I sent v1 patch.
>     Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.
>
> What's makedumpfile:
>   makedumpfile creates a small dumpfile by excluding unnecessary pages
>   for the analysis. To distinguish unnecessary pages, makedumpfile gets
>   the vmcoreinfo data which has the minimum debugging information only
>   for dump filtering.

Gee, this info is getting highly dependent upon deep internal kernel
behaviour.

> index 5e4bd78..b27efe4 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_OFFSET(page, _count);
>  	VMCOREINFO_OFFSET(page, mapping);
>  	VMCOREINFO_OFFSET(page, lru);
> +	VMCOREINFO_OFFSET(page, _mapcount);
> +	VMCOREINFO_OFFSET(page, private);
>  	VMCOREINFO_OFFSET(pglist_data, node_zones);
>  	VMCOREINFO_OFFSET(pglist_data, nr_zones);
>  #ifdef CONFIG_FLAT_NODE_MEM_MAP
> @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_NUMBER(PG_lru);
>  	VMCOREINFO_NUMBER(PG_private);
>  	VMCOREINFO_NUMBER(PG_swapcache);
> +	VMCOREINFO_NUMBER(PG_slab);
> +	VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);

We might change the PageBuddy() implementation at any time, and
makedumpfile will break. Or in this case, become less efficient.

Is there any way in which we can move some of this logic into the
kernel? In this case, add some kernel code which uses PageBuddy() on
behalf of makedumpfile, rather than replicating the PageBuddy() logic
in userspace?
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
CCing Andrew Morton. I think he picks kexec patches.

On Mon, Dec 10, 2012 at 08:17:05AM -0500, Vivek Goyal wrote:
> On Mon, Dec 10, 2012 at 10:39:13AM +0900, Atsushi Kumagai wrote:
> > This patch adds the values related to buddy system to vmcoreinfo data
> > so that makedumpfile (dump filtering command) can filter out all free
> > pages with the new logic.
> > It's faster than the current logic because it can distinguish free page
> > by analyzing page structure at the same time as filtering for other
> > unnecessary pages (e.g. anonymous page).
> > OTOH, the current logic has to trace free_list to distinguish free
> > pages while analyzing page structure to filter out other unnecessary
> > pages.
> >
> > The new logic uses the fact that buddy page is marked by _mapcount ==
> > PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
> > fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
> > is set or not before looking up _mapcount value.
> > And we can get the order of buddy system from private field.
> > To sum it up, the values below are required for this logic.
> >
> > Required values:
> >   - OFFSET(page._mapcount)
> >   - OFFSET(page.private)
> >   - NUMBER(PG_slab)
> >   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> >
> > Changelog from v1 to v2:
> >   1. remove SIZE(pageflags)
> >     The new logic was changed after I sent v1 patch.
> >     Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.
> >
> > What's makedumpfile:
> >   makedumpfile creates a small dumpfile by excluding unnecessary pages
> >   for the analysis. To distinguish unnecessary pages, makedumpfile gets
> >   the vmcoreinfo data which has the minimum debugging information only
> >   for dump filtering.
> >
> > Signed-off-by: Atsushi Kumagai
>
> Looks good to me.
>
> Acked-by: Vivek Goyal
>
> Thanks
> Vivek
>
> > ---
> >  kernel/kexec.c |    4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/kexec.c b/kernel/kexec.c
> > index 5e4bd78..b27efe4 100644
> > --- a/kernel/kexec.c
> > +++ b/kernel/kexec.c
> > @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
> >  	VMCOREINFO_OFFSET(page, _count);
> >  	VMCOREINFO_OFFSET(page, mapping);
> >  	VMCOREINFO_OFFSET(page, lru);
> > +	VMCOREINFO_OFFSET(page, _mapcount);
> > +	VMCOREINFO_OFFSET(page, private);
> >  	VMCOREINFO_OFFSET(pglist_data, node_zones);
> >  	VMCOREINFO_OFFSET(pglist_data, nr_zones);
> >  #ifdef CONFIG_FLAT_NODE_MEM_MAP
> > @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
> >  	VMCOREINFO_NUMBER(PG_lru);
> >  	VMCOREINFO_NUMBER(PG_private);
> >  	VMCOREINFO_NUMBER(PG_swapcache);
> > +	VMCOREINFO_NUMBER(PG_slab);
> > +	VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
> >
> >  	arch_crash_save_vmcoreinfo();
> >  	update_vmcoreinfo_note();
> > --
> > 1.7.9.2
> >
> > ___
> > kexec mailing list
> > ke...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
>
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2] Add the values related to buddy system for filtering free pages.
On Mon, Dec 10, 2012 at 10:39:13AM +0900, Atsushi Kumagai wrote:
> This patch adds the values related to buddy system to vmcoreinfo data
> so that makedumpfile (dump filtering command) can filter out all free
> pages with the new logic.
> It's faster than the current logic because it can distinguish free page
> by analyzing page structure at the same time as filtering for other
> unnecessary pages (e.g. anonymous page).
> OTOH, the current logic has to trace free_list to distinguish free
> pages while analyzing page structure to filter out other unnecessary
> pages.
>
> The new logic uses the fact that buddy page is marked by _mapcount ==
> PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
> fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
> is set or not before looking up _mapcount value.
> And we can get the order of buddy system from private field.
> To sum it up, the values below are required for this logic.
>
> Required values:
>   - OFFSET(page._mapcount)
>   - OFFSET(page.private)
>   - NUMBER(PG_slab)
>   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
>
> Changelog from v1 to v2:
> 1. remove SIZE(pageflags)
>   The new logic was changed after I sent v1 patch.
>   Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.
>
> What's makedumpfile:
>   makedumpfile creates a small dumpfile by excluding unnecessary pages
>   for the analysis. To distinguish unnecessary pages, makedumpfile gets
>   the vmcoreinfo data which has the minimum debugging information only
>   for dump filtering.
>
> Signed-off-by: Atsushi Kumagai <kumagai-atsu...@mxc.nes.nec.co.jp>

Looks good to me.

Acked-by: Vivek Goyal <vgo...@redhat.com>

Thanks
Vivek

> ---
>  kernel/kexec.c |    4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 5e4bd78..b27efe4 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_OFFSET(page, _count);
>  	VMCOREINFO_OFFSET(page, mapping);
>  	VMCOREINFO_OFFSET(page, lru);
> +	VMCOREINFO_OFFSET(page, _mapcount);
> +	VMCOREINFO_OFFSET(page, private);
>  	VMCOREINFO_OFFSET(pglist_data, node_zones);
>  	VMCOREINFO_OFFSET(pglist_data, nr_zones);
>  #ifdef CONFIG_FLAT_NODE_MEM_MAP
> @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_NUMBER(PG_lru);
>  	VMCOREINFO_NUMBER(PG_private);
>  	VMCOREINFO_NUMBER(PG_swapcache);
> +	VMCOREINFO_NUMBER(PG_slab);
> +	VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
>
>  	arch_crash_save_vmcoreinfo();
>  	update_vmcoreinfo_note();
> --
> 1.7.9.2
[PATCH v2] Add the values related to buddy system for filtering free pages.
This patch adds the values related to buddy system to vmcoreinfo data
so that makedumpfile (dump filtering command) can filter out all free
pages with the new logic.
It's faster than the current logic because it can distinguish free page
by analyzing page structure at the same time as filtering for other
unnecessary pages (e.g. anonymous page).
OTOH, the current logic has to trace free_list to distinguish free
pages while analyzing page structure to filter out other unnecessary
pages.

The new logic uses the fact that buddy page is marked by _mapcount ==
PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
is set or not before looking up _mapcount value.
And we can get the order of buddy system from private field.
To sum it up, the values below are required for this logic.

Required values:
  - OFFSET(page._mapcount)
  - OFFSET(page.private)
  - NUMBER(PG_slab)
  - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)

Changelog from v1 to v2:
1. remove SIZE(pageflags)
  The new logic was changed after I sent v1 patch.
  Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.

What's makedumpfile:
  makedumpfile creates a small dumpfile by excluding unnecessary pages
  for the analysis. To distinguish unnecessary pages, makedumpfile gets
  the vmcoreinfo data which has the minimum debugging information only
  for dump filtering.

Signed-off-by: Atsushi Kumagai <kumagai-atsu...@mxc.nes.nec.co.jp>
---
 kernel/kexec.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index 5e4bd78..b27efe4 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_OFFSET(page, _count);
 	VMCOREINFO_OFFSET(page, mapping);
 	VMCOREINFO_OFFSET(page, lru);
+	VMCOREINFO_OFFSET(page, _mapcount);
+	VMCOREINFO_OFFSET(page, private);
 	VMCOREINFO_OFFSET(pglist_data, node_zones);
 	VMCOREINFO_OFFSET(pglist_data, nr_zones);
 #ifdef CONFIG_FLAT_NODE_MEM_MAP
@@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_NUMBER(PG_lru);
 	VMCOREINFO_NUMBER(PG_private);
 	VMCOREINFO_NUMBER(PG_swapcache);
+	VMCOREINFO_NUMBER(PG_slab);
+	VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);

 	arch_crash_save_vmcoreinfo();
 	update_vmcoreinfo_note();
-- 
1.7.9.2
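
For readers following the thread, the check described in the patch message could be sketched roughly as below. This is only an illustration of the described logic, not makedumpfile's actual code: the struct buddy_info type and the buddy_order_of() function are hypothetical names, the page is treated as a raw byte image of struct page, and the sketch assumes the dump and the tool share word size and endianness, that _mapcount is a 32-bit value, and that OFFSET(page.flags) is also available from vmcoreinfo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the values read from vmcoreinfo. */
struct buddy_info {
	size_t   off_flags;       /* OFFSET(page.flags), assumed available */
	size_t   off_mapcount;    /* OFFSET(page._mapcount) */
	size_t   off_private;     /* OFFSET(page.private) */
	unsigned pg_slab;         /* NUMBER(PG_slab): bit index in flags */
	int32_t  buddy_mapcount;  /* NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) */
};

/*
 * Return the buddy order if the raw struct page image is marked as a
 * free buddy page, or -1 otherwise.
 */
static long buddy_order_of(const unsigned char *page,
			   const struct buddy_info *bi)
{
	unsigned long flags, private_val;
	int32_t mapcount;

	assert(page != NULL && bi != NULL);

	/* _mapcount shares memory with SLAB/SLUB fields: test PG_slab first. */
	memcpy(&flags, page + bi->off_flags, sizeof(flags));
	if (flags & (1UL << bi->pg_slab))
		return -1;

	/* A buddy page is marked by _mapcount == PAGE_BUDDY_MAPCOUNT_VALUE. */
	memcpy(&mapcount, page + bi->off_mapcount, sizeof(mapcount));
	if (mapcount != bi->buddy_mapcount)
		return -1;

	/* The order of the free block is kept in the private field. */
	memcpy(&private_val, page + bi->off_private, sizeof(private_val));
	return (long)private_val;
}
```

A filter built this way can decide about free pages in the same pass over the page structures that it already makes for the other page types, which is the speed-up the patch message describes; a block of order n then covers 2^n contiguous pages.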