RE: [PATCH v2] Add the values related to buddy system for filtering free pages.

2013-02-08 Thread Mitchell, Lisa (MCLinux in Fort Collins)
Thanks, that's good news, and thanks for the commit ID, that was the thing I 
was having trouble finding.

-----Original Message-----
From: Atsushi Kumagai [mailto:kumagai-atsu...@mxc.nes.nec.co.jp] 
Sent: Thursday, February 07, 2013 7:45 PM
To: Mitchell, Lisa (MCLinux in Fort Collins)
Cc: vgo...@redhat.com; ke...@lists.infradead.org; linux-kernel@vger.kernel.org; 
linux...@kvack.org; d.hatay...@jp.fujitsu.com; ebied...@xmission.com; 
a...@linux-foundation.org; c...@sgi.com
Subject: Re: [PATCH v2] Add the values related to buddy system for filtering 
free pages.

Hello Lisa,

On Thu, 07 Feb 2013 05:29:11 -0700
Lisa Mitchell  wrote:

> > > > Also, I have one question. Can we always assume that the 1st and
> > > > 2nd kernels are the same?
> > > 
> > > Not at all.  Distros frequently implement it with the same kernel
> > > in both roles, but it should be possible to use an old crusty stable
> > > kernel as the 2nd kernel.
> > > 
> > > > If I understand correctly, kexec/kdump can use a 2nd kernel
> > > > different from the 1st. So, different kernels need to do the
> > > > same thing as makedumpfile does. If we assume the two are the same,
> > > > the problem is much simplified.
> > > 
> > > As a developer it becomes attractive to use a known stable kernel 
> > > to capture the crash dump even as I experiment with a brand new kernel.
> > 
> > To allow the use of a 2nd kernel different from the 1st, I think we
> > have to handle each kernel version with the logic included in
> > makedumpfile for it. That is to say, makedumpfile goes on as before.
> > 
> > 
> > Thanks
> > Atsushi Kumagai
> 
> 
> Atsushi and Vivek:  
> 
> I'm trying to get the status of whether the patch submitted in
> https://lkml.org/lkml/2012/11/21/90  is going to be accepted upstream
> and get in some version of the Linux 3.8 kernel.   I'm replying to the
> last email thread above on kexec_lists and lkml.org  that I could find 
> about this patch.
> 
> I was counting on this kernel patch to improve performance of
> makedumpfile v1.5.1, so at least it wouldn't be a regression in
> performance over makedumpfile v1.4.   It was listed as recommended in
> the makedumpfile v1.5.1 release posting:
> http://lists.infradead.org/pipermail/kexec/2012-December/007460.html
> 
> 
> All the conversations in the thread since this patch was committed 
> seem to voice some reservations now, and reference other fixes being 
> tried to improve performance.
> 
> Does that mean you are abandoning getting this patch accepted 
> upstream, in favor of pursuing other alternatives?

No, this patch has been merged into -next; we should just wait for it to be
merged into Linus's tree.

  
http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=0c63e90dd1c7b35ae2ea9475ba67cf68d8801a26

What interests us now is improving the /proc/vmcore interfaces. That is not an
alternative to this patch but a separate idea that is consistent with it.


Thanks
Atsushi Kumagai

> 
> I had hoped this patch would be okay to get accepted upstream, and 
> then other improvements could be built on top of it.
> 
> Is that not the case?   
> 
> Or has further review concluded now that this change is a bad idea due 
> to adding dependence of this new makedumpfile feature on some deep 
> kernel memory internals?
> 
> Thanks,
> 
> Lisa Mitchell
> 
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2013-02-07 Thread Atsushi Kumagai
Hello Lisa,

On Thu, 07 Feb 2013 05:29:11 -0700
Lisa Mitchell  wrote:

> > > > Also, I have one question. Can we always assume that the 1st and 2nd
> > > > kernels are the same?
> > > 
> > > Not at all.  Distros frequently implement it with the same kernel in
> > > both roles, but it should be possible to use an old crusty stable kernel
> > > as the 2nd kernel.
> > > 
> > > > If I understand correctly, kexec/kdump can use a 2nd kernel different
> > > > from the 1st. So, different kernels need to do the same thing as
> > > > makedumpfile does. If we assume the two are the same, the problem is
> > > > much simplified.
> > > 
> > > As a developer it becomes attractive to use a known stable kernel to
> > > capture the crash dump even as I experiment with a brand new kernel.
> > 
> > To allow the use of a 2nd kernel different from the 1st, I think we have
> > to handle each kernel version with the logic included in makedumpfile
> > for it. That is to say, makedumpfile goes on as before.
> > 
> > 
> > Thanks
> > Atsushi Kumagai
> 
> 
> Atsushi and Vivek:  
> 
> I'm trying to get the status of whether the patch submitted in
> https://lkml.org/lkml/2012/11/21/90  is going to be accepted upstream
> and get in some version of the Linux 3.8 kernel.   I'm replying to the
> last email thread above on kexec_lists and lkml.org  that I could find
> about this patch.  
> 
> I was counting on this kernel patch to improve performance of
> makedumpfile v1.5.1, so at least it wouldn't be a regression in
> performance over makedumpfile v1.4.   It was listed as recommended in
> the makedumpfile v1.5.1 release posting:
> http://lists.infradead.org/pipermail/kexec/2012-December/007460.html
> 
> 
> All the conversations in the thread since this patch was committed seem
> to voice some reservations now, and reference other fixes being tried to
> improve performance.
> 
> Does that mean you are abandoning getting this patch accepted upstream,
> in favor of pursuing other alternatives?

No, this patch has been merged into -next; we should just wait for it to be
merged into Linus's tree.

  
http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=0c63e90dd1c7b35ae2ea9475ba67cf68d8801a26

What interests us now is improving the /proc/vmcore interfaces.
That is not an alternative to this patch but a separate idea that is
consistent with it.


Thanks
Atsushi Kumagai

> 
> I had hoped this patch would be okay to get accepted upstream, and then
> other improvements could be built on top of it.  
> 
> Is that not the case?   
> 
> Or has further review concluded now that this change is a bad idea due
> to adding dependence of this new makedumpfile feature on some deep
> kernel memory internals?
> 
> Thanks,
> 
> Lisa Mitchell
> 
>


Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2013-02-07 Thread Lisa Mitchell
On Thu, 2012-12-27 at 08:35 +, Atsushi Kumagai wrote:
> Hello,
> 
> On Thu, 20 Dec 2012 18:00:11 -0800
> ebied...@xmission.com (Eric W. Biederman) wrote:
> 
> > "Hatayama, Daisuke"  writes:
> > 
> > >> From: kexec-boun...@lists.infradead.org
> > >> [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
> > >> Sent: Thursday, December 20, 2012 11:21 AM
> > >
> > >> On Wed, 19 Dec 2012 16:18:56 -0800
> > >> Andrew Morton  wrote:
> > >> 
> > >> > On Mon, 10 Dec 2012 10:39:13 +0900
> > >> > Atsushi Kumagai  wrote:
> > >> >
> > >
> > >> >
> > >> > We might change the PageBuddy() implementation at any time, and
> > >> > makedumpfile will break.  Or in this case, become less efficient.
> > >> >
> > >> > Is there any way in which we can move some of this logic into the
> > >> > kernel?  In this case, add some kernel code which uses PageBuddy() on
> > >> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
> > >> > in userspace?
> > >> 
> > >> Last month, Cliff Wickman proposed such an idea:
> > >> 
> > >>   [PATCH v2] makedumpfile: request the kernel do page scans
> > >>   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
> > >> 
> > >>   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
> > >>   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
> > >> 
> > >> In his idea, the kernel does page scans to distinguish unnecessary pages
> > >> (free pages and others) and returns to makedumpfile the list of PFNs
> > >> which should be excluded.
> > >> As a result, makedumpfile doesn't need to consider internal kernel
> > >> behavior.
> > >> 
> > >> I think it's a good idea from the viewpoint of maintainability and
> > >> performance.
> > 
> > > I also think a large part of his code can be reused in this work. But the
> > > bad performance is caused by a lot of ioremap calls, not a lot of copying.
> > > See the profiling result I posted some days ago. The two issues, ioremap
> > > and filtering maintainability, should be considered separately. Even on
> > > the ioremap issue, there is a secondary one to consider: memory
> > > consumption on the 2nd kernel.
> > 
> > Thanks.  I was wondering why moving the code into /proc/vmcore would
> > make things faster.
> 
> Thanks HATAYAMA-san, I've understood the issues correctly.
> We should continue improving the ioremap issue as Cliff and HATAYAMA-san
> are doing now.
> 
> > 
> > > Also, I have one question. Can we always assume that the 1st and 2nd
> > > kernels are the same?
> > 
> > Not at all.  Distros frequently implement it with the same kernel in
> > both roles, but it should be possible to use an old crusty stable kernel
> > as the 2nd kernel.
> > 
> > > If I understand correctly, kexec/kdump can use a 2nd kernel different
> > > from the 1st. So, different kernels need to do the same thing as
> > > makedumpfile does. If we assume the two are the same, the problem is
> > > much simplified.
> > 
> > As a developer it becomes attractive to use a known stable kernel to
> > capture the crash dump even as I experiment with a brand new kernel.
> 
> To allow the use of a 2nd kernel different from the 1st, I think we have
> to handle each kernel version with the logic included in makedumpfile
> for it. That is to say, makedumpfile goes on as before.
> 
> 
> Thanks
> Atsushi Kumagai


Atsushi and Vivek:  

I'm trying to get the status of whether the patch submitted in
https://lkml.org/lkml/2012/11/21/90  is going to be accepted upstream
and get in some version of the Linux 3.8 kernel.   I'm replying to the
last email thread above on kexec_lists and lkml.org  that I could find
about this patch.  

I was counting on this kernel patch to improve performance of
makedumpfile v1.5.1, so at least it wouldn't be a regression in
performance over makedumpfile v1.4.   It was listed as recommended in
the makedumpfile v1.5.1 release posting:
http://lists.infradead.org/pipermail/kexec/2012-December/007460.html


All the conversations in the thread since this patch was committed seem
to voice some reservations now, and reference other fixes being tried to
improve performance.  

Does that mean you are abandoning getting this patch accepted upstream,
in favor of pursuing other alternatives?

I had hoped this patch would be okay to get accepted upstream, and then
other improvements could be built on top of it.  

Is that not the case?   

Or has further review concluded now that this change is a bad idea due
to adding dependence of this new makedumpfile feature on some deep
kernel memory internals?

Thanks,

Lisa Mitchell

 



Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-27 Thread Atsushi Kumagai
Hello,

On Thu, 20 Dec 2012 18:00:11 -0800
ebied...@xmission.com (Eric W. Biederman) wrote:

> "Hatayama, Daisuke"  writes:
> 
> >> From: kexec-boun...@lists.infradead.org
> >> [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
> >> Sent: Thursday, December 20, 2012 11:21 AM
> >
> >> On Wed, 19 Dec 2012 16:18:56 -0800
> >> Andrew Morton  wrote:
> >> 
> >> > On Mon, 10 Dec 2012 10:39:13 +0900
> >> > Atsushi Kumagai  wrote:
> >> >
> >
> >> >
> >> > We might change the PageBuddy() implementation at any time, and
> >> > makedumpfile will break.  Or in this case, become less efficient.
> >> >
> >> > Is there any way in which we can move some of this logic into the
> >> > kernel?  In this case, add some kernel code which uses PageBuddy() on
> >> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
> >> > in userspace?
> >> 
> >> Last month, Cliff Wickman proposed such an idea:
> >> 
> >>   [PATCH v2] makedumpfile: request the kernel do page scans
> >>   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
> >> 
> >>   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
> >>   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
> >> 
> >> In his idea, the kernel does page scans to distinguish unnecessary pages
> >> (free pages and others) and returns to makedumpfile the list of PFNs
> >> which should be excluded.
> >> As a result, makedumpfile doesn't need to consider internal kernel
> >> behavior.
> >> 
> >> I think it's a good idea from the viewpoint of maintainability and
> >> performance.
> 
> > I also think a large part of his code can be reused in this work. But the
> > bad performance is caused by a lot of ioremap calls, not a lot of copying.
> > See the profiling result I posted some days ago. The two issues, ioremap
> > and filtering maintainability, should be considered separately. Even on
> > the ioremap issue, there is a secondary one to consider: memory
> > consumption on the 2nd kernel.
> 
> Thanks.  I was wondering why moving the code into /proc/vmcore would
> make things faster.

Thanks HATAYAMA-san, I've understood the issues correctly.
We should continue improving the ioremap issue as Cliff and HATAYAMA-san
are doing now.

> 
> > Also, I have one question. Can we always assume that the 1st and 2nd
> > kernels are the same?
> 
> Not at all.  Distros frequently implement it with the same kernel in
> both roles, but it should be possible to use an old crusty stable kernel
> as the 2nd kernel.
> 
> > If I understand correctly, kexec/kdump can use a 2nd kernel different
> > from the 1st. So, different kernels need to do the same thing as
> > makedumpfile does. If we assume the two are the same, the problem is
> > much simplified.
> 
> As a developer it becomes attractive to use a known stable kernel to
> capture the crash dump even as I experiment with a brand new kernel.

To allow the use of a 2nd kernel different from the 1st, I think we have
to handle each kernel version with the logic included in makedumpfile
for it. That is to say, makedumpfile goes on as before.


Thanks
Atsushi Kumagai


Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-20 Thread Eric W. Biederman
"Hatayama, Daisuke"  writes:

>> From: kexec-boun...@lists.infradead.org
>> [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
>> Sent: Thursday, December 20, 2012 11:21 AM
>
>> On Wed, 19 Dec 2012 16:18:56 -0800
>> Andrew Morton  wrote:
>> 
>> > On Mon, 10 Dec 2012 10:39:13 +0900
>> > Atsushi Kumagai  wrote:
>> >
>
>> >
>> > We might change the PageBuddy() implementation at any time, and
>> > makedumpfile will break.  Or in this case, become less efficient.
>> >
>> > Is there any way in which we can move some of this logic into the
>> > kernel?  In this case, add some kernel code which uses PageBuddy() on
>> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
>> > in userspace?
>> 
>> Last month, Cliff Wickman proposed such an idea:
>> 
>>   [PATCH v2] makedumpfile: request the kernel do page scans
>>   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
>> 
>>   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
>>   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
>> 
>> In his idea, the kernel does page scans to distinguish unnecessary pages
>> (free pages and others) and returns to makedumpfile the list of PFNs
>> which should be excluded.
>> As a result, makedumpfile doesn't need to consider internal kernel
>> behavior.
>> 
>> I think it's a good idea from the viewpoint of maintainability and
>> performance.

> I also think a large part of his code can be reused in this work. But the
> bad performance is caused by a lot of ioremap calls, not a lot of copying.
> See the profiling result I posted some days ago. The two issues, ioremap
> and filtering maintainability, should be considered separately. Even on
> the ioremap issue, there is a secondary one to consider: memory
> consumption on the 2nd kernel.

Thanks.  I was wondering why moving the code into /proc/vmcore would
make things faster.

> Also, I have one question. Can we always think of 1st and 2nd kernels
> are same?

Not at all.  Distros frequently implement it with the same kernel in
both roles, but it should be possible to use an old crusty stable kernel
as the 2nd kernel.

> If I understand correctly, kexec/kdump can use a 2nd kernel different
> from the 1st. So, different kernels would need to do the same thing as
> makedumpfile does. If we assume the two are the same, the problem is much
> simplified.

As a developer it becomes attractive to use a known stable kernel to
capture the crash dump even as I experiment with a brand new kernel.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-19 Thread Hatayama, Daisuke
> From: kexec-boun...@lists.infradead.org
> [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
> Sent: Thursday, December 20, 2012 11:21 AM

> On Wed, 19 Dec 2012 16:18:56 -0800
> Andrew Morton  wrote:
> 
> > On Mon, 10 Dec 2012 10:39:13 +0900
> > Atsushi Kumagai  wrote:
> >

> >
> > We might change the PageBuddy() implementation at any time, and
> > makedumpfile will break.  Or in this case, become less efficient.
> >
> > Is there any way in which we can move some of this logic into the
> > kernel?  In this case, add some kernel code which uses PageBuddy() on
> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
> > in userspace?
> 
> Last month, Cliff Wickman proposed such an idea:
> 
>   [PATCH v2] makedumpfile: request the kernel do page scans
>   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
> 
>   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
>   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
> 
> In his idea, the kernel does page scans to distinguish unnecessary pages
> (free pages and others) and returns to makedumpfile the list of PFNs
> that should be excluded.
> As a result, makedumpfile doesn't need to consider internal kernel
> behavior.
> 
> I think it's a good idea from the viewpoint of maintainability and
> performance.

I also think a large part of his code can be reused in this work. But the bad
performance is caused by the large number of ioremap calls, not by a lot of
copying. See the profiling result I posted some days ago. The two issues, the
ioremap overhead and filtering maintainability, should be considered
separately. Even for the ioremap issue, there is a secondary concern:
memory consumption on the 2nd kernel.

Also, I have one question. Can we always assume that the 1st and 2nd kernels
are the same? If I understand correctly, kexec/kdump can use a 2nd kernel
different from the 1st. So, different kernels would need to do the same thing
as makedumpfile does. If we assume the two are the same, the problem is much
simplified.

Thanks.
HATAYAMA, Daisuke



Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-19 Thread Atsushi Kumagai
Hello Andrew,

On Wed, 19 Dec 2012 16:18:56 -0800
Andrew Morton  wrote:

> On Mon, 10 Dec 2012 10:39:13 +0900
> Atsushi Kumagai  wrote:
> 
> > This patch adds the values related to buddy system to vmcoreinfo data
> > so that makedumpfile (dump filtering command) can filter out all free
> > pages with the new logic.
> > It's faster than the current logic because it can distinguish free pages
> > by analyzing the page structure at the same time as filtering for other
> > unnecessary pages (e.g. anonymous pages).
> > OTOH, the current logic has to trace free_list to distinguish free
> > pages while analyzing the page structure to filter out other unnecessary
> > pages.
> > 
> > The new logic uses the fact that a buddy page is marked by _mapcount ==
> > PAGE_BUDDY_MAPCOUNT_VALUE. But _mapcount shares its memory with other
> > fields for SLAB/SLUB when PG_slab is set, so we need to check whether
> > PG_slab is set before looking up the _mapcount value.
> > And we can get the buddy order from the private field.
> > To sum up, the values below are required for this logic.
> > 
> > Required values:
> >   - OFFSET(page._mapcount)
> >   - OFFSET(page.private)
> >   - NUMBER(PG_slab)
> >   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> > 
> > Changelog from v1 to v2:
> > 1. remove SIZE(pageflags)
> >   The new logic was changed after I sent the v1 patch.
> >   Accordingly, SIZE(pageflags) is no longer necessary for makedumpfile.
> > 
> > What's makedumpfile:
> >   makedumpfile creates a small dumpfile by excluding unnecessary pages
> >   for the analysis. To distinguish unnecessary pages, makedumpfile gets
> >   the vmcoreinfo data which has the minimum debugging information only
> >   for dump filtering.
> 
> Gee, this info is getting highly dependent upon deep internal kernel
> behaviour.

Yes. makedumpfile has to be changed depending on the kernel version, and we
have done so.

> > index 5e4bd78..b27efe4 100644
> > --- a/kernel/kexec.c
> > +++ b/kernel/kexec.c
> > @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
> >     VMCOREINFO_OFFSET(page, _count);
> >     VMCOREINFO_OFFSET(page, mapping);
> >     VMCOREINFO_OFFSET(page, lru);
> > +   VMCOREINFO_OFFSET(page, _mapcount);
> > +   VMCOREINFO_OFFSET(page, private);
> >     VMCOREINFO_OFFSET(pglist_data, node_zones);
> >     VMCOREINFO_OFFSET(pglist_data, nr_zones);
> >  #ifdef CONFIG_FLAT_NODE_MEM_MAP
> > @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
> >     VMCOREINFO_NUMBER(PG_lru);
> >     VMCOREINFO_NUMBER(PG_private);
> >     VMCOREINFO_NUMBER(PG_swapcache);
> > +   VMCOREINFO_NUMBER(PG_slab);
> > +   VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
> 
> We might change the PageBuddy() implementation at any time, and
> makedumpfile will break.  Or in this case, become less efficient.
> 
> Is there any way in which we can move some of this logic into the
> kernel?  In this case, add some kernel code which uses PageBuddy() on
> behalf of makedumpfile, rather than replicating the PageBuddy() logic
> in userspace?

Last month, Cliff Wickman proposed such an idea:

  [PATCH v2] makedumpfile: request the kernel do page scans
  http://lists.infradead.org/pipermail/kexec/2012-November/007318.html

  [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
  http://lists.infradead.org/pipermail/kexec/2012-November/007319.html

In his idea, the kernel does page scans to distinguish unnecessary pages
(free pages and others) and returns to makedumpfile the list of PFNs
that should be excluded.
As a result, makedumpfile doesn't need to consider internal kernel
behavior.

I think it's a good idea from the viewpoint of maintainability and
performance.


Thanks
Atsushi Kumagai

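Note on the PFN-list idea above: the following is only a minimal sketch, in C,
of how a dump filter might consume a list of excluded PFNs once it has one. It
does not reproduce the interface of Cliff Wickman's patches. The function name
apply_exclusions, the "one bit per PFN, set means keep" bitmap layout, and the
way the list is delivered are all assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: clear the "keep this page" bit for every PFN in
 * the exclusion list.  The bitmap covers PFNs [0, max_pfn); a set bit
 * means the page will be written to the dump file.
 * Returns the number of bits that were actually cleared.
 */
static size_t apply_exclusions(unsigned long *bitmap, unsigned long max_pfn,
                               const unsigned long *excluded, size_t n)
{
    size_t i, cleared = 0;

    assert(bitmap != NULL && (excluded != NULL || n == 0));
    for (i = 0; i < n; i++) {
        unsigned long pfn = excluded[i];
        unsigned long mask;

        if (pfn >= max_pfn)
            continue;           /* ignore out-of-range entries */
        mask = 1UL << (pfn % BITS_PER_WORD);
        if (bitmap[pfn / BITS_PER_WORD] & mask) {
            bitmap[pfn / BITS_PER_WORD] &= ~mask;
            cleared++;          /* count each page only once */
        }
    }
    return cleared;
}
```

With such a list the filter never has to interpret struct page itself, which
is the maintainability point made above; whether it is also faster depends on
how the list is produced and transferred.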

Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-19 Thread Eric W. Biederman
Andrew Morton  writes:

> On Wed, 19 Dec 2012 16:57:03 -0800
> ebied...@xmission.com (Eric W. Biederman) wrote:
>
>> Andrew Morton  writes:
>> 
>> > Is there any way in which we can move some of this logic into the
>> > kernel?  In this case, add some kernel code which uses PageBuddy() on
>> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
>> > in userspace?
>> 
>> All that exists when makedumpfile runs is a core file.  So it would have
>> to be something like a shared library that builds with the kernel and
>> then makedumpfile loads.
>
> Can we omit free pages from that core file?
>
> And/or add a section to that core file which flags free pages?

Omitting pages is what makedumpfile does.

Very loosely: shortly after boot, when things are running fine, /sbin/kexec
runs.

/sbin/kexec constructs a set of ELF headers that describe where the
memory is, and loads the crashdump kernel, an initrd, and those ELF headers
into memory.

Years later the running kernel calls panic.
panic calls machine_kexec.
machine_kexec jumps to the preloaded crashdump kernel.

I think it is /proc/vmcore that reads the ELF headers out of memory and
presents them to userspace.

Then we have options.
vmcore-to-dmesg will just read the dmesg ring buffer so we have that.

makedumpfile reads the kernel data structures and filters out the free
pages for people who don't want to write everything to disk.

So the basic interface is strongly kernel version agnostic.  The
challenge is how to filter out undesirable pages from the core dump
quickly and reliably.

Right now what we have is a set of ELF notes that describe struct page.

For my uses I have either had enough disk space that saving everything
didn't matter or so little disk space that all I could afford was
getting out the dmesg ring buffer.  So I don't know how robust the
solution adopted by makedumpfile is.

Eric

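Note on the ELF notes mentioned above: the vmcoreinfo note carries its data as
text, one KEY=VALUE entry per line, with keys such as OFFSET(page._mapcount)
or NUMBER(PG_slab). Assuming the note has already been extracted into a
NUL-terminated buffer, a lookup might be sketched in C as below. The function
name vmcoreinfo_lookup is hypothetical and this is not makedumpfile's own
parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find "key=value" in a NUL-terminated vmcoreinfo
 * text buffer and parse the value as a decimal integer.
 * Returns 0 and stores the value in *out on success, -1 if the key is
 * not present.
 */
static int vmcoreinfo_lookup(const char *text, const char *key, long *out)
{
    size_t klen;
    const char *line = text;

    assert(text != NULL && key != NULL && out != NULL);
    klen = strlen(key);
    while (*line != '\0') {
        const char *eol = strchr(line, '\n');

        /* The key must match the start of the line and be followed by '='. */
        if (strncmp(line, key, klen) == 0 && line[klen] == '=') {
            *out = strtol(line + klen + 1, NULL, 10);
            return 0;
        }
        if (eol == NULL)
            break;              /* last line had no trailing newline */
        line = eol + 1;
    }
    return -1;
}
```

A missing key is reported rather than guessed, so a tool can fall back to an
older filtering method when the crashed kernel does not export a given value.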

Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-19 Thread Andrew Morton
On Wed, 19 Dec 2012 16:57:03 -0800
ebied...@xmission.com (Eric W. Biederman) wrote:

> Andrew Morton  writes:
> 
> > Is there any way in which we can move some of this logic into the
> > kernel?  In this case, add some kernel code which uses PageBuddy() on
> > behalf of makedumpfile, rather than replicating the PageBuddy() logic
> > in userspace?
> 
> All that exists when makedumpfile runs is a core file.  So it would have
> to be something like a shared library that builds with the kernel and
> then makedumpfile loads.

Can we omit free pages from that core file?

And/or add a section to that core file which flags free pages?


Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-19 Thread Eric W. Biederman
Andrew Morton  writes:

> On Mon, 10 Dec 2012 10:39:13 +0900
> Atsushi Kumagai  wrote:
>
>> This patch adds the values related to buddy system to vmcoreinfo data
>> so that makedumpfile (dump filtering command) can filter out all free
>> pages with the new logic.
>> It's faster than the current logic because it can distinguish free pages
>> by analyzing the page structure at the same time as filtering for other
>> unnecessary pages (e.g. anonymous pages).
>> OTOH, the current logic has to trace free_list to distinguish free
>> pages while analyzing the page structure to filter out other unnecessary
>> pages.
>> 
>> The new logic uses the fact that a buddy page is marked by _mapcount ==
>> PAGE_BUDDY_MAPCOUNT_VALUE. But _mapcount shares its memory with other
>> fields for SLAB/SLUB when PG_slab is set, so we need to check whether
>> PG_slab is set before looking up the _mapcount value.
>> And we can get the buddy order from the private field.
>> To sum up, the values below are required for this logic.
>> 
>> Required values:
>>   - OFFSET(page._mapcount)
>>   - OFFSET(page.private)
>>   - NUMBER(PG_slab)
>>   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
>> 
>> Changelog from v1 to v2:
>> 1. remove SIZE(pageflags)
>>   The new logic was changed after I sent the v1 patch.
>>   Accordingly, SIZE(pageflags) is no longer necessary for makedumpfile.
>> 
>> What's makedumpfile:
>>   makedumpfile creates a small dumpfile by excluding unnecessary pages
>>   for the analysis. To distinguish unnecessary pages, makedumpfile gets
>>   the vmcoreinfo data which has the minimum debugging information only
>>   for dump filtering.
>
> Gee, this info is getting highly dependent upon deep internal kernel
> behaviour.
>
>> index 5e4bd78..b27efe4 100644
>> --- a/kernel/kexec.c
>> +++ b/kernel/kexec.c
>> @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>>  VMCOREINFO_OFFSET(page, _count);
>>  VMCOREINFO_OFFSET(page, mapping);
>>  VMCOREINFO_OFFSET(page, lru);
>> +VMCOREINFO_OFFSET(page, _mapcount);
>> +VMCOREINFO_OFFSET(page, private);
>>  VMCOREINFO_OFFSET(pglist_data, node_zones);
>>  VMCOREINFO_OFFSET(pglist_data, nr_zones);
>>  #ifdef CONFIG_FLAT_NODE_MEM_MAP
>> @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>>  VMCOREINFO_NUMBER(PG_lru);
>>  VMCOREINFO_NUMBER(PG_private);
>>  VMCOREINFO_NUMBER(PG_swapcache);
>> +VMCOREINFO_NUMBER(PG_slab);
>> +VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
>
> We might change the PageBuddy() implementation at any time, and
> makedumpfile will break.  Or in this case, become less efficient.
>
> Is there any way in which we can move some of this logic into the
> kernel?  In this case, add some kernel code which uses PageBuddy() on
> behalf of makedumpfile, rather than replicating the PageBuddy() logic
> in userspace?

All that exists when makedumpfile runs is a core file.  So it would have
to be something like a shared library that builds with the kernel and
then makedumpfile loads.

Eric



Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-19 Thread Andrew Morton
On Mon, 10 Dec 2012 10:39:13 +0900
Atsushi Kumagai  wrote:

> This patch adds the values related to buddy system to vmcoreinfo data
> so that makedumpfile (dump filtering command) can filter out all free
> pages with the new logic.
> It's faster than the current logic because it can distinguish free pages
> by analyzing the page structure at the same time as filtering for other
> unnecessary pages (e.g. anonymous pages).
> OTOH, the current logic has to trace free_list to distinguish free
> pages while analyzing the page structure to filter out other unnecessary
> pages.
> 
> The new logic uses the fact that a buddy page is marked by _mapcount ==
> PAGE_BUDDY_MAPCOUNT_VALUE. But _mapcount shares its memory with other
> fields for SLAB/SLUB when PG_slab is set, so we need to check whether
> PG_slab is set before looking up the _mapcount value.
> And we can get the buddy order from the private field.
> To sum up, the values below are required for this logic.
> 
> Required values:
>   - OFFSET(page._mapcount)
>   - OFFSET(page.private)
>   - NUMBER(PG_slab)
>   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> 
> Changelog from v1 to v2:
> 1. remove SIZE(pageflags)
>   The new logic was changed after I sent the v1 patch.
>   Accordingly, SIZE(pageflags) is no longer necessary for makedumpfile.
> 
> What's makedumpfile:
>   makedumpfile creates a small dumpfile by excluding unnecessary pages
>   for the analysis. To distinguish unnecessary pages, makedumpfile gets
>   the vmcoreinfo data which has the minimum debugging information only
>   for dump filtering.

Gee, this info is getting highly dependent upon deep internal kernel
behaviour.

> index 5e4bd78..b27efe4 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>   VMCOREINFO_OFFSET(page, _count);
>   VMCOREINFO_OFFSET(page, mapping);
>   VMCOREINFO_OFFSET(page, lru);
> + VMCOREINFO_OFFSET(page, _mapcount);
> + VMCOREINFO_OFFSET(page, private);
>   VMCOREINFO_OFFSET(pglist_data, node_zones);
>   VMCOREINFO_OFFSET(pglist_data, nr_zones);
>  #ifdef CONFIG_FLAT_NODE_MEM_MAP
> @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>   VMCOREINFO_NUMBER(PG_lru);
>   VMCOREINFO_NUMBER(PG_private);
>   VMCOREINFO_NUMBER(PG_swapcache);
> + VMCOREINFO_NUMBER(PG_slab);
> + VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);

We might change the PageBuddy() implementation at any time, and
makedumpfile will break.  Or in this case, become less efficient.

Is there any way in which we can move some of this logic into the
kernel?  In this case, add some kernel code which uses PageBuddy() on
behalf of makedumpfile, rather than replicating the PageBuddy() logic
in userspace?


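Note on the check described in the quoted patch: the userspace replication of
PageBuddy() that this message worries about can be sketched in C roughly as
follows. This is a sketch under stated assumptions, not makedumpfile's actual
code. struct buddy_info and buddy_order are hypothetical names; the page.flags
offset is assumed to be available from vmcoreinfo as well; and the tool is
assumed to run with the same word size and byte order as the crashed kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values read from vmcoreinfo.  The struct itself is hypothetical. */
struct buddy_info {
    size_t off_flags;     /* OFFSET(page.flags), assumed available */
    size_t off_mapcount;  /* OFFSET(page._mapcount) */
    size_t off_private;   /* OFFSET(page.private) */
    unsigned pg_slab;     /* NUMBER(PG_slab): bit index in page.flags */
    int32_t buddy_value;  /* NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) */
};

/*
 * Given the raw bytes of one struct page, return the buddy order if the
 * page is marked as a free buddy page, or -1 otherwise.
 */
static long buddy_order(const unsigned char *page, const struct buddy_info *bi)
{
    unsigned long flags, private_val;
    int32_t mapcount;

    assert(page != NULL && bi != NULL);

    /* PG_slab must be tested first: SLAB/SLUB reuse the _mapcount bytes. */
    memcpy(&flags, page + bi->off_flags, sizeof(flags));
    if (flags & (1UL << bi->pg_slab))
        return -1;

    memcpy(&mapcount, page + bi->off_mapcount, sizeof(mapcount));
    if (mapcount != bi->buddy_value)
        return -1;

    /* For a buddy page, the private field holds the order of the block. */
    memcpy(&private_val, page + bi->off_private, sizeof(private_val));
    return (long)private_val;
}
```

Every constant here comes from the crashed kernel's vmcoreinfo, but the order
of the tests and the meaning of each field are hard-coded in the tool. That is
the dependency on internal kernel behaviour raised in this message: if
PageBuddy() changes, a check like this one silently stops matching.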

Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-19 Thread Atsushi Kumagai
Hello Andrew,

On Wed, 19 Dec 2012 16:18:56 -0800
Andrew Morton a...@linux-foundation.org wrote:

 On Mon, 10 Dec 2012 10:39:13 +0900
 Atsushi Kumagai kumagai-atsu...@mxc.nes.nec.co.jp wrote:
 
  This patch adds the values related to buddy system to vmcoreinfo data
  so that makedumpfile (dump filtering command) can filter out all free
  pages with the new logic.
  It's faster than the current logic because it can distinguish free page
  by analyzing page structure at the same time as filtering for other
  unnecessary pages (e.g. anonymous page).
  OTOH, the current logic has to trace free_list to distinguish free 
  pages while analyzing page structure to filter out other unnecessary
  pages.
  
  The new logic uses the fact that buddy page is marked by _mapcount == 
  PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
  fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
  is set or not before looking up _mapcount value.
  And we can get the order of buddy system from private field.
  To sum it up, the values below are required for this logic.
  
  Required values:
- OFFSET(page._mapcount)
- OFFSET(page.private)
- NUMBER(PG_slab)
- NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
  
  Changelog from v1 to v2:
  1. remove SIZE(pageflags)
The new logic was changed after I sent v1 patch.  
Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.
  
  What's makedumpfile:
makedumpfile creates a small dumpfile by excluding unnecessary pages
for the analysis. To distinguish unnecessary pages, makedumpfile gets
the vmcoreinfo data which has the minimum debugging information only
for dump filtering.
 
 Gee, this info is getting highly dependent upon deep internal kernel
 behaviour.

Yes. makedumpfile should be changed depend on kernel version and we did it.

  index 5e4bd78..b27efe4 100644
  --- a/kernel/kexec.c
  +++ b/kernel/kexec.c
  @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
  VMCOREINFO_OFFSET(page, _count);
  VMCOREINFO_OFFSET(page, mapping);
  VMCOREINFO_OFFSET(page, lru);
  +   VMCOREINFO_OFFSET(page, _mapcount);
  +   VMCOREINFO_OFFSET(page, private);
  VMCOREINFO_OFFSET(pglist_data, node_zones);
  VMCOREINFO_OFFSET(pglist_data, nr_zones);
   #ifdef CONFIG_FLAT_NODE_MEM_MAP
  @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
  VMCOREINFO_NUMBER(PG_lru);
  VMCOREINFO_NUMBER(PG_private);
  VMCOREINFO_NUMBER(PG_swapcache);
  +   VMCOREINFO_NUMBER(PG_slab);
  +   VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
 
 We might change the PageBuddy() implementation at any time, and
 makedumpfile will break.  Or in this case, become less efficient.
 
 Is there any way in which we can move some of this logic into the
 kernel?  In this case, add some kernel code which uses PageBuddy() on
 behalf of makedumpfile, rather than replicating the PageBuddy() logic
 in userspace?

Last month, Cliff Wickman proposed such an idea:

  [PATCH v2] makedumpfile: request the kernel do page scans
  http://lists.infradead.org/pipermail/kexec/2012-November/007318.html

  [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
  http://lists.infradead.org/pipermail/kexec/2012-November/007319.html

In his idea, the kernel does page scans to identify unnecessary pages
(free pages and others) and returns to makedumpfile the list of PFNs that
should be excluded.
As a result, makedumpfile doesn't need to consider internal kernel
behavior.

I think it's a good idea from the viewpoint of maintainability and
performance.


Thanks
Atsushi Kumagai
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-19 Thread Hatayama, Daisuke
 From: kexec-boun...@lists.infradead.org
 [mailto:kexec-boun...@lists.infradead.org] On Behalf Of Atsushi Kumagai
 Sent: Thursday, December 20, 2012 11:21 AM

 On Wed, 19 Dec 2012 16:18:56 -0800
 Andrew Morton a...@linux-foundation.org wrote:
 
  On Mon, 10 Dec 2012 10:39:13 +0900
  Atsushi Kumagai kumagai-atsu...@mxc.nes.nec.co.jp wrote:
 

 
  We might change the PageBuddy() implementation at any time, and
  makedumpfile will break.  Or in this case, become less efficient.
 
  Is there any way in which we can move some of this logic into the
  kernel?  In this case, add some kernel code which uses PageBuddy() on
  behalf of makedumpfile, rather than replicating the PageBuddy() logic
  in userspace?
 
 Last month, Cliff Wickman proposed such an idea:
 
   [PATCH v2] makedumpfile: request the kernel do page scans
   http://lists.infradead.org/pipermail/kexec/2012-November/007318.html
 
   [PATCH] scan page tables for makedumpfile, 3.0.13 kernel
   http://lists.infradead.org/pipermail/kexec/2012-November/007319.html
 
 In his idea, the kernel does page scans to identify unnecessary pages
 (free pages and others) and returns to makedumpfile the list of PFNs that
 should be excluded.
 As a result, makedumpfile doesn't need to consider internal kernel
 behavior.
 
 I think it's a good idea from the viewpoint of maintainability and
 performance.

I also think a large part of his code can be reused in this work. But the bad
performance is caused by a lot of ioremap calls, not by a lot of copying. See the
profiling result I posted some days ago. The two issues, ioremap and filtering
maintainability, should be considered separately. Even for the ioremap issue,
there is a secondary point to consider: memory consumption on the 2nd kernel.

Also, I have one question. Can we always assume that the 1st and 2nd kernels are
the same? If I understand correctly, kexec/kdump can use a 2nd kernel different
from the 1st. So, different kernels would need to do the same thing as makedumpfile
does. If we assume the two are the same, the problem is much simplified.

Thanks.
HATAYAMA, Daisuke



Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-17 Thread Vivek Goyal
CCing Andrew Morton. I think he picks kexec patches.



On Mon, Dec 10, 2012 at 08:17:05AM -0500, Vivek Goyal wrote:
> On Mon, Dec 10, 2012 at 10:39:13AM +0900, Atsushi Kumagai wrote:
> > This patch adds the values related to buddy system to vmcoreinfo data
> > so that makedumpfile (dump filtering command) can filter out all free
> > pages with the new logic.
> > It's faster than the current logic because it can distinguish free page
> > by analyzing page structure at the same time as filtering for other
> > unnecessary pages (e.g. anonymous page).
> > OTOH, the current logic has to trace free_list to distinguish free 
> > pages while analyzing page structure to filter out other unnecessary
> > pages.
> > 
> > The new logic uses the fact that buddy page is marked by _mapcount == 
> > PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
> > fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
> > is set or not before looking up _mapcount value.
> > And we can get the order of buddy system from private field.
> > To sum it up, the values below are required for this logic.
> > 
> > Required values:
> >   - OFFSET(page._mapcount)
> >   - OFFSET(page.private)
> >   - NUMBER(PG_slab)
> >   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> > 
> > Changelog from v1 to v2:
> > 1. remove SIZE(pageflags)
> >   The new logic was changed after I sent v1 patch.  
> >   Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.
> > 
> > What's makedumpfile:
> >   makedumpfile creates a small dumpfile by excluding unnecessary pages
> >   for the analysis. To distinguish unnecessary pages, makedumpfile gets
> >   the vmcoreinfo data which has the minimum debugging information only
> >   for dump filtering.
> > 
> > Signed-off-by: Atsushi Kumagai 
> 
> Looks good to me.
> 
> Acked-by: Vivek Goyal 
> 
> Thanks
> Vivek
> 
> > ---
> >  kernel/kexec.c |4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/kernel/kexec.c b/kernel/kexec.c
> > index 5e4bd78..b27efe4 100644
> > --- a/kernel/kexec.c
> > +++ b/kernel/kexec.c
> > @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
> > VMCOREINFO_OFFSET(page, _count);
> > VMCOREINFO_OFFSET(page, mapping);
> > VMCOREINFO_OFFSET(page, lru);
> > +   VMCOREINFO_OFFSET(page, _mapcount);
> > +   VMCOREINFO_OFFSET(page, private);
> > VMCOREINFO_OFFSET(pglist_data, node_zones);
> > VMCOREINFO_OFFSET(pglist_data, nr_zones);
> >  #ifdef CONFIG_FLAT_NODE_MEM_MAP
> > @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
> > VMCOREINFO_NUMBER(PG_lru);
> > VMCOREINFO_NUMBER(PG_private);
> > VMCOREINFO_NUMBER(PG_swapcache);
> > +   VMCOREINFO_NUMBER(PG_slab);
> > +   VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
> > 
> > arch_crash_save_vmcoreinfo();
> > update_vmcoreinfo_note();
> > --
> > 1.7.9.2
> > 
> > ___
> > kexec mailing list
> > ke...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-10 Thread Vivek Goyal
On Mon, Dec 10, 2012 at 10:39:13AM +0900, Atsushi Kumagai wrote:
> This patch adds the values related to buddy system to vmcoreinfo data
> so that makedumpfile (dump filtering command) can filter out all free
> pages with the new logic.
> It's faster than the current logic because it can distinguish free page
> by analyzing page structure at the same time as filtering for other
> unnecessary pages (e.g. anonymous page).
> OTOH, the current logic has to trace free_list to distinguish free 
> pages while analyzing page structure to filter out other unnecessary
> pages.
> 
> The new logic uses the fact that buddy page is marked by _mapcount == 
> PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other
> fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
> is set or not before looking up _mapcount value.
> And we can get the order of buddy system from private field.
> To sum it up, the values below are required for this logic.
> 
> Required values:
>   - OFFSET(page._mapcount)
>   - OFFSET(page.private)
>   - NUMBER(PG_slab)
>   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> 
> Changelog from v1 to v2:
> 1. remove SIZE(pageflags)
>   The new logic was changed after I sent v1 patch.  
>   Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.
> 
> What's makedumpfile:
>   makedumpfile creates a small dumpfile by excluding unnecessary pages
>   for the analysis. To distinguish unnecessary pages, makedumpfile gets
>   the vmcoreinfo data which has the minimum debugging information only
>   for dump filtering.
> 
> Signed-off-by: Atsushi Kumagai <kumagai-atsu...@mxc.nes.nec.co.jp>

Looks good to me.

Acked-by: Vivek Goyal <vgo...@redhat.com>

Thanks
Vivek

> ---
>  kernel/kexec.c |4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 5e4bd78..b27efe4 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>   VMCOREINFO_OFFSET(page, _count);
>   VMCOREINFO_OFFSET(page, mapping);
>   VMCOREINFO_OFFSET(page, lru);
> + VMCOREINFO_OFFSET(page, _mapcount);
> + VMCOREINFO_OFFSET(page, private);
>   VMCOREINFO_OFFSET(pglist_data, node_zones);
>   VMCOREINFO_OFFSET(pglist_data, nr_zones);
>  #ifdef CONFIG_FLAT_NODE_MEM_MAP
> @@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>   VMCOREINFO_NUMBER(PG_lru);
>   VMCOREINFO_NUMBER(PG_private);
>   VMCOREINFO_NUMBER(PG_swapcache);
> + VMCOREINFO_NUMBER(PG_slab);
> + VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
> 
>   arch_crash_save_vmcoreinfo();
>   update_vmcoreinfo_note();
> --
> 1.7.9.2


[PATCH v2] Add the values related to buddy system for filtering free pages.

2012-12-09 Thread Atsushi Kumagai
This patch adds the values related to the buddy system to the vmcoreinfo data
so that makedumpfile (the dump filtering command) can filter out all free
pages with the new logic.
The new logic is faster than the current one because it can identify free pages
by analyzing the page structure at the same time as it filters out other
unnecessary pages (e.g. anonymous pages).
OTOH, the current logic has to trace free_list to find free pages, in
addition to analyzing the page structure to filter out other unnecessary
pages.

The new logic uses the fact that a buddy page is marked by _mapcount ==
PAGE_BUDDY_MAPCOUNT_VALUE. However, _mapcount shares its memory with other
fields for SLAB/SLUB when PG_slab is set, so we need to check whether PG_slab
is set before looking at the _mapcount value.
The order of the buddy block can be obtained from the private field.
To sum up, the values below are required for this logic.

Required values:
  - OFFSET(page._mapcount)
  - OFFSET(page.private)
  - NUMBER(PG_slab)
  - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
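
The check described above can be sketched as follows. This is only an
illustration under stated assumptions, not makedumpfile's actual code: the
struct buddy_info and buddy_free_pages names are hypothetical, the page image
is assumed to have the same word size and byte order as the host, _mapcount is
assumed to be a 32-bit integer, private an unsigned long, and OFFSET(page.flags)
is assumed to be available from vmcoreinfo as well.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for values that would be read from vmcoreinfo. */
struct buddy_info {
	unsigned long off_flags;    /* OFFSET(page.flags), assumed available */
	unsigned long off_mapcount; /* OFFSET(page._mapcount) */
	unsigned long off_private;  /* OFFSET(page.private) */
	long pg_slab;               /* NUMBER(PG_slab): bit index in flags */
	long buddy_mapcount;        /* NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) */
};

/*
 * Given the raw bytes of one struct page, return the number of free pages
 * in the buddy block that starts at this page (1 << order), or 0 if the
 * page is not marked as a buddy page.
 */
static unsigned long buddy_free_pages(const unsigned char *page,
				      const struct buddy_info *bi)
{
	unsigned long flags, order;
	int32_t mapcount;

	assert(page != NULL && bi != NULL);

	/* _mapcount is overlaid by SLAB/SLUB fields when PG_slab is set. */
	memcpy(&flags, page + bi->off_flags, sizeof(flags));
	if (flags & (1UL << bi->pg_slab))
		return 0;

	memcpy(&mapcount, page + bi->off_mapcount, sizeof(mapcount));
	if (mapcount != bi->buddy_mapcount)
		return 0;

	/* For a buddy page, the private field holds the order. */
	memcpy(&order, page + bi->off_private, sizeof(order));
	if (order >= 8 * sizeof(unsigned long))
		return 0; /* implausible order: treat as not free */
	return 1UL << order;
}
```

A caller would run this on every struct page it already reads for the other
filters and skip the returned number of pages, so no separate walk of
free_list is needed.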

Changelog from v1 to v2:
1. remove SIZE(pageflags)
  The new logic was changed after I sent the v1 patch.
  Accordingly, SIZE(pageflags) is no longer necessary for makedumpfile.

What's makedumpfile:
  makedumpfile creates a small dumpfile by excluding pages that are
  unnecessary for the analysis. To identify unnecessary pages, makedumpfile
  reads the vmcoreinfo data, which contains only the minimum debugging
  information needed for dump filtering.
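
As an illustration of how a consumer might read these values: the
VMCOREINFO_OFFSET and VMCOREINFO_NUMBER macros emit text lines of the form
OFFSET(page._mapcount)=<n> and NUMBER(PG_slab)=<n>. The lookup helper below is
a minimal sketch, and vmcoreinfo_get_long is a hypothetical name rather than a
makedumpfile function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up "key=value" in a vmcoreinfo text buffer, for example the key
 * "NUMBER(PG_slab)". Returns 1 and stores the value in *out on success,
 * or 0 if the key is absent (e.g. the kernel does not export it).
 */
static int vmcoreinfo_get_long(const char *info, const char *key, long *out)
{
	size_t klen;
	const char *line = info;

	assert(info != NULL && key != NULL && out != NULL);
	klen = strlen(key);

	while (line && *line) {
		/* A match is the key at the start of a line, followed by '='. */
		if (strncmp(line, key, klen) == 0 && line[klen] == '=')
			return sscanf(line + klen + 1, "%ld", out) == 1;
		line = strchr(line, '\n');
		if (line)
			line++; /* step past the newline to the next line */
	}
	return 0;
}
```

Returning 0 for a missing key lets the caller fall back to the free_list based
logic when the dump comes from a kernel that does not provide these values.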

Signed-off-by: Atsushi Kumagai <kumagai-atsu...@mxc.nes.nec.co.jp>
---
 kernel/kexec.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index 5e4bd78..b27efe4 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1490,6 +1490,8 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_OFFSET(page, _count);
 	VMCOREINFO_OFFSET(page, mapping);
 	VMCOREINFO_OFFSET(page, lru);
+	VMCOREINFO_OFFSET(page, _mapcount);
+	VMCOREINFO_OFFSET(page, private);
 	VMCOREINFO_OFFSET(pglist_data, node_zones);
 	VMCOREINFO_OFFSET(pglist_data, nr_zones);
 #ifdef CONFIG_FLAT_NODE_MEM_MAP
@@ -1512,6 +1514,8 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_NUMBER(PG_lru);
 	VMCOREINFO_NUMBER(PG_private);
 	VMCOREINFO_NUMBER(PG_swapcache);
+	VMCOREINFO_NUMBER(PG_slab);
+	VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
 
 	arch_crash_save_vmcoreinfo();
 	update_vmcoreinfo_note();
--
1.7.9.2