Re: [PATCH v3 12/21] vmcore: allocate per-cpu crash_notes objects on page-size boundary
Vivek Goyal writes:

> On Tue, Mar 19, 2013 at 03:12:10PM -0700, Eric W. Biederman wrote:
>> HATAYAMA Daisuke writes:
>>
>> > To satisfy mmap()'s page-size boundary requirement, allocate per-cpu
>> > crash_notes objects on page-size boundary.
>> >
>> > /proc/vmcore on the 2nd kernel checks if each note object is
>> > allocated on a page-size boundary. If there's some object not
>> > satisfying the page-size boundary requirement, /proc/vmcore doesn't
>> > provide the mmap() interface.
>>
>> On second look this requirement does not exist. These all get copied
>> into the unified PT_NOTE segment, so this patch is pointless and wrong.
>
> Hi Eric,
>
> They get copied only temporarily and then we free them. Actual reading
> of note data still goes to the old kernel's memory.
>
> We copy them temporarily to figure out how much data is there in the
> note and update the size of the single PT_NOTE header accordingly. We
> also add a new element to vmcore_list so that the next time a read
> happens on an offset which falls into this particular note, we go and
> read it from old memory.
>
> merge_note_headers_elf64() {
>         notes_section = kmalloc(max_sz, GFP_KERNEL);
>         rc = read_from_oldmem(notes_section, max_sz, &offset, 0);
>         /* parse the note, update relevant data structures */
>         kfree(notes_section);
> }
>
> And that's why we have the problem. The note buffers are physically
> present in the old kernel's memory, but /proc/vmcore exports them as a
> contiguous view. So we don't even have the option of mapping extra
> bytes (there is no space for mapping extra bytes).
>
> So there seem to be a few options:
>
> - Do not merge headers. Keep one separate PT_NOTE header for each note
>   and then map extra aligned bytes. But that's kind of different and
>   gdb does not expect it. All elf_prstatus entries are supposed to be
>   in a single PT_NOTE header.
>
> - Copy all notes to the second kernel's memory.
>
> - Align notes in the first kernel to a page boundary and pad them. I
>   had assumed that we are already allocating close to 4K of memory in
>   the first kernel, but it looks like that's not the case. So agreed,
>   that would be quite wasteful of memory.
>
> In fact we are not exporting the size of the note to user space, and
> kexec-tools seems to be assuming a MAX_NOTE_BYTES of 1024, which seems
> pretty horrible. Anyway, that's a different issue. We should also
> export the size of the memory reserved for elf notes.
>
> Then how about option 2? That is, copy all notes into the new kernel's
> memory. Hatayama had initially implemented that approach and I
> suggested padding notes in the first kernel to a 4K page-size boundary
> (in an attempt to reduce the second kernel's memory usage). But it
> sounds like a per-cpu elf note is much smaller than 4K, so rounding
> these up to 4K sounds much more wasteful of memory.
>
> Would you like option 2 here, where we copy notes into the new
> kernel's memory as one contiguous buffer and go from there?

Blink. I had missed the fact that we were not keeping the merged copy of
the elf notes. That does seem silly after we go to all of the work to
create it.

The two practical options I see are:

1) Don't allow mmap of the new headers and the merged notes.

2) Copy the notes into a merged note section in the second kernel.

I have no problem with either option. In practice I expect copying the
notes will scale better, as that terribly long linked list won't need to
hold all of the previous note sections, but there are other ways to get
that speedup.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 12/21] vmcore: allocate per-cpu crash_notes objects on page-size boundary
On Tue, Mar 19, 2013 at 03:12:10PM -0700, Eric W. Biederman wrote:
> HATAYAMA Daisuke writes:
>
> > To satisfy mmap()'s page-size boundary requirement, allocate per-cpu
> > crash_notes objects on page-size boundary.
> >
> > /proc/vmcore on the 2nd kernel checks if each note object is
> > allocated on a page-size boundary. If there's some object not
> > satisfying the page-size boundary requirement, /proc/vmcore doesn't
> > provide the mmap() interface.
>
> On second look this requirement does not exist. These all get copied
> into the unified PT_NOTE segment, so this patch is pointless and wrong.

Hi Eric,

They get copied only temporarily and then we free them. Actual reading
of note data still goes to the old kernel's memory.

We copy them temporarily to figure out how much data is there in the
note and update the size of the single PT_NOTE header accordingly. We
also add a new element to vmcore_list so that the next time a read
happens on an offset which falls into this particular note, we go and
read it from old memory.

merge_note_headers_elf64() {
        notes_section = kmalloc(max_sz, GFP_KERNEL);
        rc = read_from_oldmem(notes_section, max_sz, &offset, 0);
        /* parse the note, update relevant data structures */
        kfree(notes_section);
}

And that's why we have the problem. The note buffers are physically
present in the old kernel's memory, but /proc/vmcore exports them as a
contiguous view. So we don't even have the option of mapping extra
bytes (there is no space for mapping extra bytes).

So there seem to be a few options:

- Do not merge headers. Keep one separate PT_NOTE header for each note
  and then map extra aligned bytes. But that's kind of different and
  gdb does not expect it. All elf_prstatus entries are supposed to be
  in a single PT_NOTE header.

- Copy all notes to the second kernel's memory.

- Align notes in the first kernel to a page boundary and pad them. I
  had assumed that we are already allocating close to 4K of memory in
  the first kernel, but it looks like that's not the case. So agreed,
  that would be quite wasteful of memory.

In fact we are not exporting the size of the note to user space, and
kexec-tools seems to be assuming a MAX_NOTE_BYTES of 1024, which seems
pretty horrible. Anyway, that's a different issue. We should also
export the size of the memory reserved for elf notes.

Then how about option 2? That is, copy all notes into the new kernel's
memory. Hatayama had initially implemented that approach and I
suggested padding notes in the first kernel to a 4K page-size boundary
(in an attempt to reduce the second kernel's memory usage). But it
sounds like a per-cpu elf note is much smaller than 4K, so rounding
these up to 4K sounds much more wasteful of memory.

Would you like option 2 here, where we copy notes into the new kernel's
memory as one contiguous buffer and go from there?

Thanks
Vivek
Re: [PATCH v3 12/21] vmcore: allocate per-cpu crash_notes objects on page-size boundary
HATAYAMA Daisuke writes:

> To satisfy mmap()'s page-size boundary requirement, allocate per-cpu
> crash_notes objects on page-size boundary.
>
> /proc/vmcore on the 2nd kernel checks if each note object is
> allocated on a page-size boundary. If there's some object not
> satisfying the page-size boundary requirement, /proc/vmcore doesn't
> provide the mmap() interface.

On second look this requirement does not exist. These all get copied
into the unified PT_NOTE segment, so this patch is pointless and wrong.

I see no evidence of any need to change anything in the first kernel,
and this noticeably increases the amount of memory used in the first
kernel and in the second kernel for no gain.

Eric

> Signed-off-by: HATAYAMA Daisuke
> ---
>
>  kernel/kexec.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index bddd3d7..d1f365e 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1234,7 +1234,8 @@ void crash_save_cpu(struct pt_regs *regs, int cpu)
>  static int __init crash_notes_memory_init(void)
>  {
>  	/* Allocate memory for saving cpu registers. */
> -	crash_notes = alloc_percpu(note_buf_t);
> +	crash_notes = __alloc_percpu(roundup(sizeof(note_buf_t), PAGE_SIZE),
> +				     PAGE_SIZE);
>  	if (!crash_notes) {
>  		printk("Kexec: Memory allocation for saving cpu register"
>  		       " states failed\n");
Re: [PATCH v3 12/21] vmcore: allocate per-cpu crash_notes objects on page-size boundary
HATAYAMA Daisuke writes:

> To satisfy mmap()'s page-size boundary requirement, allocate per-cpu
> crash_notes objects on page-size boundary.
>
> /proc/vmcore on the 2nd kernel checks if each note object is
> allocated on a page-size boundary. If there's some object not
> satisfying the page-size boundary requirement, /proc/vmcore doesn't
> provide the mmap() interface.

Does this actually help? My memory is that /proc/vmcore did some magic
behind the scenes to combine these multiple note sections into a single
note section. Certainly someone has to combine them to make a valid elf
executable.

At the same time I don't see any harm in rounding up to a page size
here, but I don't see the point either.

Eric

> Signed-off-by: HATAYAMA Daisuke
> ---
>
>  kernel/kexec.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index bddd3d7..d1f365e 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1234,7 +1234,8 @@ void crash_save_cpu(struct pt_regs *regs, int cpu)
>  static int __init crash_notes_memory_init(void)
>  {
>  	/* Allocate memory for saving cpu registers. */
> -	crash_notes = alloc_percpu(note_buf_t);
> +	crash_notes = __alloc_percpu(roundup(sizeof(note_buf_t), PAGE_SIZE),
> +				     PAGE_SIZE);
>  	if (!crash_notes) {
>  		printk("Kexec: Memory allocation for saving cpu register"
>  		       " states failed\n");