Re: [PATCH] android: lmk: add swap pte pmd in tasksize
> On Mar 11, 2016, at 15:23, Lu Bing wrote:
>
> From: l00215322
>
> Many android devices have zram, so we should add "MM_SWAPENTS" in tasksize.
> Refer oom_kill.c, we add pte also.
>
> Reviewed-by: Chen Feng
> Reviewed-by: Fu Jun
> Reviewed-by: Xu YiPing
> Reviewed-by: Yu DongBin
> Signed-off-by: Lu Bing
> ---
>  drivers/staging/android/lowmemorykiller.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
> index 8b5a4a8..0817d3b 100644
> --- a/drivers/staging/android/lowmemorykiller.c
> +++ b/drivers/staging/android/lowmemorykiller.c
> @@ -139,7 +139,9 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
>  			task_unlock(p);
>  			continue;
>  		}
> -		tasksize = get_mm_rss(p->mm);
> +		tasksize = get_mm_rss(p->mm) +
> +			get_mm_counter(p->mm, MM_SWAPENTS) +
> +			atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);

Why not introduce an mm_nr_ptes() helper function here? It would be clearer to read.

>  		task_unlock(p);
>  		if (tasksize <= 0)
>  			continue;
> --
> 1.8.3.2
>
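The helper suggested in the reply could look roughly like the userspace sketch below. It is not kernel code: `struct mm_model`, its fields and `lmk_tasksize()` are hypothetical stand-ins, and C11 atomics stand in for the kernel's atomic_long_t. Only the name mm_nr_ptes() comes from the thread.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the few mm_struct values used here. */
struct mm_model {
	long rss;             /* models get_mm_rss(mm) */
	long swapents;        /* models get_mm_counter(mm, MM_SWAPENTS) */
	atomic_long nr_ptes;  /* models mm->nr_ptes */
	long nr_pmds;         /* models mm_nr_pmds(mm) */
};

/* The helper proposed in the reply: hide the atomic read behind a name. */
static inline long mm_nr_ptes(struct mm_model *mm)
{
	return atomic_load(&mm->nr_ptes);
}

/* Task size as computed by the patch: RSS + swap entries + page tables. */
static long lmk_tasksize(struct mm_model *mm)
{
	return mm->rss + mm->swapents + mm_nr_ptes(mm) + mm->nr_pmds;
}
```

With such a helper the tasksize expression reads the same way for PTE and PMD counts, which seems to be the point of the suggestion.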
[RFC] arm: change to use generic sign_extend32() function
Change to use the generic sign_extend32() to calculate branch_displacement.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 arch/arm/probes/decode-arm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm/probes/decode-arm.c b/arch/arm/probes/decode-arm.c
index f72c33a..ff794c0 100644
--- a/arch/arm/probes/decode-arm.c
+++ b/arch/arm/probes/decode-arm.c
@@ -20,13 +20,12 @@
 #include
 #include
 #include
+#include
 
 #include "decode.h"
 #include "decode-arm.h"
 
-#define sign_extend(x, signbit) ((x) | (0 - ((x) & (1 << (signbit)))))
-
-#define branch_displacement(insn) sign_extend(((insn) & 0xff) << 2, 25)
+#define branch_displacement(insn) sign_extend32(((insn) & 0xff) << 2, 25)
 
 /*
  * To avoid the complications of mimicing single-stepping on a
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
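To see that the generic helper and the removed macro agree, here is a small userspace sketch. `sign_extend32()` below is a local reimplementation of the same idea, not the kernel header, and `old_sign_extend()` is the removed macro rewritten as a function. The sketch masks the 24-bit immediate of an ARM B/BL instruction (0xffffff), which is an assumption and differs from the mask shown in the diff above.

```c
#include <stdint.h>

/* Local model of sign_extend32(): 'index' is the 0-based sign bit position.
 * Relies on arithmetic right shift of negative values (gcc and clang). */
static inline int32_t sign_extend32(uint32_t value, int index)
{
	uint8_t shift = 31 - index;

	return (int32_t)(value << shift) >> shift;
}

/* The open-coded macro removed by the patch, as a function on uint32_t. */
static inline int32_t old_sign_extend(uint32_t x, int signbit)
{
	return (int32_t)(x | (0u - (x & (1u << signbit))));
}

/* Assumption: imm24 field of an ARM B/BL instruction, scaled by 4. */
static inline int32_t branch_displacement(uint32_t insn)
{
	return sign_extend32((insn & 0xffffffu) << 2, 25);
}
```

Both variants copy bit 25 into all higher bits, so a backward branch produces a negative displacement either way.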
Re: [RFC] mm: change find_vma() function
> On Dec 15, 2015, at 19:53, Kirill A. Shutemov <kirill.shute...@linux.intel.com> wrote:
>
> On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
>>> On Dec 15, 2015, at 05:11, Kirill A. Shutemov <kir...@shutemov.name> wrote:
>>> Anyway, I don't think it's possible to gain anything measurable from this
>>> optimization.
>>>
>> the advantage is that if addr don’t belong to any vma, we don’t need loop all vma,
>> we can break earlier if we found the most closest vma which vma->end_add > addr,
>
> Do you have any workload which can demonstrate the advantage?
>

I added a log in find_vma() to see the call stacks. The early break is effective in the mmap() / munmap() / do_execve() / get_unmapped_area() / mem_cgroup_move_task()->walk_page_range()->find_vma() call paths; most of the time the loop breaks after visiting about 7 vmas.

I did not consider the cache pollution problem in this patch. Yes, this patch checks vm_prev->vm_end on every iteration, but only when tmp->vm_end > addr. Without this check, the loop continues to the next rb node, which also pollutes the cache. So the question is which one is better.

I don't have a better method to test this. Any good ideas on how to test it?

Thanks
Re: [RFC] mm: change find_vma() function
> On Dec 15, 2015, at 05:11, Kirill A. Shutemov <kir...@shutemov.name> wrote:
>
> On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
>> On 12/14, Kirill A. Shutemov wrote:
>>>
>>> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
>>>> change find_vma() to break earlier when found the address
>>>> is not in any vma, don't need loop to search all vma.
>>>>
>>>> Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
>>>> ---
>>>>  mm/mmap.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/mm/mmap.c b/mm/mmap.c
>>>> index b513f20..8294c9b 100644
>>>> --- a/mm/mmap.c
>>>> +++ b/mm/mmap.c
>>>> @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
>>>>  			vma = tmp;
>>>>  			if (tmp->vm_start <= addr)
>>>>  				break;
>>>> +			if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
>>>> +				break;
>>>> +
>>>
>>> This 'break' would return 'tmp' as found vma.
>>
>> But this would be right?
>
> Hm. Right. Sorry for my tone.
>
> I think the right condition is 'tmp->vm_prev->vm_end < addr', not '<=' as
> vm_end is the first byte after the vma. But it's equivalent in practice
> here.
>

This should be <= here, because a vma's effective address range does not include the vm_end address. So if vm_end <= addr, addr does not belong to that vma.

> Anyway, I don't think it's possible to gain anything measurable from this
> optimization.
>

The advantage is that if addr doesn't belong to any vma, we don't need to loop over all vmas; we can break earlier once we have found the closest vma with vm_end > addr.

>>
>> Not that I think this optimization makes sense, I simply do not know,
>> but to me this change looks technically correct at first glance...
>>
>> But the changelog is wrong or I missed something. This change can stop
>> the main loop earlier; if "tmp" is the first vma,
>
> For the first vma, we don't get anything comparing to what we have now:
> check for !rb_node on the next iteration would have the same trade off and
> effect as the proposed check.

Yes

>
>> or if the previous one is below the address.
>
> Yes, but would it compensate additional check on each 'tmp->vm_end > addr'
> iteration to the point? That's not obvious.
>
>> Or perhaps I just misread that "not in any vma" note in the changelog.
>>
>> No?
>>
>> Oleg.
>>

I have tested it, and it works fine. :)

Thanks
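The '<' versus '<=' point above comes down to a vma covering the half-open range [vm_start, vm_end). A tiny sketch of that rule follows; `struct vma_range` and both helper names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical, minimal stand-in for the two vm_area_struct fields used. */
struct vma_range {
	unsigned long vm_start; /* first byte inside the vma */
	unsigned long vm_end;   /* first byte after the vma */
};

/* A vma covers the half-open range [vm_start, vm_end). */
static bool vma_contains(const struct vma_range *vma, unsigned long addr)
{
	return vma->vm_start <= addr && addr < vma->vm_end;
}

/* The patch's test on the previous vma: it ends at or below addr. */
static bool prev_is_below(const struct vma_range *prev, unsigned long addr)
{
	return prev->vm_end <= addr;
}
```

An address equal to vm_end is already outside the vma, so '<=' also catches the case where addr sits exactly at the end of the previous vma.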
[RFC] mm: change find_vma() function
Change find_vma() to break earlier when the address is found not to be in any vma, so that there is no need to keep looping through the remaining vmas.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 mm/mmap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index b513f20..8294c9b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 			vma = tmp;
 			if (tmp->vm_start <= addr)
 				break;
+			if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
+				break;
+
 			rb_node = rb_node->rb_left;
 		} else
 			rb_node = rb_node->rb_right;
-- 
1.9.1
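The effect of the extra check can be modelled in userspace. In the sketch below a plain binary search tree stands in for the kernel rb-tree, and `struct vma`, `build()` and `find_vma_model()` are hypothetical names; only the loop body follows the patch. It is meant to show that the early break returns the same vma while visiting fewer nodes, not to measure real cost.

```c
#include <stddef.h>

/* Hypothetical model: vm_prev links vmas in ascending address order. */
struct vma {
	unsigned long vm_start, vm_end;   /* half-open range [vm_start, vm_end) */
	struct vma *vm_prev;
	struct vma *left, *right;
};

/* Build a balanced tree over the sorted array v[lo..hi), return its root. */
static struct vma *build(struct vma *v, int lo, int hi)
{
	if (lo >= hi)
		return NULL;
	int mid = lo + (hi - lo) / 2;
	v[mid].vm_prev = mid > 0 ? &v[mid - 1] : NULL;
	v[mid].left = build(v, lo, mid);
	v[mid].right = build(v, mid + 1, hi);
	return &v[mid];
}

/* Return the first vma with vm_end > addr, or NULL. With early_break set,
 * apply the extra test from the patch. *steps counts visited nodes. */
static struct vma *find_vma_model(struct vma *root, unsigned long addr,
				  int early_break, int *steps)
{
	struct vma *vma = NULL, *tmp = root;

	*steps = 0;
	while (tmp) {
		(*steps)++;
		if (tmp->vm_end > addr) {
			vma = tmp;
			if (tmp->vm_start <= addr)
				break;
			if (early_break &&
			    (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr))
				break;
			tmp = tmp->left;
		} else {
			tmp = tmp->right;
		}
	}
	return vma;
}
```

The saving only shows up for addresses that fall in a gap between two vmas; for addresses inside a vma both variants stop at the same node.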
Re: [PATCH v2] clear file privilege bits when mmap writing
> On Dec 2, 2015, at 16:03, Kees Cook wrote:
>
> Normally, when a user can modify a file that has setuid or setgid bits,
> those bits are cleared when they are not the file owner or a member
> of the group. This is enforced when using write and truncate but not
> when writing to a shared mmap on the file. This could allow the file
> writer to gain privileges by changing a binary without losing the
> setuid/setgid/caps bits.
>
> Changing the bits requires holding inode->i_mutex, so it cannot be done
> during the page fault (due to mmap_sem being held during the fault).
> Instead, clear the bits if PROT_WRITE is being used at mmap time.
>
> Signed-off-by: Kees Cook
> Cc: sta...@vger.kernel.org
> ---

Does this mean the mprotect() syscall also needs this check? mprotect() can change a mapping to PROT_WRITE, so a mapping that was created read-only can become writable afterwards. That would also be a security hole here.

Thanks
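The sequence the question refers to can be reproduced from userspace. The sketch below only demonstrates that a shared mapping created with PROT_READ can later be made writable with mprotect() when the file descriptor was opened read-write; it does not touch setuid bits, and the function name, temp-file path and 4096-byte length are arbitrary choices for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a writable file read-only and shared, then upgrade the mapping with
 * mprotect() and store through it. Returns 0 if the store went through,
 * -1 on any failure. */
static int write_after_mprotect(void)
{
	char path[] = "/tmp/mprotect-demo-XXXXXX";
	int fd = mkstemp(path);          /* mkstemp() opens the file O_RDWR */
	int ret = -1;

	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, 4096) == 0) {
		/* No PROT_WRITE here, so a check at mmap time sees nothing. */
		char *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);

		if (p != MAP_FAILED) {
			if (mprotect(p, 4096, PROT_READ | PROT_WRITE) == 0) {
				p[0] = 'x';  /* store through the shared mapping */
				ret = (p[0] == 'x') ? 0 : -1;
			}
			munmap(p, 4096);
		}
	}
	close(fd);
	return ret;
}
```

If a check is tied only to PROT_WRITE at mmap() time, this path never passes through it, which is the gap the question points at.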
Re: [PATCH 1/2] mm, printk: introduce new format string for flags
>> Technically, I think the answer is yes, at least in C99 (and I suppose
>> gcc would accept it in gnu89 mode as well).
>>
>> printk("%pg\n", &(struct flag_printer){.flags = my_flags, .names = vmaflags_names});
>>
>> Not tested, and I still don't think it would be particularly readable
>> even when macroized
>>
>> printk("%pg\n", PRINTF_VMAFLAGS(my_flags));
> I tested this with gcc 4.9.3 and the method works,
> so the final solution looks like this:
> printk.h:
> struct flag_fmt_spec {
> 	unsigned long flag;
> 	struct trace_print_flags *flags;
> 	int array_size;
> 	char delimiter;
> };
>
> #define FLAG_FORMAT(val, names, delim) \
> 	(&(struct flag_fmt_spec){ .flag = (val), .flags = (names), \
> 		.array_size = ARRAY_SIZE(names), .delimiter = (delim) })
> #define VMA_FLAG_FORMAT(flag) FLAG_FORMAT(flag, vmaflags_names, '|')

A small change:

#define VMA_FLAG_FORMAT(vma) FLAG_FORMAT((vma)->vm_flags, vmaflags_names, '|')

> source code:
> printk("%pg\n", VMA_FLAG_FORMAT(my_flags));

A small change:

printk("%pg\n", VMA_FLAG_FORMAT(vma));

>
> That's all. See the cpumask_pr_args(masks) macro,
> which also uses a macro together with %*pb to print a cpu mask.
> I think this method is not very complex to use.
>
> Searching the source code,
> there are lots of printk calls that print flags as hex numbers:
> $ grep -n -r 'printk.*flag.*%x' .
> It would be great if this flag string printing were generic.
>
> Thanks
Re: [PATCH 1/2] mm, printk: introduce new format string for flags
> On Dec 3, 2015, at 00:03, Rasmus Villemoes <li...@rasmusvillemoes.dk> wrote:
>
> On Thu, Dec 03 2015, yalin wang <yalin.wang2...@gmail.com> wrote:
>
>>> On Dec 2, 2015, at 13:04, Vlastimil Babka <vba...@suse.cz> wrote:
>>>
>>> On 12/02/2015 06:40 PM, yalin wang wrote:
>>>
>>> (please trim your reply next time, no need to quote whole patch here)
>>>
>>>> i am thinking why not make %pg* to be more generic ?
>>>> not restricted to only GFP / vma flags / page flags .
>>>> so could we change format like this ?
>>>> define a flag spec struct to include flag and trace_print_flags and some other option :
>>>> typedef struct {
>>>> 	unsigned long flag;
>>>> 	struct trace_print_flags *flags;
>>>> 	unsigned long option; } flag_sec;
>>>> flag_sec my_flag;
>>>> in printk we only pass like this :
>>>> printk("%pg\n", &my_flag) ;
>>>> then it can print any flags defined by user .
>>>> more useful for other drivers to use .
>>>
>>> I don't know, it sounds quite complicated
>
> Agreed, I think this would be premature generalization. There's also
> some value in having the individual %pgX specifiers, as that allows
> individual tweaks such as the mask_out for page flags.
>
> given that we had no flags printing

If we use this generic method, the X in %pgX could still be used to specify a flag that masks something out. That would be great.

>
> Compared to printk("%pgv\n", &vma->flag), I know which I'd prefer to read.
>
>> i am not if DECLARE_FLAG_PRINTK_FMT and FLAG_PRINTK_FMT macro
>> can be defined into one macro ?
>> maybe need some trick here .
>>
>> is it possible ?
>
> Technically, I think the answer is yes, at least in C99 (and I suppose
> gcc would accept it in gnu89 mode as well).
>
> printk("%pg\n", &(struct flag_printer){.flags = my_flags, .names = vmaflags_names});
>
> Not tested, and I still don't think it would be particularly readable
> even when macroized
>
> printk("%pg\n", PRINTF_VMAFLAGS(my_flags));

I tested this with gcc 4.9.3 and the method works, so the final solution looks like this:

printk.h:

struct flag_fmt_spec {
	unsigned long flag;
	struct trace_print_flags *flags;
	int array_size;
	char delimiter;
};

#define FLAG_FORMAT(val, names, delim) \
	(&(struct flag_fmt_spec){ .flag = (val), .flags = (names), \
		.array_size = ARRAY_SIZE(names), .delimiter = (delim) })
#define VMA_FLAG_FORMAT(flag) FLAG_FORMAT(flag, vmaflags_names, '|')

source code:

printk("%pg\n", VMA_FLAG_FORMAT(my_flags));

That's all. See the cpumask_pr_args(masks) macro, which also uses a macro together with %*pb to print a cpu mask. I think this method is not very complex to use.

Searching the source code, there are lots of printk calls that print flags as hex numbers:

$ grep -n -r 'printk.*flag.*%x' .

It would be great if this flag string printing were generic.

Thanks
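The compound-literal approach can be tried outside the kernel. In the sketch below, `format_flags()` is a hypothetical stand-in for whatever a %pg handler in vsprintf would do, the three-entry `vmaflags_names` table is made up for illustration, and `struct trace_print_flags` is redeclared locally. The macro parameters are named differently from the struct members so that the designated initializers survive macro expansion.

```c
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Local stand-ins; names mirror the proposal, contents are assumptions. */
struct trace_print_flags {
	unsigned long mask;
	const char *name;
};

struct flag_fmt_spec {
	unsigned long flag;
	const struct trace_print_flags *flags;
	int array_size;
	char delimiter;
};

/* The compound literal lives until the enclosing block ends. */
#define FLAG_FORMAT(val, names, delim) \
	(&(struct flag_fmt_spec){ .flag = (val), .flags = (names), \
		.array_size = ARRAY_SIZE(names), .delimiter = (delim) })

/* Hypothetical table with a few vma-style flag names. */
static const struct trace_print_flags vmaflags_names[] = {
	{ 0x1, "read" }, { 0x2, "write" }, { 0x4, "exec" },
};

#define VMA_FLAG_FORMAT(v) FLAG_FORMAT(v, vmaflags_names, '|')

/* Write "0x<hex>(name|name)" into buf, joining names with the delimiter. */
static char *format_flags(char *buf, size_t len, const struct flag_fmt_spec *s)
{
	char sep[2] = { s->delimiter, '\0' };
	size_t off = (size_t)snprintf(buf, len, "%#lx(", s->flag);
	int first = 1;

	for (int i = 0; i < s->array_size && off < len; i++) {
		if (!(s->flag & s->flags[i].mask))
			continue;
		off += (size_t)snprintf(buf + off, len - off, "%s%s",
					first ? "" : sep, s->flags[i].name);
		first = 0;
	}
	if (off < len)
		snprintf(buf + off, len - off, ")");
	return buf;
}
```

A call then looks like `format_flags(buf, sizeof(buf), VMA_FLAG_FORMAT(my_flags))`, which is close to the single-macro usage asked about in the thread.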
Re: [PATCH 1/2] mm, printk: introduce new format string for flags
> On Dec 2, 2015, at 13:04, Vlastimil Babka <vba...@suse.cz> wrote:
>
> On 12/02/2015 06:40 PM, yalin wang wrote:
>
> (please trim your reply next time, no need to quote whole patch here)
>
>> i am thinking why not make %pg* to be more generic ?
>> not restricted to only GFP / vma flags / page flags .
>> so could we change format like this ?
>> define a flag spec struct to include flag and trace_print_flags and some other option :
>> typedef struct {
>> 	unsigned long flag;
>> 	struct trace_print_flags *flags;
>> 	unsigned long option; } flag_sec;
>> flag_sec my_flag;
>> in printk we only pass like this :
>> printk("%pg\n", &my_flag) ;
>> then it can print any flags defined by user .
>> more useful for other drivers to use .
>
> I don't know, it sounds quite complicated given that we had no flags printing
> for years and now there's just three kinds of them. The extra struct flag_sec is
> IMHO nuissance. No other printk format needs such thing AFAIK? For example, if I
> were to print page flags from several places, each would have to define the
> struct flag_sec instance, or some header would have to provide it?

This can be avoided by providing a macro in a header file. We can add a new struct that wraps trace_print_flags. For example:

#define DECLARE_FLAG_PRINTK_FMT(name, flags_array) flag_sec name = { .flags = flags_array }
#define FLAG_PRINTK_FMT(name, val) ({ name.flag = (val); &name; })

In source code:

DECLARE_FLAG_PRINTK_FMT(my_flag, vmaflags_names);
printk("%pg\n", FLAG_PRINTK_FMT(my_flag, vma->flag));

I am not sure whether the DECLARE_FLAG_PRINTK_FMT and FLAG_PRINTK_FMT macros can be combined into one macro. Maybe it needs some trick here.

Is it possible?

Thanks
Re: [PATCH 1/2] mm, printk: introduce new format string for flags
> On Nov 30, 2015, at 08:10, Vlastimil Babka wrote:
>
> In mm we use several kinds of flags bitfields that are sometimes printed for
> debugging purposes, or exported to userspace via sysfs. To make them easier to
> interpret independently on kernel version and config, we want to dump also the
> symbolic flag names. So far this has been done with repeated calls to
> pr_cont(), which is unreliable on SMP, and not usable for e.g. sysfs export.
>
> To get a more reliable and universal solution, this patch extends printk()
> format string for pointers to handle the page flags (%pgp), gfp_flags (%pgg)
> and vma flags (%pgv). Existing users of dump_flag_names() are converted and
> simplified.
>
> It would be possible to pass flags by value instead of pointer, but the %p
> format string for pointers already has extensions for various kernel
> structures, so it's a good fit, and the extra indirection in a non-critical
> path is negligible.
>
> Signed-off-by: Vlastimil Babka
> Cc: Rasmus Villemoes
> ---
> I'm sending it on top of the page_owner series, as it's already in mmotm.
> But to reduce churn (in case this approach is accepted), I can later
> incorporate it and resend it whole.
>
>  Documentation/printk-formats.txt | 14
>  include/linux/mmdebug.h | 5 +-
>  lib/vsprintf.c | 31
>  mm/debug.c | 150 ++-
>  mm/oom_kill.c | 5 +-
>  mm/page_alloc.c | 5 +-
>  mm/page_owner.c | 5 +-
>  7 files changed, 140 insertions(+), 75 deletions(-)
>
> diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt
> index b784c270105f..4b5156e74b09 100644
> --- a/Documentation/printk-formats.txt
> +++ b/Documentation/printk-formats.txt
> @@ -292,6 +292,20 @@ Raw pointer value SHOULD be printed with %p. The kernel supports
>
>  	Passed by reference.
>
> +Flags bitfields such as page flags, gfp_flags:
> +
> +	%pgp	0x1f886c(referenced|uptodate|lru|active|private)
> +	%pgg	0x24202c4(GFP_USER|GFP_DMA32|GFP_NOWARN)
> +	%pgv	0x875(read|exec|mayread|maywrite|mayexec|denywrite)
> +
> +	For printing raw values of flags bitfields together with symbolic
> +	strings that would construct the value. The type of flags is given by
> +	the third character. Currently supported are [p]age flags, [g]fp_flags
> +	and [v]ma_flags. The flag names and print order depends on the
> +	particular type.
> +
> +	Passed by reference.
> +
>  Network device features:
>
>  	%pNF	0xc000
> diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
> index 3b77fab7ad28..e6518df259ca 100644
> --- a/include/linux/mmdebug.h
> +++ b/include/linux/mmdebug.h
> @@ -2,6 +2,7 @@
>  #define LINUX_MM_DEBUG_H 1
>
>  #include
> +#include
>
>  struct page;
>  struct vm_area_struct;
> @@ -10,7 +11,9 @@ struct mm_struct;
>  extern void dump_page(struct page *page, const char *reason);
>  extern void dump_page_badflags(struct page *page, const char *reason,
>  			unsigned long badflags);
> -extern void dump_gfpflag_names(unsigned long gfp_flags);
> +extern char *format_page_flags(unsigned long flags, char *buf, char *end);
> +extern char *format_vma_flags(unsigned long flags, char *buf, char *end);
> +extern char *format_gfp_flags(gfp_t gfp_flags, char *buf, char*end);
>  void dump_vma(const struct vm_area_struct *vma);
>  void dump_mm(const struct mm_struct *mm);
>
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index f9cee8e1233c..41cd122bd307 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -31,6 +31,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include /* for PAGE_SIZE */
>  #include /* for dereference_function_descriptor() */
> @@ -1361,6 +1362,29 @@ char *clock(char *buf, char *end, struct clk *clk, struct printf_spec spec,
>  	}
>  }
>
> +static noinline_for_stack
> +char *flags_string(char *buf, char *end, void *flags_ptr,
> +		   struct printf_spec spec, const char *fmt)
> +{
> +	unsigned long flags;
> +	gfp_t gfp_flags;
> +
> +	switch (fmt[1]) {
> +	case 'p':
> +		flags = *(unsigned long *)flags_ptr;
> +		return format_page_flags(flags, buf, end);
> +	case 'v':
> +		flags = *(unsigned long *)flags_ptr;
> +		return format_vma_flags(flags, buf, end);
> +	case 'g':
> +		gfp_flags = *(gfp_t *)flags_ptr;
> +		return format_gfp_flags(gfp_flags, buf, end);
> +	default:
> +		WARN_ONCE(1, "Unsupported flags modifier: %c\n", fmt[1]);
> +		return 0;
> +	}
> +}
> +
>  int kptr_restrict __read_mostly;
>
>  /*
> @@ -1448,6 +1472,11 @@ int kptr_restrict __read_mostly;
>   * - 'Cn' For a clock, it prints the name (Common Clock Framework) or address
>   *        (legacy clock framework) of the clock
>   * - 'Cr' For a clock, it
Re: [PATCH 1/2] mm, printk: introduce new format string for flags
> On Dec 2, 2015, at 13:04, Vlastimil Babka <vba...@suse.cz> wrote:
>
> On 12/02/2015 06:40 PM, yalin wang wrote:
>
> (please trim your reply next time, no need to quote whole patch here)
>
>> i am thinking why not make %pg* to be more generic ?
>> not restricted to only GFP / vma flags / page flags .
>> so could we change format like this ?
>> define a flag spec struct to include flag and trace_print_flags and some
>> other option :
>> typedef struct {
>>         unsigned long flag;
>>         struct trace_print_flags *flags;
>>         unsigned long option; } flag_sec;
>> flag_sec my_flag;
>> in printk we only pass like this :
>> printk("%pg\n", &my_flag) ;
>> then it can print any flags defined by user .
>> more useful for other drivers to use .
>
> I don't know, it sounds quite complicated given that we had no flags printing
> for years and now there's just three kinds of them. The extra struct flag_sec is
> IMHO nuisance. No other printk format needs such thing AFAIK? For example, if I
> were to print page flags from several places, each would have to define the
> struct flag_sec instance, or some header would have to provide it?

This can be avoided by providing macros in a header file.
We can add a helper to declare the struct that wraps trace_print_flags,
for example:

	#define DECLARE_FLAG_PRINTK_FMT(name, flags_array) \
		flag_sec name = { .flags = flags_array }
	#define FLAG_PRINTK_FMT(name, val) ({ name.flag = (val); &name; })

and in the source code:

	DECLARE_FLAG_PRINTK_FMT(my_flag, vmaflags_names);
	printk("%pg\n", FLAG_PRINTK_FMT(my_flag, vma->vm_flags));

I am not sure whether DECLARE_FLAG_PRINTK_FMT and FLAG_PRINTK_FMT can be
combined into a single macro. Maybe it needs some trick. Is it possible?

Thanks
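On the question of merging the two macros into one: a possible trick is a C99 compound literal, which creates the temporary struct at the call site. The following is only a userspace sketch under stated assumptions, not kernel code. `struct flag_name`, `struct flag_spec`, `FLAG_SPEC`, `format_flags` and `demo_vma_names` are all hypothetical names invented for illustration, and `struct flag_name` is merely a stand-in for the kernel's trace_print_flags.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a mask/name table entry. */
struct flag_name {
	unsigned long mask;
	const char *name;
};

/* Hypothetical "flag spec": the value plus the table that decodes it. */
struct flag_spec {
	unsigned long flag;
	const struct flag_name *names;
	size_t nr_names;
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Single macro: a compound literal builds an unnamed flag_spec whose
 * lifetime is the enclosing block, so no separate declaration is needed.
 * 'table' must be a real array, not a pointer.
 */
#define FLAG_SPEC(table, value)					\
	(&(struct flag_spec){ .flag = (value), .names = (table),	\
			      .nr_names = ARRAY_SIZE(table) })

/* Illustrative table only; not the kernel's vma flag names. */
static const struct flag_name demo_vma_names[] = {
	{ 0x1, "read" }, { 0x2, "write" }, { 0x4, "exec" },
};

/* Writes "0x<value>(name|name)" into buf; returns the length written. */
static int format_flags(char *buf, size_t size, const struct flag_spec *spec)
{
	int len = snprintf(buf, size, "%#lx(", spec->flag);
	const char *sep = "";
	size_t i;

	for (i = 0; i < spec->nr_names && len >= 0 && (size_t)len < size; i++) {
		if (!(spec->flag & spec->names[i].mask))
			continue;
		len += snprintf(buf + len, size - (size_t)len, "%s%s",
				sep, spec->names[i].name);
		sep = "|";
	}
	if (len >= 0 && (size_t)len < size)
		len += snprintf(buf + len, size - (size_t)len, ")");
	return len;
}
```

A caller would then write something like `format_flags(out, sizeof(out), FLAG_SPEC(demo_vma_names, 0x5UL))`. Whether a printk() extension should accept such a struct at all is a separate question, as the objection quoted above points out.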
Re: [PATCH v2 5/5] printk/nmi: Increase the size of the temporary buffer
> On Nov 27, 2015, at 19:09, Petr Mladek wrote:
>
> Testing has shown that the backtrace sometimes does not fit
> into the 4kB temporary buffer that is used in NMI context.
>
> The warnings are gone when I double the temporary buffer size.
>
> Note that this problem existed even in the x86-specific
> implementation that was added by the commit a9edc8809328
> ("x86/nmi: Perform a safe NMI stack trace on all CPUs").
> Nobody noticed it because it did not print any warnings.
>
> Signed-off-by: Petr Mladek
> ---
>  kernel/printk/nmi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/printk/nmi.c b/kernel/printk/nmi.c
> index 8af1e4016719..6111644d5f01 100644
> --- a/kernel/printk/nmi.c
> +++ b/kernel/printk/nmi.c
> @@ -42,7 +42,7 @@ atomic_t nmi_message_lost;
>  struct nmi_seq_buf {
>  	atomic_t		len;	/* length of written data */
>  	struct irq_work		work;	/* IRQ work that flushes the buffer */
> -	unsigned char		buffer[PAGE_SIZE - sizeof(atomic_t) -
> +	unsigned char		buffer[2 * PAGE_SIZE - sizeof(atomic_t) -
>  				       sizeof(struct irq_work)];
>  };
>

Why not define it like this:

	union {
		struct {
			atomic_t len;
			struct irq_work work;
		};
		unsigned char buffer[PAGE_SIZE * 2];
	};

That way we can make sure the union is exactly 2 * PAGE_SIZE.

Thanks
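The size argument in the reply can be checked with a small userspace model. This is a sketch under stated assumptions: `atomic_t` and `struct irq_work` below are simplified stand-ins, not the kernel definitions, PAGE_SIZE is assumed to be 4096, and the names `nmi_hdr`, `nmi_seq_buf_union` and `nmi_seq_buf_exact` are hypothetical. It shows that a union is indeed exactly two pages, but also that all union members share storage, so the buffer would overlap the header. A nested header struct is one way to keep the exact size without the overlap.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Simplified stand-ins for the kernel types; real layouts may differ. */
typedef struct { int counter; } atomic_t;
struct irq_work {
	unsigned long flags;
	void *next;
	void (*func)(struct irq_work *);
};

/* Header fields grouped so their padded size is known as one unit. */
struct nmi_hdr {
	atomic_t len;
	struct irq_work work;
};

/* The union idea: size is pinned, but buffer starts at offset 0. */
union nmi_seq_buf_union {
	struct nmi_hdr hdr;
	unsigned char buffer[2 * PAGE_SIZE];
};

/* Alternative: subtract the padded header size, so nothing overlaps. */
struct nmi_seq_buf_exact {
	struct nmi_hdr hdr;
	unsigned char buffer[2 * PAGE_SIZE - sizeof(struct nmi_hdr)];
};

/* Compile-time checks of the claims made above. */
static_assert(sizeof(union nmi_seq_buf_union) == 2 * PAGE_SIZE,
	      "union is exactly two pages");
static_assert(offsetof(union nmi_seq_buf_union, buffer) == 0,
	      "union buffer overlaps the header");
static_assert(sizeof(struct nmi_seq_buf_exact) == 2 * PAGE_SIZE,
	      "nested-header struct is exactly two pages");
```

In this model, subtracting `sizeof(atomic_t)` and `sizeof(struct irq_work)` separately ignores any padding between the two fields, which is why grouping them into one header struct gives a more predictable total.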
Re: no-op delay loops
> On Nov 27, 2015, at 16:53, Rasmus Villemoes wrote:
>
> Hi,
>
> It seems that gcc happily compiles
>
> for (i = 0; i < 10; ++i) ;
>
> into simply
>
> i = 10;
>
> (which is then usually eliminated as a dead store). At least at -O2, and
> when i is not declared volatile. So it would seem that the loops at
>
> arch/mips/pci/pci-rt2880.c:235
> arch/mips/pmcs-msp71xx/msp_setup.c:80
> arch/mips/sni/reset.c:35
>
> actually don't do anything. (In the middle one, i is 'register', but
> that doesn't change anything.) Is mips compiled with some special flags
> that would make gcc actually emit code for the above?
>

You can try declaring i as

	volatile int i;

gcc may then not optimize the loop away.

Thanks
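The difference between the two declarations can be sketched as follows. This is a standalone illustration, not the MIPS code referenced above, and the function names `spin_plain` and `spin_volatile` are made up for the example. Both functions return the same value; what differs is whether the compiler is allowed to skip the iterations.

```c
/*
 * Plain counter: the loop has no observable effect, so an optimizing
 * compiler is free to replace it with the final value of i.
 */
static int spin_plain(void)
{
	int i;

	for (i = 0; i < 10; ++i)
		;
	return i;
}

/*
 * volatile counter: every read and write of i is an observable access,
 * so the compiler has to emit all ten iterations.
 */
static int spin_volatile(void)
{
	volatile int i;

	for (i = 0; i < 10; ++i)
		;
	return i;
}
```

Even with volatile, the length of the resulting delay still depends on the CPU and the compiler, so a calibrated helper such as udelay() is probably the safer choice when a specific delay is needed.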
Re: [PATCH V2] scripts: fix the sys path for gdb scripts
> On Nov 27, 2015, at 15:04, Jan Kiszka wrote:
>
> On 2015-11-27 07:41, yalin wang wrote:
>> we insert __file__'s real path into sys.path,
>> so that no matter we import the vmlinux-gdb.py from $OUT folder or
>> from source code folder, we can always find the linux/ lib folder,
>> and we don't need create link to linux/*.py files,
>> remove the related make file.
>
> NACK again - I tell you why below.
>
>>
>> Signed-off-by: yalin wang
>> ---
>>  scripts/Makefile           |  1 -
>>  scripts/gdb/Makefile       |  1 -
>>  scripts/gdb/linux/Makefile | 11 ---
>>  scripts/gdb/vmlinux-gdb.py |  2 +-
>>  4 files changed, 1 insertion(+), 14 deletions(-)
>>  delete mode 100644 scripts/gdb/Makefile
>>  delete mode 100644 scripts/gdb/linux/Makefile
>>
>> diff --git a/scripts/Makefile b/scripts/Makefile
>> index 2016a64..72902b5 100644
>> --- a/scripts/Makefile
>> +++ b/scripts/Makefile
>> @@ -36,7 +36,6 @@ subdir-$(CONFIG_MODVERSIONS) += genksyms
>>  subdir-y += mod
>>  subdir-$(CONFIG_SECURITY_SELINUX) += selinux
>>  subdir-$(CONFIG_DTC) += dtc
>> -subdir-$(CONFIG_GDB_SCRIPTS) += gdb
>>
>>  # Let clean descend into subdirs
>>  subdir- += basic kconfig package
>> diff --git a/scripts/gdb/Makefile b/scripts/gdb/Makefile
>> deleted file mode 100644
>> index 62f5f65..000
>> --- a/scripts/gdb/Makefile
>> +++ /dev/null
>> @@ -1 +0,0 @@
>> -subdir-y := linux
>> diff --git a/scripts/gdb/linux/Makefile b/scripts/gdb/linux/Makefile
>> deleted file mode 100644
>> index 6cf1ecf..000
>> --- a/scripts/gdb/linux/Makefile
>> +++ /dev/null
>> @@ -1,11 +0,0 @@
>> -always := gdb-scripts
>> -
>> -SRCTREE := $(shell cd $(srctree) && /bin/pwd)
>> -
>> -$(obj)/gdb-scripts:
>> -ifneq ($(KBUILD_SRC),)
>> -	$(Q)ln -fsn $(SRCTREE)/$(obj)/*.py $(objtree)/$(obj)
>> -endif
>> -	@:
>> -
>> -clean-files := *.pyc *.pyo $(if $(KBUILD_SRC),*.py)
>
> This step I don't understand at all. Why do you want to destroy the
> possibility to automatically load the scripts? Did you read
> Documentation/gdb-kernel-debugging.txt in this regard?
>
>> diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
>> index ce82bf5..a9029f4 100644
>> --- a/scripts/gdb/vmlinux-gdb.py
>> +++ b/scripts/gdb/vmlinux-gdb.py
>> @@ -13,7 +13,7 @@
>>
>>  import os
>>
>> -sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
>> +sys.path.insert(0, os.path.dirname(os.path.realpath(__file__)))
>
> This works only so far as that (if you don't destroy the link) the main
> script will still find its modules. However, *.pyc files are then
> generated in the source tree, no longer in the output dirs. The code is
> designed to prevent this.
>
> You still don't explain to us why the existing code doesn't work for you
> and how you prefer to use it instead.
>
> Jan
>

Thanks for your explanation.
The reason I changed it is that I was doing cross-platform debugging:
debugging an ARM platform from an x86 host. I only have the source code
on the host; I don't build the kernel there. So when I start gdb-arm,
I want to load the gdb scripts directly from the source tree.
That is the use case I need: I don't want to build the kernel on every
host when I only occasionally debug an embedded platform.

Thanks
[PATCH V2] scripts: fix the sys path for gdb scripts
Insert the real path of __file__ into sys.path, so that no matter
whether vmlinux-gdb.py is loaded from the $OUT folder or from the
source folder, we can always find the linux/ lib folder. We then no
longer need to create links to the linux/*.py files, so remove the
related Makefiles.

Signed-off-by: yalin wang
---
 scripts/Makefile           |  1 -
 scripts/gdb/Makefile       |  1 -
 scripts/gdb/linux/Makefile | 11 ---
 scripts/gdb/vmlinux-gdb.py |  2 +-
 4 files changed, 1 insertion(+), 14 deletions(-)
 delete mode 100644 scripts/gdb/Makefile
 delete mode 100644 scripts/gdb/linux/Makefile

diff --git a/scripts/Makefile b/scripts/Makefile
index 2016a64..72902b5 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -36,7 +36,6 @@ subdir-$(CONFIG_MODVERSIONS) += genksyms
 subdir-y += mod
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
 subdir-$(CONFIG_DTC) += dtc
-subdir-$(CONFIG_GDB_SCRIPTS) += gdb

 # Let clean descend into subdirs
 subdir-	+= basic kconfig package
diff --git a/scripts/gdb/Makefile b/scripts/gdb/Makefile
deleted file mode 100644
index 62f5f65..000
--- a/scripts/gdb/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-subdir-y := linux
diff --git a/scripts/gdb/linux/Makefile b/scripts/gdb/linux/Makefile
deleted file mode 100644
index 6cf1ecf..000
--- a/scripts/gdb/linux/Makefile
+++ /dev/null
@@ -1,11 +0,0 @@
-always := gdb-scripts
-
-SRCTREE := $(shell cd $(srctree) && /bin/pwd)
-
-$(obj)/gdb-scripts:
-ifneq ($(KBUILD_SRC),)
-	$(Q)ln -fsn $(SRCTREE)/$(obj)/*.py $(objtree)/$(obj)
-endif
-	@:
-
-clean-files := *.pyc *.pyo $(if $(KBUILD_SRC),*.py)
diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
index ce82bf5..a9029f4 100644
--- a/scripts/gdb/vmlinux-gdb.py
+++ b/scripts/gdb/vmlinux-gdb.py
@@ -13,7 +13,7 @@

 import os

-sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
+sys.path.insert(0, os.path.dirname(os.path.realpath(__file__)))

 try:
     gdb.parse_and_eval("0")
--
1.9.1
Re: [PATCH] scripts: fix the sys path for gdb scripts
> On Nov 25, 2015, at 15:38, Jan Kiszka wrote:
>
> On 2015-11-19 11:54, yalin wang wrote:
>> The sys.path should be scripts/gdb,
>> so that we can import linux lib correctly.
>>
>> Signed-off-by: yalin wang
>> ---
>>  scripts/gdb/vmlinux-gdb.py | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
>> index ce82bf5..5a45d1a 100644
>> --- a/scripts/gdb/vmlinux-gdb.py
>> +++ b/scripts/gdb/vmlinux-gdb.py
>> @@ -13,7 +13,7 @@
>>
>>  import os
>>
>> -sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
>> +sys.path.insert(0, os.path.dirname(__file__))
>>
>>  try:
>>      gdb.parse_and_eval("0")
>>
>
> NACK. This patch is assuming that vmlinux-gdb.py is (only) started from
> the scripts/gdb folder. But CONFIG_GDB_SCRIPTS places a link to
> vmlinux-gdb.py aside the vmlinux binary in the top-level folder. That
> way, the script is auto-loaded by gdb.
>
> If you have a compelling use case for loading the script manually from
> its original folder, we can discuss augmenting the path. But removing
> the existing one is wrong.
>
> Andrew, please drop the patch from your queue.
>

OK, I will send a V2 patch for this.
[PATCH] sched: change nr_uninterruptible to be signed
nr_uninterruptible can become negative at run time. This happens when
a TASK_UNINTERRUPTIBLE task is dequeued from rq1 and is then woken up
and queued on rq2: rq2->nr_uninterruptible-- will then sometimes
result in a negative value.

Signed-off-by: yalin wang
---
 kernel/sched/loadavg.c | 2 +-
 kernel/sched/sched.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index ef71590..39504c6 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -83,7 +83,7 @@ long calc_load_fold_active(struct rq *this_rq)
 	long nr_active, delta = 0;

 	nr_active = this_rq->nr_running;
-	nr_active += (long)this_rq->nr_uninterruptible;
+	nr_active += this_rq->nr_uninterruptible;

 	if (nr_active != this_rq->calc_load_active) {
 		delta = nr_active - this_rq->calc_load_active;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 84d4879..7b5f67b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -605,7 +605,7 @@ struct rq {
 	 * one CPU and if it got migrated afterwards it may decrease
 	 * it on another CPU. Always updated under the runqueue lock:
 	 */
-	unsigned long nr_uninterruptible;
+	long nr_uninterruptible;

 	struct task_struct *curr, *idle, *stop;
 	unsigned long next_balance;
--
1.9.1
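The situation described in the changelog can be modelled in userspace. This is only a sketch: the two arrays stand in for per-runqueue counters, NR_CPUS is fixed at 2, and all function names are hypothetical. It shows that the unsigned counter on the waking CPU wraps around, that the old (long) cast recovers -1 from it on a two's-complement machine, and that the sum over all CPUs comes out the same with either type.

```c
#include <limits.h>

#define NR_CPUS 2	/* assumption: two runqueues are enough for the demo */

/* Same counter kept in both representations for comparison. */
static unsigned long nr_unint_unsigned[NR_CPUS];
static long nr_unint_signed[NR_CPUS];

/* A task enters uninterruptible sleep on this CPU's runqueue. */
static void sleep_on(int cpu)
{
	nr_unint_unsigned[cpu]++;
	nr_unint_signed[cpu]++;
}

/* The task is woken and queued on a (possibly different) CPU. */
static void wake_on(int cpu)
{
	nr_unint_unsigned[cpu]--;	/* wraps to ULONG_MAX from 0 */
	nr_unint_signed[cpu]--;		/* simply becomes -1 */
}

/* Mirrors the cast that the patch removes from calc_load_fold_active(). */
static long unsigned_as_long(int cpu)
{
	return (long)nr_unint_unsigned[cpu];
}

/* Modular arithmetic makes the unsigned sum come out right as well. */
static long total_unsigned(void)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += nr_unint_unsigned[cpu];
	return (long)sum;
}

static long total_signed(void)
{
	long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += nr_unint_signed[cpu];
	return sum;
}
```

Seen this way, the change looks like a readability improvement rather than a behavioural one in this model: the signed type states directly that a per-runqueue value may be negative, where the unsigned type relied on wraparound plus a cast.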
Re: [PATCH v2 6/9] mm, debug: introduce dump_gfpflag_names() for symbolic printing of gfp_flags
> On Nov 25, 2015, at 18:28, Vlastimil Babka wrote:
>
> On 11/25/2015 09:16 AM, Joonsoo Kim wrote:
>> On Tue, Nov 24, 2015 at 01:36:18PM +0100, Vlastimil Babka wrote:
>>> --- a/include/trace/events/gfpflags.h
>>> +++ b/include/trace/events/gfpflags.h
>>> @@ -8,8 +8,8 @@
>>>   *
>>>   * Thus most bits set go first.
>>>   */
>>> -#define show_gfp_flags(flags)					\
>>> -	(flags) ? __print_flags(flags, "|",			\
>>> +
>>> +#define __def_gfpflag_names					\
>>>  	{(unsigned long)GFP_TRANSHUGE, "GFP_TRANSHUGE"},	\
>>>  	{(unsigned long)GFP_HIGHUSER_MOVABLE, "GFP_HIGHUSER_MOVABLE"}, \
>>>  	{(unsigned long)GFP_HIGHUSER, "GFP_HIGHUSER"},		\
>>> @@ -19,9 +19,13 @@
>>>  	{(unsigned long)GFP_NOFS, "GFP_NOFS"},			\
>>>  	{(unsigned long)GFP_ATOMIC, "GFP_ATOMIC"},		\
>>>  	{(unsigned long)GFP_NOIO, "GFP_NOIO"},			\
>>> +	{(unsigned long)GFP_NOWAIT, "GFP_NOWAIT"},		\
>>> +	{(unsigned long)__GFP_DMA, "GFP_DMA"},			\
>>> +	{(unsigned long)__GFP_DMA32, "GFP_DMA32"},		\
>>>  	{(unsigned long)__GFP_HIGH, "GFP_HIGH"},		\
>>>  	{(unsigned long)__GFP_ATOMIC, "GFP_ATOMIC"},		\
>>>  	{(unsigned long)__GFP_IO, "GFP_IO"},			\
>>> +	{(unsigned long)__GFP_FS, "GFP_FS"},			\
>>>  	{(unsigned long)__GFP_COLD, "GFP_COLD"},		\
>>>  	{(unsigned long)__GFP_NOWARN, "GFP_NOWARN"},		\
>>>  	{(unsigned long)__GFP_REPEAT, "GFP_REPEAT"},		\
>>> @@ -36,8 +40,12 @@
>>>  	{(unsigned long)__GFP_RECLAIMABLE, "GFP_RECLAIMABLE"},	\
>>>  	{(unsigned long)__GFP_MOVABLE, "GFP_MOVABLE"},		\
>>>  	{(unsigned long)__GFP_NOTRACK, "GFP_NOTRACK"},		\
>>> +	{(unsigned long)__GFP_WRITE, "GFP_WRITE"},		\
>>>  	{(unsigned long)__GFP_DIRECT_RECLAIM, "GFP_DIRECT_RECLAIM"}, \
>>>  	{(unsigned long)__GFP_KSWAPD_RECLAIM, "GFP_KSWAPD_RECLAIM"}, \
>>>  	{(unsigned long)__GFP_OTHER_NODE, "GFP_OTHER_NODE"}	\
>>> -	) : "GFP_NOWAIT"
>>>
>>> +#define show_gfp_flags(flags)					\
>>> +	(flags) ? __print_flags(flags, "|",			\
>>> +	__def_gfpflag_names					\
>>> +	) : "none"
>>
>> How about moving this to gfp.h or something?
>> Now, we use it in out of tracepoints so there is no need to keep it
>> in include/trace/events/xxx.
>
> Hm I didn't want to pollute such widely included header with such defines. And
> show_gfp_flags shouldn't be there definitely as it depends on __print_flags.
> What do others think?

How about adding this to the standard printk() formats?
For example, cpumasks are printed in printk() with %*pb[l], and a macro
cpumask_pr_args is defined to pass the cpumask arguments.
We could also define a new format such as %pG, meaning "print flags",
which would then be useful for other code as well, such as dumping
vma / mm flags.

Thanks
[PATCH V2] scripts: fix the sys path for gdb scripts
Insert the directory of __file__'s real path into sys.path, so that no
matter whether vmlinux-gdb.py is loaded from the $OUT folder or from the
source code folder, the linux/ lib folder can always be found. The links
to the linux/*.py files are then no longer needed, so remove the related
Makefiles.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 scripts/Makefile           |  1 -
 scripts/gdb/Makefile       |  1 -
 scripts/gdb/linux/Makefile | 11 -----------
 scripts/gdb/vmlinux-gdb.py |  2 +-
 4 files changed, 1 insertion(+), 14 deletions(-)
 delete mode 100644 scripts/gdb/Makefile
 delete mode 100644 scripts/gdb/linux/Makefile

diff --git a/scripts/Makefile b/scripts/Makefile
index 2016a64..72902b5 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -36,7 +36,6 @@ subdir-$(CONFIG_MODVERSIONS) += genksyms
 subdir-y                     += mod
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
 subdir-$(CONFIG_DTC)         += dtc
-subdir-$(CONFIG_GDB_SCRIPTS) += gdb
 
 # Let clean descend into subdirs
 subdir-	+= basic kconfig package
diff --git a/scripts/gdb/Makefile b/scripts/gdb/Makefile
deleted file mode 100644
index 62f5f65..000
--- a/scripts/gdb/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-subdir-y := linux
diff --git a/scripts/gdb/linux/Makefile b/scripts/gdb/linux/Makefile
deleted file mode 100644
index 6cf1ecf..000
--- a/scripts/gdb/linux/Makefile
+++ /dev/null
@@ -1,11 +0,0 @@
-always := gdb-scripts
-
-SRCTREE := $(shell cd $(srctree) && /bin/pwd)
-
-$(obj)/gdb-scripts:
-ifneq ($(KBUILD_SRC),)
-	$(Q)ln -fsn $(SRCTREE)/$(obj)/*.py $(objtree)/$(obj)
-endif
-	@:
-
-clean-files := *.pyc *.pyo $(if $(KBUILD_SRC),*.py)
diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
index ce82bf5..a9029f4 100644
--- a/scripts/gdb/vmlinux-gdb.py
+++ b/scripts/gdb/vmlinux-gdb.py
@@ -13,7 +13,7 @@
 
 import os
 
-sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
+sys.path.insert(0, os.path.dirname(os.path.realpath(__file__)))
 
 try:
     gdb.parse_and_eval("0")
-- 
1.9.1
[RFC] block: change blk_check_merge_flags() implementation
Use XOR to check whether the REQ_DISCARD, REQ_SECURE and REQ_WRITE_SAME
flags are the same in flags1 and flags2. A single masked XOR replaces
three separate compare-and-branch tests, which is much faster on some
platforms.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 include/linux/blkdev.h | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c401ecd..3d0f053 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -655,16 +655,7 @@ static inline bool rq_mergeable(struct request *rq)
 static inline bool blk_check_merge_flags(unsigned int flags1,
 					 unsigned int flags2)
 {
-	if ((flags1 & REQ_DISCARD) != (flags2 & REQ_DISCARD))
-		return false;
-
-	if ((flags1 & REQ_SECURE) != (flags2 & REQ_SECURE))
-		return false;
-
-	if ((flags1 & REQ_WRITE_SAME) != (flags2 & REQ_WRITE_SAME))
-		return false;
-
-	return true;
+	return !((flags1 ^ flags2) & (REQ_DISCARD | REQ_SECURE | REQ_WRITE_SAME));
 }
 
 static inline bool blk_write_same_mergeable(struct bio *a, struct bio *b)
-- 
1.9.1
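The equivalence of the two forms can be checked outside the kernel with a small sketch. The bit positions below are assumptions chosen for illustration (the real REQ_* values come from the kernel's block layer headers), and the `check_merge_flags_*` names are hypothetical.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's actual REQ_* definitions. */
#define REQ_DISCARD	(1u << 0)
#define REQ_SECURE	(1u << 1)
#define REQ_WRITE_SAME	(1u << 2)

/* Original form: one masked comparison and branch per flag. */
static bool check_merge_flags_branchy(unsigned int flags1, unsigned int flags2)
{
	if ((flags1 & REQ_DISCARD) != (flags2 & REQ_DISCARD))
		return false;
	if ((flags1 & REQ_SECURE) != (flags2 & REQ_SECURE))
		return false;
	if ((flags1 & REQ_WRITE_SAME) != (flags2 & REQ_WRITE_SAME))
		return false;
	return true;
}

/*
 * Proposed form: XOR leaves a 1 in every bit position where the two
 * words differ; masking keeps only the three flags of interest.
 */
static bool check_merge_flags_xor(unsigned int flags1, unsigned int flags2)
{
	return !((flags1 ^ flags2) &
		 (REQ_DISCARD | REQ_SECURE | REQ_WRITE_SAME));
}
```

Both functions return the same result for every pair of inputs, including inputs that differ only in bits outside the mask. Whether the XOR form is actually faster depends on the compiler and target, so that part remains a claim to be measured.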
Re: [PATCH v3] arm64: Add support for PTE contiguous bit.
> On Nov 20, 2015, at 00:57, David Woods wrote:
>
> The arm64 MMU supports a Contiguous bit which is a hint that the TTE
> is one of a set of contiguous entries which can be cached in a single
> TLB entry. Supporting this bit adds new intermediate huge page sizes.
>
> The set of huge page sizes available depends on the base page size.
> Without using contiguous pages the huge page sizes are as follows.
>
>  4KB:   2MB  1GB
> 64KB: 512MB
>
> With a 4KB granule, the contiguous bit groups together sets of 16 pages
> and with a 64KB granule it groups sets of 32 pages. This enables two new
> huge page sizes in each case, so that the full set of available sizes
> is as follows.
>
>  4KB:  64KB   2MB  32MB  1GB
> 64KB:   2MB 512MB  16GB
>
> If a 16KB granule is used then the contiguous bit groups 128 pages
> at the PTE level and 32 pages at the PMD level.
>
> If the base page size is set to 64KB then 2MB pages are enabled by
> default. It is possible in the future to make 2MB the default huge
> page size for both 4KB and 64KB granules.
>
> Signed-off-by: David Woods
> Reviewed-by: Chris Metcalf
> ---
>
> This patch should resolve the comments on v2 and is now based on the
> arm64 next tree which includes 16K granule support. I've added definitions
> which should enable 2M and 1G huge page sizes with a 16K granule.
> Unfortunately, the A53 model we have does not support 16K so I don't
> have a way to test this.
>
>  arch/arm64/Kconfig                     |   3 -
>  arch/arm64/include/asm/hugetlb.h       |  44 ++
>  arch/arm64/include/asm/pgtable-hwdef.h |  18 ++-
>  arch/arm64/include/asm/pgtable.h       |  10 +-
>  arch/arm64/mm/hugetlbpage.c            | 267 -
>  include/linux/hugetlb.h                |   2 -
>  6 files changed, 306 insertions(+), 38 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 40e1151..077bb7c 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -480,9 +480,6 @@ config HW_PERF_EVENTS
>  config SYS_SUPPORTS_HUGETLBFS
>  	def_bool y
>
> -config ARCH_WANT_GENERAL_HUGETLB
> -	def_bool y
> -
>  config ARCH_WANT_HUGE_PMD_SHARE
>  	def_bool y if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
>
> diff --git a/arch/arm64/include/asm/hugetlb.h
> b/arch/arm64/include/asm/hugetlb.h
> index bb4052e..bbc1e35 100644
> --- a/arch/arm64/include/asm/hugetlb.h
> +++ b/arch/arm64/include/asm/hugetlb.h
> @@ -26,36 +26,7 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
>  	return *ptep;
>  }
>
> -static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
> -				   pte_t *ptep, pte_t pte)
> -{
> -	set_pte_at(mm, addr, ptep, pte);
> -}
> -
> -static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
> -					 unsigned long addr, pte_t *ptep)
> -{
> -	ptep_clear_flush(vma, addr, ptep);
> -}
> -
> -static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
> -					   unsigned long addr, pte_t *ptep)
> -{
> -	ptep_set_wrprotect(mm, addr, ptep);
> -}
>
> -static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
> -					    unsigned long addr, pte_t *ptep)
> -{
> -	return ptep_get_and_clear(mm, addr, ptep);
> -}
> -
> -static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
> -					     unsigned long addr, pte_t *ptep,
> -					     pte_t pte, int dirty)
> -{
> -	return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
> -}
>
>  static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
>  					  unsigned long addr, unsigned long end,
> @@ -97,4 +68,19 @@ static inline void arch_clear_hugepage_flags(struct page *page)
>  	clear_bit(PG_dcache_clean, &page->flags);
>  }
>
> +extern pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
> +				struct page *page, int writable);
> +#define arch_make_huge_pte arch_make_huge_pte
> +extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
> +			    pte_t *ptep, pte_t pte);
> +extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
> +				      unsigned long addr, pte_t *ptep,
> +				      pte_t pte, int dirty);
> +extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
> +				     unsigned long addr, pte_t *ptep);
> +extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
> +				    unsigned long addr, pte_t *ptep);
> +extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
> +				  unsigned long addr, pte_t *ptep);
> +
>  #endif /* __ASM_HUGETLB_H */
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h
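The huge page sizes listed in the quoted description follow from simple arithmetic, sketched here in userspace C. The helper names are hypothetical, and the sketch assumes 8-byte page table entries so that one table page holds page_size / 8 entries; the group sizes (16, 32, 128) are the ones stated in the description.

```c
#include <stdint.h>

/*
 * Size mapped by one PMD-level block entry: a full last-level table,
 * i.e. (page_size / 8) entries of page_size bytes each.
 * Assumes 8-byte page table entries.
 */
static uint64_t pmd_block_size(uint64_t page_size)
{
	return page_size * (page_size / 8);
}

/* Huge page size from a group of contiguous PTEs. */
static uint64_t cont_pte_size(uint64_t page_size, unsigned int group)
{
	return page_size * group;
}

/* Huge page size from a group of contiguous PMD block entries. */
static uint64_t cont_pmd_size(uint64_t page_size, unsigned int group)
{
	return pmd_block_size(page_size) * group;
}
```

With a 4KB granule and groups of 16 this gives 64KB and 32MB around the 2MB block size; with a 64KB granule and groups of 32 it gives 2MB and 16GB around the 512MB block size. For a 16KB granule, 128 contiguous PTEs give 2MB and 32 contiguous PMD entries give 1GB, matching the 2M and 1G sizes mentioned for 16K.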
[PATCH] scripts: fix the sys path for gdb scripts
The sys.path should be scripts/gdb,
so that we can import linux lib correctly.

Signed-off-by: yalin wang
---
 scripts/gdb/vmlinux-gdb.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
index ce82bf5..5a45d1a 100644
--- a/scripts/gdb/vmlinux-gdb.py
+++ b/scripts/gdb/vmlinux-gdb.py
@@ -13,7 +13,7 @@
 
 import os
 
-sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
+sys.path.insert(0, os.path.dirname(__file__))
 
 try:
     gdb.parse_and_eval("0")
-- 
1.9.1
Re: kernel oops on mmotm-2015-10-15-15-20
> On Nov 19, 2015, at 14:58, Kirill A. Shutemov wrote:
>
> uncharged

I also hit this crash. In addition, I see a crash like this in qemu:

[2.703436] [] do_execveat_common.isra.36+0x4f0/0x630
[2.703624] [] do_execve+0x24/0x30
[2.703767] [] SyS_execve+0x1c/0x2c
[2.703923] BUG: Bad page map in process init pte:604837ebd3 pmd:b29e7003
[2.704140] page:ffc07f00af80 count:2 mapcount:-1 mapping: (null) index:0x1
[2.704414] flags: 0x4014(referenced|dirty)
[2.704563] page dumped because: bad pte
[2.704666] addr:007fafb7e000 vm_flags:00100073 anon_vma:ffc0729bdb90 mapping: (null) index:7fafb7e
[2.704906] file: (null) fault: (null) mmap: (null) readpage: (null)
[2.705117] CPU: 0 PID: 84 Comm: init Tainted: GB 4.2.0ajb-5-g11a9bf3 #80
[2.705315] Hardware name: ranchu (DT)
[2.705408] Call trace:
[2.705488] [] dump_backtrace+0x0/0x124
[2.705657] [] show_stack+0x10/0x1c
[2.705797] [] dump_stack+0x78/0x98
[2.705971] [] print_bad_pte+0x154/0x1f0
[2.706102] [] unmap_single_vma+0x574/0x704
[2.706236] [] unmap_vmas+0x54/0x70
[2.706354] [] exit_mmap+0x88/0xfc
[2.706473] [] mmput+0x48/0xe8
[2.706584] [] flush_old_exec+0x30c/0x79c
[2.706719] [] load_elf_binary+0x21c/0x1098
[2.706856] [] search_binary_handler+0xa8/0x224
[2.706995] [] do_execveat_common.isra.36+0x4f0/0x630
[2.707144] [] do_execve+0x24/0x30
[2.707263] [] SyS_execve+0x1c/0x2c
[2.707392] BUG: Bad page map in process init pte:604837fbd3 pmd:b29e7003
[2.707752] page:ffc07f00afc0 count:2 mapcount:-1 mapping: (null) index:0x1
[2.708167] flags: 0x4014(referenced|dirty)
[2.708333] page dumped because: bad pte
[2.708501] addr:007fafb7f000 vm_flags:00100073 anon_vma:ffc0729bdb90 mapping: (null) index:7fafb7f
[2.709084] file: (null) fault: (null) mmap: (null) readpage: (null)
[2.709306] CPU: 0 PID: 84 Comm: init Tainted: GB 4.2.0ajb-5-g11a9bf3 #80
[2.709494] Hardware name: ranchu (DT)

It seems the page map count (mapcount:-1) is not correct.
My build is based on mmotm-2015-10-21-14-41.

Thanks
Re: [PATCH v2 2/3] lib: Introduce 2 bit ops api: all_is_bit_{one,zero}
> On Nov 19, 2015, at 14:48, Jia He wrote:
>
>

Why not use memcmp() to compare with 0x000 or 0x ?
memcmp() has better performance on some platforms.

Thanks
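One way to read the memcmp() suggestion is the byte-granular sketch below. It is an assumption about what was meant, not the API proposed in the patch: `all_bytes_equal()` is a hypothetical helper, and it only handles whole bytes, whereas a bitmap helper would also need to handle a partial trailing word.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if every byte in buf[0..len) equals
 * 'pattern' (0x00 for all-zero bits, 0xff for all-one bits).
 *
 * The first byte is checked directly; memcmp() then compares the buffer
 * with itself shifted by one byte, so each byte must equal its
 * predecessor. Overlapping ranges are fine because memcmp() only reads.
 */
static bool all_bytes_equal(const void *buf, size_t len, unsigned char pattern)
{
	const unsigned char *p = buf;

	if (len == 0)
		return true;
	return p[0] == pattern && memcmp(p, p + 1, len - 1) == 0;
}
```

Whether this beats an open-coded word loop depends on the platform's memcmp() implementation, so the performance claim would need measuring.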
Re: read physical address space
You should access it like this:

printk("%d\n", *(int *)kmap(phys_to_page(phys_addr)));

A physical address must be mapped to a virtual address before it is accessed.

> On Nov 17, 2015, at 23:21, alan hopes wrote:
>
> phys_addr
Re: [RFC] block: change to use atomic_inc_return_release()
> On Nov 17, 2015, at 11:38, Jens Axboe wrote:
>
> On 11/16/2015 08:24 PM, yalin wang wrote:
>> Some arch define this atomic_inc_return_release() OP.
>
> That is a very vague commit message, you'll need a whole lot more than
> that... A commit message is supposed to describe the reason for the change.
> You provide no reason for the change.
>
>> diff --git a/block/bio.c b/block/bio.c
>> index fbc558b..b251857 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -310,8 +310,7 @@ static void bio_chain_endio(struct bio *bio, int error)
>>  static inline void bio_inc_remaining(struct bio *bio)
>>  {
>>  	bio->bi_flags |= (1 << BIO_CHAIN);
>> -	smp_mb__before_atomic();
>> -	atomic_inc(&bio->__bi_remaining);
>> +	atomic_inc_return_release(&bio->__bi_remaining);
>
> Are these equivalent? Where's the documentation for this primitive? The
> previous code ensured that we ordered the dec of the remaining count with the
> update of the flags.
>

I just had a look at the ARM64 implementation of this new atomic op, but
I can't find any documentation for it in memory-barriers.txt, so I sent
this RFC to get some feedback.

atomic_inc_return_release() should have store-release class memory
barrier semantics. In this example, isn't an smp_store_release()-style
barrier enough? It only has to make sure the bi_flags update can be seen
by other cores before the atomic counter is updated.

The atomic_inc_return_{release,acquire,relaxed} ops seem to be newly
added to the kernel, but I don't see many users in the code. Can they be
used to replace a lot of the smp_mb__before_atomic() calls?

Thanks
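The ordering being asked about can be sketched with C11 atomics in userspace. This is an analogue under stated assumptions, not the kernel API: `struct chained`, `CHAIN_FLAG`, `inc_remaining_release()` and `flag_seen_after_inc()` are hypothetical names, a single writer is assumed, and `memory_order_release` stands in for the `_release` variant of the kernel op.

```c
#include <stdatomic.h>

#define CHAIN_FLAG 1u	/* hypothetical flag bit, standing in for BIO_CHAIN */

/* Hypothetical stand-in for the two bio fields involved. */
struct chained {
	unsigned int flags;	/* plain field, written before the counter */
	atomic_int remaining;	/* counter published with release ordering */
};

/* Writer: set the flag, then increment with release semantics. */
static void inc_remaining_release(struct chained *c)
{
	c->flags |= CHAIN_FLAG;
	/* Release: the flags store above is ordered before this RMW. */
	atomic_fetch_add_explicit(&c->remaining, 1, memory_order_release);
}

/*
 * Reader: returns 1 if the increment past old_count is visible and the
 * flag is set, 0 if the increment is not visible yet. The acquire load
 * pairs with the writer's release, so once the new count is observed
 * the flags store is guaranteed to be visible as well.
 */
static int flag_seen_after_inc(struct chained *c, int old_count)
{
	if (atomic_load_explicit(&c->remaining, memory_order_acquire) <= old_count)
		return 0;
	return (c->flags & CHAIN_FLAG) != 0;
}
```

Release ordering only keeps earlier accesses from moving after the increment. smp_mb__before_atomic() is a full barrier, which is stronger, so the two forms are not interchangeable in general; whether release is sufficient for bio_inc_remaining() depends on what the readers of __bi_remaining rely on.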
Re: [PATCH V4] mm: fix kernel crash in khugepaged thread
> On Nov 17, 2015, at 10:43, Steven Rostedt wrote:
>
> On Tue, 17 Nov 2015 10:21:47 +0800
> yalin wang wrote:
>
>
>> i have not tried ,
>> just a question,
>> if you print a %s , but don’t call trace_define_field() do define this
>> string in
>> __entry , how does user space perf tool to get this string info and print
>> it ?
>> i am curious ..
>> i can try this when i have time. and report to you .
>
> Because the print_fmt has nothing to do with the fields. You can have
> as your print_fmt as:
>
> TP_printk("Message = %s", "hello dolly!")
>
> And both userspace and the kernel will process that correctly (if I got
> string processing working in userspace, which I believe I do). The
> string is processed, it's not dependent on TP_STRUCT__entry() unless it
> references a field there. Which can also be used too:
>
> TP_printk("Message = %s", __entry->musical ? "Hello dolly!" :
>           "Death Trap!")
>
> userspace will see in the entry:
>
> print_fmt: "Message = %s", REC->musical ? "Hello dolly!" : "Death Trap!"
>
> as long as the field "musical" exists, all is well.
>
> -- Steve

Aha, I see. Thanks very much for your explanation.
A better print fmt would be:

TP_printk("mm=%p, scan_pfn=%s, writable=%d, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
          __entry->mm,
          __entry->pfn == (-1UL) ? "(null)" : itoa(buff, __entry->pfn, 10), …..)

Is this possible?

Thanks
[RFC] block: change to use atomic_inc_return_release()
Some architectures define the atomic_inc_return_release() op.

Signed-off-by: yalin wang
---
 block/bio.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fbc558b..b251857 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -310,8 +310,7 @@ static void bio_chain_endio(struct bio *bio, int error)
 static inline void bio_inc_remaining(struct bio *bio)
 {
 	bio->bi_flags |= (1 << BIO_CHAIN);
-	smp_mb__before_atomic();
-	atomic_inc(&bio->__bi_remaining);
+	atomic_inc_return_release(&bio->__bi_remaining);
 }
 
 /**
-- 
1.9.1
Re: [PATCH V4] mm: fix kernel crash in khugepaged thread
> On Nov 16, 2015, at 22:25, Steven Rostedt wrote:
>
> On Mon, 16 Nov 2015 11:16:22 +0100
> Vlastimil Babka wrote:
>> -- Steve
>>> it is not easy to print for perf tools in userspace ,
>>> if you use this format ,
>>> for user space perf tool, it print the entry by look up the member in entry
>>> struct by offset ,
>>> you print a dynamic string which user space perf tool don’t know how to
>>> print this string .
>>
>> Does it work through trace-cmd?
>
> The two use the same code. If it works in one, it will work in the
> other.
>
> -- Steve
>

I have not tried it. Just a question: if you print a %s but don't call
trace_define_field() to define this string in __entry, how does the
user space perf tool get this string and print it? I am curious.
I can try this when I have time and report back to you.

Thanks
Re: [PATCH v3 01/17] mm: support madvise(MADV_FREE)
> On Nov 16, 2015, at 10:13, Minchan Kim wrote: > > On Fri, Nov 13, 2015 at 11:46:07AM -0800, Andy Lutomirski wrote: >> On Fri, Nov 13, 2015 at 12:13 AM, Daniel Micay wrote: >>> On 13/11/15 02:03 AM, Minchan Kim wrote: On Fri, Nov 13, 2015 at 01:45:52AM -0500, Daniel Micay wrote: >> And now I am thinking if we use access bit, we could implment >> MADV_FREE_UNDO >> easily when we need it. Maybe, that's what you want. Right? > > Yes, but why the access bit instead of the dirty bit for that? It could > always be made more strict (i.e. access bit) in the future, while going > the other way won't be possible. So I think the dirty bit is really the > more conservative choice since if it turns out to be a mistake it can be > fixed without a backwards incompatible change. Absolutely true. That's why I insist on dirty bit until now although I didn't tell the reason. But I thought you wanted to change for using access bit for the future, too. It seems MADV_FREE start to bloat over and over again before knowing real problems and usecases. It's almost same situation with volatile ranges so I really want to stop at proper point which maintainer should decide, I hope. Without it, we will make the feature a lot heavy by just brain storming and then causes lots of churn in MM code without real bebenfit It would be very painful for us. >>> >>> Well, I don't think you need more than a good API and an implementation >>> with no known bugs, kernel security concerns or backwards compatibility >>> issues. Configuration and API extensions are something for later (i.e. >>> land a baseline, then submit stuff like sysctl tunables). Just my take >>> on it though... >>> >> >> As long as it's anonymous MAP_PRIVATE only, then the security aspects >> should be okay. MADV_DONTNEED seems to work on pretty much any VMA, >> and there's been long history of interesting bugs there. 
>>
>> As for dirty vs accessed, an argument in favor of going straight to
>> accessed is that it means that users can write code like this without
>> worrying about whether they have a kernel that uses the dirty bit:
>>
>> x = mmap(...);
>> *x = 1; /* mark it present */
>>
>> /* i'm done with it */
>> *x = 1;
>> madvise(MADV_FREE, x, ...);
>>
>> wait a while;
>>
>> /* is it still there? */
>> if (*x == 1) {
>>   /* use whatever was cached there */
>> } else {
>>   /* reinitialize it */
>>   *x = 1;
>> }
>>
>> With the dirty bit, this will look like it works, but on occasion
>> users will lose the race where they probe *x to see if the data was
>> lost and then the data gets lost before the next write comes in.
>>
>> Sure, that load from *x could be changed to RMW or users could do a
>> dummy write (e.g. x[1] = 1; if (*x == 1) ...), but people might forget
>> to do that, and the caching implications are a little bit worse.
>
> I think your example is the case what people abuse MADV_FREE.
> What happens if the object(ie, x) spans multiple pages?
> User should know object's memory align and investigate all of pages
> which span the object. Hmm, I don't think it's good for API.
>
>>
>> Note that switching to RMW is really really dangerous. Doing:
>>
>> *x &= 1;
>> if (*x == 1) ...;
>>
>> is safe on x86 if the compiler generates:
>>
>> andl $1, (%[x]);
>> cmpl $1, (%[x]);
>>
>> but is unsafe if the compiler generates:
>>
>> movl (%[x]), %eax;
>> andl $1, %eax;
>> movl %eax, (%[x]);
>> cmpl $1, %eax;
>>
>> and even worse if the write is omitted when "provably" unnecessary.
>>
>> OTOH, if switching to the accessed bit is too much of a mess, then
>> using the dirty bit at first isn't so bad.
>
> Thanks! I want to use dirty bit first.
>
> About access bit, I don't want to say it to mess but I guess it would
> change a lot subtle thing for all architectures. Because we have used
> access bit as just *hint* for aging while dirty bit is really
> *critical marker* for system integrity. A example in x86, we don't
> keep accuracy of access bit for reducing TLB flush IPI. I don't know
> what technique other arches have used but they might have.
>
> Thanks.
>

I think using the access bit is not easy to implement for anonymous pages in the kernel. We can be sure that an anonymous page is always PageDirty() if it is !PageSwapCache(), unless it is a MADV_FREE page. But if we use the access bit, how do we distinguish a normal anonymous page from a MADV_FREE page? It can be implemented with the access bit, but not easily; it needs more code changes.

Thanks
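The "dummy write, then check" pattern mentioned in the quoted discussion can be sketched in userspace C as follows. This is an illustration of the pattern being debated, not recommended usage: the helper names reuse_or_reinit() and madv_free_roundtrip() are hypothetical, the buffer is deliberately a single page (the multi-page case is exactly the objection raised in the thread), and it assumes the dirty-bit semantics discussed above, where a write to the page cancels the pending free. On systems whose headers lack MADV_FREE the madvise() call is simply skipped.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: decide whether the cached contents of a
 * one-page buffer survived a MADV_FREE.
 * Returns 1 if the old contents are still there, 0 if the page had
 * to be reinitialized.
 */
static int reuse_or_reinit(volatile unsigned char *buf)
{
	assert(buf != NULL);
	buf[1] = 1;          /* dummy write first, so the page is dirty again */
	if (buf[0] == 1)
		return 1;    /* contents survived */
	buf[0] = 1;          /* page came back zero-filled: reinitialize */
	return 0;
}

/* Map one anonymous page, mark it free-able, then try to reuse it.
 * Returns 1 or 0 as above, or -1 on setup failure. */
static int madv_free_roundtrip(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	int survived, ok;

	if (page <= 0)
		return -1;
	buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	buf[0] = 1;          /* "cached" data */
#ifdef MADV_FREE
	/* May fail on kernels without MADV_FREE; the sketch ignores that. */
	(void)madvise(buf, (size_t)page, MADV_FREE);
#endif
	survived = reuse_or_reinit(buf);
	ok = (buf[0] == 1);  /* valid again either way */
	munmap(buf, (size_t)page);
	return ok ? survived : -1;
}
```

Whether the contents survive depends on memory pressure at run time, so a caller can only rely on the buffer being valid after the call, not on which branch was taken.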
[PATCH V2] mm: change mm_vmscan_lru_shrink_inactive() proto types
Move the node_id, zone_idx and shrink flags computation into the trace function, so that we don't need to calculate these arguments if the trace is disabled, and so that this function takes fewer arguments.

Signed-off-by: yalin wang
---
 include/trace/events/vmscan.h | 14 +++---
 mm/vmscan.c                   |  7 ++-
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index dae7836..31763dd 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -352,11 +352,11 @@ TRACE_EVENT(mm_vmscan_writepage,

 TRACE_EVENT(mm_vmscan_lru_shrink_inactive,

-	TP_PROTO(int nid, int zid,
-			unsigned long nr_scanned, unsigned long nr_reclaimed,
-			int priority, int reclaim_flags),
+	TP_PROTO(struct zone *zone,
+		unsigned long nr_scanned, unsigned long nr_reclaimed,
+		int priority, int file),

-	TP_ARGS(nid, zid, nr_scanned, nr_reclaimed, priority, reclaim_flags),
+	TP_ARGS(zone, nr_scanned, nr_reclaimed, priority, file),

 	TP_STRUCT__entry(
 		__field(int, nid)
@@ -368,12 +368,12 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 	),

 	TP_fast_assign(
-		__entry->nid = nid;
-		__entry->zid = zid;
+		__entry->nid = zone_to_nid(zone);
+		__entry->zid = zone_idx(zone);
 		__entry->nr_scanned = nr_scanned;
 		__entry->nr_reclaimed = nr_reclaimed;
 		__entry->priority = priority;
-		__entry->reclaim_flags = reclaim_flags;
+		__entry->reclaim_flags = trace_shrink_flags(file);
 	),

 	TP_printk("nid=%d zid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 69ca1f5..f8fc8c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1691,11 +1691,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 			current_may_throttle())
 		wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);

-	trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id,
-		zone_idx(zone),
-		nr_scanned, nr_reclaimed,
-		sc->priority,
-		trace_shrink_flags(file));
+	trace_mm_vmscan_lru_shrink_inactive(zone, nr_scanned, nr_reclaimed,
+			sc->priority, file);
 	return nr_reclaimed;
 }

--
1.9.1
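The idea behind this patch, deriving the recorded values inside the trace hook so the work is skipped when tracing is off, can be sketched in plain userspace C. Everything here is hypothetical and stands in for the kernel machinery: trace_enabled models the tracepoint being switched on, derive_flags() models trace_shrink_flags(), and derive_calls only exists to make the effect observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct zone_sketch { int nid; int idx; };            /* stand-in for struct zone */
struct trace_record { int nid; int zid; int flags; };

static bool trace_enabled;               /* models "tracepoint is on" */
static int derive_calls;                 /* counts how often we derive */
static struct trace_record last_record;  /* models the ring-buffer entry */

/* Stand-in for a derived argument such as trace_shrink_flags(file). */
static int derive_flags(int file)
{
	derive_calls++;
	return file ? 0x2 : 0x1;
}

/* The caller passes only raw inputs; derivation happens in here,
 * and only when tracing is enabled. */
static void trace_shrink_inactive(const struct zone_sketch *zone, int file)
{
	assert(zone != NULL);
	if (!trace_enabled)
		return;
	last_record.nid = zone->nid;
	last_record.zid = zone->idx;
	last_record.flags = derive_flags(file);
}
```

With tracing disabled, trace_shrink_inactive() returns before derive_flags() runs, which is the saving the commit message describes.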
Re: [PATCH] mm: change mm_vmscan_lru_shrink_inactive() proto types
> On Nov 13, 2015, at 21:16, Vlastimil Babka wrote:
>
>

zone_to_nid() makes sense; I will send a V2 patch.
Re: [PATCH] mm: change may_enter_fs check condition
> On Nov 13, 2015, at 23:36, Michal Hocko wrote:
>
> On Fri 13-11-15 13:01:16, Vlastimil Babka wrote:
>> On 11/13/2015 12:47 PM, yalin wang wrote:
>>> Add page_is_file_cache() for __GFP_FS check,
>>> otherwise, a Pageswapcache() && PageDirty() page can always be write
>>> back if the gfp flag is __GFP_FS, this is not the expected behavior.
>>
>> I'm not sure I understand your point correctly *), but you seem to imply
>> that there would be an allocation that has __GFP_FS but doesn't have
>> __GFP_IO? Are there such allocations and does it make sense?
>
> No it doesn't. There is a natural layering here and __GFP_FS allocations
> should contain __GFP_IO.
>
> The patch as is makes only little sense to me. Are you seeing any issue
> which this is trying to fix?

Hmm, I don't see any actual issue in this part. I just felt confused when I read this code, so I made a patch for it. I am not sure whether __GFP_FS guarantees that the __GFP_IO flag is always set as well. If it does, I think a comment could be added here to make that clear. :)

Thanks
Re: [PATCH V4] mm: fix kernel crash in khugepaged thread
> On Nov 13, 2015, at 22:01, Steven Rostedt wrote:
>
> On Fri, 13 Nov 2015 19:54:11 +0800
> yalin wang wrote:
>
>>>>> TP_fast_assign(
>>>>> 	__entry->mm = mm;
>>>>> -	__entry->pfn = pfn;
>>>>> +	__entry->pfn = page_to_pfn(page);
>>>>
>>>> Instead of the condition, we could have:
>>>>
>>>> 	__entry->pfn = page ? page_to_pfn(page) : -1;
>>>
>>> I agree. Please do it like this.
>
> hmm, pfn is defined as an unsigned long, would -1 be the best.
> Or should it be (-1UL).
>
> Then we could also have:
>
> 	TP_printk("mm=%p, scan_pfn=0x%lx%s, writable=%d, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
> 		__entry->mm,
> 		__entry->pfn == (-1UL) ? 0 : __entry->pfn,
> 		__entry->pfn == (-1UL) ? "(null)" : "",
>
> Note the added %s after %lx I have in the print format.
>
> -- Steve

This format is not easy for the user-space perf tool to print. The perf tool prints an entry by looking up each member of the entry struct by its offset; here you print a dynamic string, and the user-space perf tool does not know how to print it.

Thanks
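On the "-1 or (-1UL)" question raised in the quoted text, a small userspace sketch may help. In C, when -1 is converted to unsigned long (as in the conditional expression below, or on assignment to an unsigned long field) it becomes ULONG_MAX, which is the same value as -1UL, so the two spellings store the same sentinel; writing -1UL only makes the intent explicit. The names PFN_NONE, struct page_sketch, pfn_or_sentinel() and format_pfn() are hypothetical, and format_pfn() merely mimics the suggested print format with snprintf().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PFN_NONE (-1UL) /* hypothetical sentinel for "no page" */

struct page_sketch { unsigned long pfn; }; /* stand-in for struct page */

/* Mirrors "page ? page_to_pfn(page) : -1": the int -1 is converted
 * to unsigned long, i.e. ULONG_MAX. */
static unsigned long pfn_or_sentinel(const struct page_sketch *page)
{
	return page ? page->pfn : -1;
}

/* Mirrors the suggested format: print 0 plus "(null)" for the sentinel. */
static int format_pfn(char *buf, size_t len, unsigned long pfn)
{
	assert(buf != NULL);
	return snprintf(buf, len, "scan_pfn=0x%lx%s",
			pfn == PFN_NONE ? 0UL : pfn,
			pfn == PFN_NONE ? "(null)" : "");
}
```

Note that this only covers the in-kernel formatting side; it does not address the concern in the reply about how user-space tools parse the print format.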
[PATCH V6] mm: fix kernel crash in khugepaged thread
This crash is caused by a NULL pointer dereference in the page_to_pfn() macro, when page == NULL:

[ 182.639154 ] Unable to handle kernel NULL pointer dereference at virtual address
[ 182.639491 ] pgd = ffc00077a000
[ 182.639761 ] [] *pgd=b9422003, *pud=b9422003, *pmd=b9423003, *pte=006008000707
[ 182.640749 ] Internal error: Oops: 9406 [#1] SMP
[ 182.641197 ] Modules linked in:
[ 182.641580 ] CPU: 1 PID: 26 Comm: khugepaged Tainted: GW 4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3
[ 182.642077 ] Hardware name: linux,dummy-virt (DT)
[ 182.642227 ] task: ffc07957c080 ti: ffc079638000 task.ti: ffc079638000
[ 182.642598 ] PC is at khugepaged+0x378/0x1af8
[ 182.642826 ] LR is at khugepaged+0x418/0x1af8
[ 182.643047 ] pc : [] lr : [] pstate: 6145
[ 182.643490 ] sp : ffc07963bca0
[ 182.643650 ] x29: ffc07963bca0 x28: ffc00075c000
[ 182.644024 ] x27: ffc00f275040 x26: ffc0006c7000
[ 182.644334 ] x25: 00e848800f51 x24: 0640
[ 182.644687 ] x23: 0002 x22:
[ 182.644972 ] x21: x20:
[ 182.645446 ] x19: x18: 007ff86d0990
[ 182.645931 ] x17: 007ef9c8 x16: ffc98390
[ 182.646236 ] x15: x14:
[ 182.646649 ] x13: 016a x12:
[ 182.647046 ] x11: ffc07f025020 x10:
[ 182.647395 ] x9 : 0048 x8 : ffc000721e28
[ 182.647872 ] x7 : x6 : ffc07f02d000
[ 182.648261 ] x5 : fe00 x4 : ffc00f275040
[ 182.648611 ] x3 : x2 : ffc00f2ad000
[ 182.648908 ] x1 : x0 : ffc000727000
[ 182.649147 ]
[ 182.649252 ] Process khugepaged (pid: 26, stack limit = 0xffc079638020)
[ 182.649724 ] Stack: (0xffc07963bca0 to 0xffc07963c000)
[ 182.650141 ] bca0: ffc07963be30 ffcb5044 ffc07961fb80 ffc00072e630
[ 182.650587 ] bcc0: ffc0005d5090 ffc000197d34
[ 182.651009 ] bce0:
[ 182.651446 ] bd00: ffc07963bd90 ffc07f1cbf80 4f3be003 ffc00f2750a4
[ 182.651956 ] bd20: ffc00f3bf000 ffc1 0001 ffc07f085740
[ 182.652520 ] bd40: ffc00f2ad188 ffc0 0620 ffc00f275040
[ 182.652972 ] bd60: ffc0006b1a90 ffc079638000 ffc07963be20 ffc00f0144d0
[ 182.653357 ] bd80: ffc0 0640 ffc00f0144d0 0a080001
[ 182.653793 ] bda0: 1001 ffc1 ffc07f025000 ffc00f2750a8
[ 182.654226 ] bdc0: 000105f8 ffc00075a000 06a0 ffc000727000
[ 182.654522 ] bde0: ffc0006e8478 ffc0 0001 ffc078fb9000
[ 182.654869 ] be00: ffc07963be30 ffc0 ffc07957c080 ffccfc4c
[ 182.655225 ] be20: ffc07963be20 ffc07963be20 ffc85c50
[ 182.655588 ] be40: ffcb4f64 ffc07961fb80
[ 182.656138 ] be60: ffcbee2c ffcb4f64
[ 182.656609 ] be80:
[ 182.657145 ] bea0: ffc07963bea0 ffc07963bea0 ffc0
[ 182.657475 ] bec0: ffc07963bec0 ffc07963bec0
[ 182.657922 ] bee0:
[ 182.658558 ] bf00:
[ 182.658972 ] bf20:
[ 182.659291 ] bf40:
[ 182.659722 ] bf60:
[ 182.660122 ] bf80:
[ 182.660654 ] bfa0:
[ 182.661064 ] bfc0: 0005
[ 182.661466 ] bfe0:
[ 182.661848 ] Call trace:
[ 182.662050 ] [] khugepaged+0x378/0x1af8
[ 182.662294 ] [] kthread+0xdc/0xf4
[ 182.662605 ] [] ret_from_fork+0xc/0x40
[ 182.663046 ] Code: 35001700 f0002c60 aa0703e3 f9009fa0 (f94000e0)
[ 182.663901 ] ---[ end trace 637503d8e28ae69e ]---
[ 182.664160 ] Kernel panic - not syncing: Fatal exception
[ 182.664571 ] CPU2: stopping
[ 182.664794 ] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G D W 4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3
[ 182.665248 ] Hardware name: linux,dummy-virt (DT)

Signed-off-by: yalin wang
---
 include/trace
[PATCH V5] mm: fix kernel crash in khugepaged thread
trace NULL page.

Signed-off-by: yalin wang
---
 include/trace/events/huge_memory.h | 12 ++--
 mm/huge_memory.c                   |  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 11c59ca..bfcf4a1 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -47,10 +47,10 @@ SCAN_STATUS

 TRACE_EVENT(mm_khugepaged_scan_pmd,

-	TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
+	TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
 		bool referenced, int none_or_zero, int status, int unmapped),

-	TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
+	TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),

 	TP_STRUCT__entry(
 		__field(struct mm_struct *, mm)
@@ -64,7 +64,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,

 	TP_fast_assign(
 		__entry->mm = mm;
-		__entry->pfn = pfn;
+		__entry->pfn = page ? page_to_pfn(page) : -1;
 		__entry->writable = writable;
 		__entry->referenced = referenced;
 		__entry->none_or_zero = none_or_zero;
@@ -108,10 +108,10 @@ TRACE_EVENT(mm_collapse_huge_page,

 TRACE_EVENT(mm_collapse_huge_page_isolate,

-	TP_PROTO(unsigned long pfn, int none_or_zero,
+	TP_PROTO(struct page *page, int none_or_zero,
 		bool referenced, bool writable, int status),

-	TP_ARGS(pfn, none_or_zero, referenced, writable, status),
+	TP_ARGS(page, none_or_zero, referenced, writable, status),

 	TP_STRUCT__entry(
 		__field(unsigned long, pfn)
@@ -122,7 +122,7 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
 	),

 	TP_fast_assign(
-		__entry->pfn = pfn;
+		__entry->pfn = page ? page_to_pfn(page) : -1;
 		__entry->none_or_zero = none_or_zero;
 		__entry->referenced = referenced;
 		__entry->writable = writable;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 67b00a1..fb3c4f8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1977,7 +1977,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	if (likely(writable)) {
 		if (likely(referenced)) {
 			result = SCAN_SUCCEED;
-			trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+			trace_mm_collapse_huge_page_isolate(page, none_or_zero,
 							    referenced, writable, result);
 			return 1;
 		}
@@ -1987,7 +1987,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,

 out:
 	release_pte_pages(pte, _pte);
-	trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+	trace_mm_collapse_huge_page_isolate(page, none_or_zero,
 					    referenced, writable, result);
 	return 0;
 }
@@ -2530,7 +2530,7 @@ out_unmap:
 		collapse_huge_page(mm, address, hpage, vma, node);
 	}
 out:
-	trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
+	trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
 				     none_or_zero, result, unmapped);
 	return ret;
 }
--
1.9.1
Re: [PATCH V4] mm: fix kernel crash in khugepaged thread
> On Nov 13, 2015, at 18:47, Vlastimil Babka wrote:
>
> On 11/12/2015 03:29 PM, Steven Rostedt wrote:
>> On Thu, 12 Nov 2015 16:21:02 +0800
>> yalin wang wrote:
>>
>>> This crash is caused by NULL pointer deference, in page_to_pfn() marco,
>>> when page == NULL :
>>>
>>> [ 182.639154 ] Unable to handle kernel NULL pointer dereference at virtual
>>> address
>>
>>
>>> add the trace point with TP_CONDITION(page),
>>
>> I wonder if we still want to trace even if page is NULL?
>
> I'd say we want to. There's even a "SCAN_PAGE_NULL" result defined for that
> case, and otherwise we would only have to guess why collapsing failed, which
> is the thing that the tracepoint should help us find out in the first place :)
>
>>> avoid trace NULL page.
>>>
>>> Signed-off-by: yalin wang
>>> ---
>>> include/trace/events/huge_memory.h | 20
>>> mm/huge_memory.c | 6 +++---
>>> 2 files changed, 15 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/include/trace/events/huge_memory.h
>>> b/include/trace/events/huge_memory.h
>>> index 11c59ca..727647b 100644
>>> --- a/include/trace/events/huge_memory.h
>>> +++ b/include/trace/events/huge_memory.h
>>> @@ -45,12 +45,14 @@ SCAN_STATUS
>>> #define EM(a, b) {a, b},
>>> #define EMe(a, b) {a, b}
>>>
>>> -TRACE_EVENT(mm_khugepaged_scan_pmd,
>>> +TRACE_EVENT_CONDITION(mm_khugepaged_scan_pmd,
>>>
>>> - TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
>>> + TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
>>> bool referenced, int none_or_zero, int status, int unmapped),
>>>
>>> - TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
>>> + TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),
>>> +
>>> + TP_CONDITION(page),
>>>
>>> TP_STRUCT__entry(
>>> __field(struct mm_struct *, mm)
>>> @@ -64,7 +66,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
>>>
>>> TP_fast_assign(
>>> __entry->mm = mm;
>>> - __entry->pfn = pfn;
>>> + __entry->pfn = page_to_pfn(page);
>>
>> Instead of the condition, we could have:
>>
>> __entry->pfn = page ? page_to_pfn(page) : -1;
>
> I agree. Please do it like this.

OK, I will send a V5 patch.
[PATCH] mm: change may_enter_fs check condition
Add a page_is_file_cache() test to the __GFP_FS check. Otherwise, a PageSwapCache() && PageDirty() page can always be written back if the gfp mask has __GFP_FS, which is not the expected behavior.

Signed-off-by: yalin wang
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bd2918e..f8fc8c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -930,7 +930,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		if (page_mapped(page) || PageSwapCache(page))
 			sc->nr_scanned++;

-		may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
+		may_enter_fs = (page_is_file_cache(page) && (sc->gfp_mask & __GFP_FS)) ||
 			(PageSwapCache(page) && (sc->gfp_mask & __GFP_IO));

 		/*
--
1.9.1
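To see where the old and new conditions actually differ, the two expressions can be compared in a small userspace sketch. The flag values GFP_IO_SKETCH and GFP_FS_SKETCH and the struct page_sketch type are hypothetical stand-ins, not the kernel's definitions; only the boolean structure of the two expressions is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values; the real __GFP_* values differ. */
#define GFP_IO_SKETCH 0x1u
#define GFP_FS_SKETCH 0x2u

struct page_sketch {
	bool file_cache; /* models page_is_file_cache(page) */
	bool swap_cache; /* models PageSwapCache(page) */
};

/* Condition before the patch. */
static bool may_enter_fs_before(unsigned int gfp, const struct page_sketch *p)
{
	assert(p != NULL);
	return (gfp & GFP_FS_SKETCH) ||
	       (p->swap_cache && (gfp & GFP_IO_SKETCH));
}

/* Condition after the patch. */
static bool may_enter_fs_after(unsigned int gfp, const struct page_sketch *p)
{
	assert(p != NULL);
	return (p->file_cache && (gfp & GFP_FS_SKETCH)) ||
	       (p->swap_cache && (gfp & GFP_IO_SKETCH));
}
```

For a swap-cache page the two versions disagree only when the mask has the FS bit without the IO bit. As the reviewers point out in the replies, __GFP_FS allocations are expected to contain __GFP_IO, so that combination should not occur in practice.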
Re: [PATCH V4] mm: fix kernel crash in khugepaged thread
> On Nov 13, 2015, at 16:41, Hillf Danton wrote:
>
>>
>> Instead of the condition, we could have:
>>
>> __entry->pfn = page ? page_to_pfn(page) : -1;
>>
>>
>> But if there's no reason to do the tracepoint if page is NULL, then
>> this patch is fine. I'm just throwing out this idea.
>>
> we trace only if page is valid
>
> --- linux-next/mm/huge_memory.c Fri Nov 13 16:00:22 2015
> +++ b/mm/huge_memory.c Fri Nov 13 16:26:19 2015
> @@ -1987,7 +1987,8 @@ static int __collapse_huge_page_isolate(
>
> out:
> release_pte_pages(pte, _pte);
> - trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
> + if (page)
> + trace_mm_collapse_huge_page_isolate(page_to_pfn(page),
> none_or_zero,
> referenced, writable, result);
> return 0;
> }
> --
>

My V4 patch moves the if (!page) check into the trace function, so that page_to_pfn() does not need to be called if the trace is disabled. That is more efficient.

Thanks
Re: [PATCH V4] mm: fix kernel crash in khugepaged thread
> On Nov 13, 2015, at 16:41, Hillf Dantonwrote: > >> >> Instead of the condition, we could have: >> >> __entry->pfn = page ? page_to_pfn(page) : -1; >> >> >> But if there's no reason to do the tracepoint if page is NULL, then >> this patch is fine. I'm just throwing out this idea. >> > we trace only if page is valid > > --- linux-next/mm/huge_memory.c Fri Nov 13 16:00:22 2015 > +++ b/mm/huge_memory.cFri Nov 13 16:26:19 2015 > @@ -1987,7 +1987,8 @@ static int __collapse_huge_page_isolate( > > out: > release_pte_pages(pte, _pte); > - trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero, > + if (page) > + trace_mm_collapse_huge_page_isolate(page_to_pfn(page), > none_or_zero, > referenced, writable, result); > return 0; > } > — > my V4 patch move if (!page) into trace function, so that we don’t need call page_to_fn() if the trace if disabled . more efficient . Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm: change may_enter_fs check condition
Add a page_is_file_cache() check to the __GFP_FS test. Otherwise, a PageSwapCache() && PageDirty() page can always be written back if the gfp flags include __GFP_FS, which is not the expected behavior.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bd2918e..f8fc8c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -930,7 +930,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		if (page_mapped(page) || PageSwapCache(page))
 			sc->nr_scanned++;

-		may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
+		may_enter_fs = (page_is_file_cache(page) && (sc->gfp_mask & __GFP_FS)) ||
 			(PageSwapCache(page) && (sc->gfp_mask & __GFP_IO));

 		/*
--
1.9.1
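The behavioral difference between the old and the proposed condition can be sketched in user space. This is only a model of the patch as posted: the flag values and `struct page_model` below are hypothetical stand-ins, not the kernel's definitions of `__GFP_FS`, `__GFP_IO` or `struct page`.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real __GFP_IO and __GFP_FS values differ. */
#define MODEL_GFP_IO 0x1u
#define MODEL_GFP_FS 0x2u

/* Simplified model of the two page properties the condition looks at. */
struct page_model {
	bool file_cache;	/* models page_is_file_cache(page) */
	bool swap_cache;	/* models PageSwapCache(page) */
};

/* Condition before the patch: __GFP_FS alone is enough for any page. */
static bool may_enter_fs_old(const struct page_model *p, unsigned int gfp)
{
	return (gfp & MODEL_GFP_FS) ||
	       (p->swap_cache && (gfp & MODEL_GFP_IO));
}

/* Condition proposed by the patch: __GFP_FS only counts for file pages. */
static bool may_enter_fs_new(const struct page_model *p, unsigned int gfp)
{
	return (p->file_cache && (gfp & MODEL_GFP_FS)) ||
	       (p->swap_cache && (gfp & MODEL_GFP_IO));
}
```

For a swap-cache page and a mask that has only the FS bit set, the old condition is true and the new one is false; for file pages, and for masks that also carry the IO bit, the two agree.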
[PATCH V6] mm: fix kernel crash in khugepaged thread
This crash is caused by NULL pointer deference, in page_to_pfn() marco, when page == NULL : [ 182.639154 ] Unable to handle kernel NULL pointer dereference at virtual address [ 182.639491 ] pgd = ffc00077a000 [ 182.639761 ] [] *pgd=b9422003, *pud=b9422003, *pmd=b9423003, *pte=006008000707 [ 182.640749 ] Internal error: Oops: 9406 [#1] SMP [ 182.641197 ] Modules linked in: [ 182.641580 ] CPU: 1 PID: 26 Comm: khugepaged Tainted: GW 4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3 [ 182.642077 ] Hardware name: linux,dummy-virt (DT) [ 182.642227 ] task: ffc07957c080 ti: ffc079638000 task.ti: ffc079638000 [ 182.642598 ] PC is at khugepaged+0x378/0x1af8 [ 182.642826 ] LR is at khugepaged+0x418/0x1af8 [ 182.643047 ] pc : [] lr : [] pstate: 6145 [ 182.643490 ] sp : ffc07963bca0 [ 182.643650 ] x29: ffc07963bca0 x28: ffc00075c000 [ 182.644024 ] x27: ffc00f275040 x26: ffc0006c7000 [ 182.644334 ] x25: 00e848800f51 x24: 0640 [ 182.644687 ] x23: 0002 x22: [ 182.644972 ] x21: x20: [ 182.645446 ] x19: x18: 007ff86d0990 [ 182.645931 ] x17: 007ef9c8 x16: ffc98390 [ 182.646236 ] x15: x14: [ 182.646649 ] x13: 016a x12: [ 182.647046 ] x11: ffc07f025020 x10: [ 182.647395 ] x9 : 0048 x8 : ffc000721e28 [ 182.647872 ] x7 : x6 : ffc07f02d000 [ 182.648261 ] x5 : fe00 x4 : ffc00f275040 [ 182.648611 ] x3 : x2 : ffc00f2ad000 [ 182.648908 ] x1 : x0 : ffc000727000 [ 182.649147 ] [ 182.649252 ] Process khugepaged (pid: 26, stack limit = 0xffc079638020) [ 182.649724 ] Stack: (0xffc07963bca0 to 0xffc07963c000) [ 182.650141 ] bca0: ffc07963be30 ffcb5044 ffc07961fb80 ffc00072e630 [ 182.650587 ] bcc0: ffc0005d5090 ffc000197d34 [ 182.651009 ] bce0: [ 182.651446 ] bd00: ffc07963bd90 ffc07f1cbf80 4f3be003 ffc00f2750a4 [ 182.651956 ] bd20: ffc00f3bf000 ffc1 0001 ffc07f085740 [ 182.652520 ] bd40: ffc00f2ad188 ffc0 0620 ffc00f275040 [ 182.652972 ] bd60: ffc0006b1a90 ffc079638000 ffc07963be20 ffc00f0144d0 [ 182.653357 ] bd80: ffc0 0640 ffc00f0144d0 0a080001 [ 182.653793 ] bda0: 1001 ffc1 ffc07f025000 
ffc00f2750a8 [ 182.654226 ] bdc0: 000105f8 ffc00075a000 06a0 ffc000727000 [ 182.654522 ] bde0: ffc0006e8478 ffc0 0001 ffc078fb9000 [ 182.654869 ] be00: ffc07963be30 ffc0 ffc07957c080 ffccfc4c [ 182.655225 ] be20: ffc07963be20 ffc07963be20 ffc85c50 [ 182.655588 ] be40: ffcb4f64 ffc07961fb80 [ 182.656138 ] be60: ffcbee2c ffcb4f64 [ 182.656609 ] be80: [ 182.657145 ] bea0: ffc07963bea0 ffc07963bea0 ffc0 [ 182.657475 ] bec0: ffc07963bec0 ffc07963bec0 [ 182.657922 ] bee0: [ 182.658558 ] bf00: [ 182.658972 ] bf20: [ 182.659291 ] bf40: [ 182.659722 ] bf60: [ 182.660122 ] bf80: [ 182.660654 ] bfa0: [ 182.661064 ] bfc0: 0005 [ 182.661466 ] bfe0: [ 182.661848 ] Call trace: [ 182.662050 ] [] khugepaged+0x378/0x1af8 [ 182.662294 ] [] kthread+0xdc/0xf4 [ 182.662605 ] [] ret_from_fork+0xc/0x40 [ 182.663046 ] Code: 35001700 f0002c60 aa0703e3 f9009fa0 (f94000e0) [ 182.663901 ] ---[ end trace 637503d8e28ae69e ]--- [ 182.664160 ] Kernel panic - not syncing: Fatal exception [ 182.664571 ] CPU2: stopping [ 182.664794 ] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G D W 4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3 [ 182.665248 ] Hardware name: linux,dummy-virt (DT) Signed-off-by: yalin wang <yalin.wa
Re: [PATCH V4] mm: fix kernel crash in khugepaged thread
> On Nov 13, 2015, at 18:47, Vlastimil Babka <vba...@suse.cz> wrote: > > On 11/12/2015 03:29 PM, Steven Rostedt wrote: >> On Thu, 12 Nov 2015 16:21:02 +0800 >> yalin wang <yalin.wang2...@gmail.com> wrote: >> >>> This crash is caused by NULL pointer deference, in page_to_pfn() marco, >>> when page == NULL : >>> >>> [ 182.639154 ] Unable to handle kernel NULL pointer dereference at virtual >>> address >> >> >>> add the trace point with TP_CONDITION(page), >> >> I wonder if we still want to trace even if page is NULL? > > I'd say we want to. There's even a "SCAN_PAGE_NULL" result defined for that > case, and otherwise we would only have to guess why collapsing failed, which > is the thing that the tracepoint should help us find out in the first place :) > >>> avoid trace NULL page. >>> >>> Signed-off-by: yalin wang <yalin.wang2...@gmail.com> >>> --- >>> include/trace/events/huge_memory.h | 20 >>> mm/huge_memory.c | 6 +++--- >>> 2 files changed, 15 insertions(+), 11 deletions(-) >>> >>> diff --git a/include/trace/events/huge_memory.h >>> b/include/trace/events/huge_memory.h >>> index 11c59ca..727647b 100644 >>> --- a/include/trace/events/huge_memory.h >>> +++ b/include/trace/events/huge_memory.h >>> @@ -45,12 +45,14 @@ SCAN_STATUS >>> #define EM(a, b) {a, b}, >>> #define EMe(a, b) {a, b} >>> >>> -TRACE_EVENT(mm_khugepaged_scan_pmd, >>> +TRACE_EVENT_CONDITION(mm_khugepaged_scan_pmd, >>> >>> - TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable, >>> + TP_PROTO(struct mm_struct *mm, struct page *page, bool writable, >>> bool referenced, int none_or_zero, int status, int unmapped), >>> >>> - TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped), >>> + TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped), >>> + >>> + TP_CONDITION(page), >>> >>> TP_STRUCT__entry( >>> __field(struct mm_struct *, mm) >>> @@ -64,7 +66,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd, >>> >>> TP_fast_assign( >>> __entry->mm = mm; >>> - __entry->pfn = pfn; 
>>> + __entry->pfn = page_to_pfn(page); >> >> Instead of the condition, we could have: >> >> __entry->pfn = page ? page_to_pfn(page) : -1; > > I agree. Please do it like this. ok , i will send V5 patch .-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH V5] mm: fix kernel crash in khugepaged thread
trace NULL page.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 include/trace/events/huge_memory.h | 12 ++--
 mm/huge_memory.c | 6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 11c59ca..bfcf4a1 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -47,10 +47,10 @@ SCAN_STATUS

 TRACE_EVENT(mm_khugepaged_scan_pmd,

-	TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
+	TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
 		bool referenced, int none_or_zero, int status, int unmapped),

-	TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
+	TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),

 	TP_STRUCT__entry(
 		__field(struct mm_struct *, mm)
@@ -64,7 +64,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,

 	TP_fast_assign(
 		__entry->mm = mm;
-		__entry->pfn = pfn;
+		__entry->pfn = page ? page_to_pfn(page) : -1;
 		__entry->writable = writable;
 		__entry->referenced = referenced;
 		__entry->none_or_zero = none_or_zero;
@@ -108,10 +108,10 @@ TRACE_EVENT(mm_collapse_huge_page,

 TRACE_EVENT(mm_collapse_huge_page_isolate,

-	TP_PROTO(unsigned long pfn, int none_or_zero,
+	TP_PROTO(struct page *page, int none_or_zero,
 		bool referenced, bool writable, int status),

-	TP_ARGS(pfn, none_or_zero, referenced, writable, status),
+	TP_ARGS(page, none_or_zero, referenced, writable, status),

 	TP_STRUCT__entry(
 		__field(unsigned long, pfn)
@@ -122,7 +122,7 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
 	),

 	TP_fast_assign(
-		__entry->pfn = pfn;
+		__entry->pfn = page ? page_to_pfn(page) : -1;
 		__entry->none_or_zero = none_or_zero;
 		__entry->referenced = referenced;
 		__entry->writable = writable;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 67b00a1..fb3c4f8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1977,7 +1977,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	if (likely(writable)) {
 		if (likely(referenced)) {
 			result = SCAN_SUCCEED;
-			trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+			trace_mm_collapse_huge_page_isolate(page, none_or_zero,
 							    referenced, writable, result);
 			return 1;
 		}
@@ -1987,7 +1987,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,

 out:
 	release_pte_pages(pte, _pte);
-	trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+	trace_mm_collapse_huge_page_isolate(page, none_or_zero,
 					    referenced, writable, result);
 	return 0;
 }
@@ -2530,7 +2530,7 @@ out_unmap:
 		collapse_huge_page(mm, address, hpage, vma, node);
 	}
 out:
-	trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
+	trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
 				     none_or_zero, result, unmapped);
 	return ret;
 }
--
1.9.1
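The guarded assignment used in this version can be modeled outside the kernel. The sketch below is an illustration under stated assumptions: `struct page_model` and `model_page_to_pfn()` are hypothetical stand-ins for `struct page` and `page_to_pfn()`, and `trace_pfn()` is an invented name for the expression in `TP_fast_assign()`.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the pfn matters here. */
struct page_model {
	unsigned long pfn;
};

/*
 * Stand-in for page_to_pfn(). It dereferences its argument, so, as in
 * the reported crash, it must never be handed a NULL pointer.
 */
static unsigned long model_page_to_pfn(const struct page_model *page)
{
	return page->pfn;
}

/*
 * Mirrors the V5 assignment:
 *     __entry->pfn = page ? page_to_pfn(page) : -1;
 * The field is an unsigned long, so -1 is stored as the all-ones value.
 */
static unsigned long trace_pfn(const struct page_model *page)
{
	return page ? model_page_to_pfn(page) : (unsigned long)-1;
}
```

With this form the event is still recorded when there is no page, which keeps the scan result visible in the trace; the all-ones pfn simply marks "no page".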
[PATCH V2] mmc: change to use kmalloc
Use kmalloc instead of kzalloc; zeroing the memory is not needed.

Signed-off-by: yalin wang
---
 drivers/mmc/card/block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index c742cfd..c3fd4c8 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -345,7 +345,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user(
 	struct mmc_blk_ioc_data *idata;
 	int err;

-	idata = kzalloc(sizeof(*idata), GFP_KERNEL);
+	idata = kmalloc(sizeof(*idata), GFP_KERNEL);
 	if (!idata) {
 		err = -ENOMEM;
 		goto out;
@@ -365,7 +365,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user(
 	if (!idata->buf_bytes)
 		return idata;

-	idata->buf = kzalloc(idata->buf_bytes, GFP_KERNEL);
+	idata->buf = kmalloc(idata->buf_bytes, GFP_KERNEL);
 	if (!idata->buf) {
 		err = -ENOMEM;
 		goto idata_err;
--
1.9.1
Re: [PATCH] mmc: change to use kmalloc
> On Nov 12, 2015, at 05:57, Andy Shevchenko wrote:
>
> On Wed, Nov 11, 2015 at 11:17 PM, Peter Hurley wrote:
>> On 11/11/2015 12:02 PM, Alim Akhtar wrote:
>>> Hi Yalin,
>>>
>>> On Wed, Nov 11, 2015 at 9:53 AM, yalin wang wrote:
>>>> Use kmalloc instead of kzalloc, zero the memory is not needed.
>>>>
>>> why you want to do this? what problem you faces, and how this resolves the
>>> same?
>>
>> The patch fixes an inefficiency: explicitly zeroing memory that is then
>> immediately overwritten 6 lines below is wasteful.
>
> It might fix previous kzalloc as well, though better not to do since
> it's error prone.
>
Yeah, I will send a new patch.
>>
>> Regards,
>> Peter Hurley
>>
>>>> Signed-off-by: yalin wang
>>>> ---
>>>> drivers/mmc/card/block.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
>>>> index 23b6c8e..975cd3e 100644
>>>> --- a/drivers/mmc/card/block.c
>>>> +++ b/drivers/mmc/card/block.c
>>>> @@ -365,7 +365,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user(
>>>>  	if (!idata->buf_bytes)
>>>>  		return idata;
>>>>
>>>> -	idata->buf = kzalloc(idata->buf_bytes, GFP_KERNEL);
>>>> +	idata->buf = kmalloc(idata->buf_bytes, GFP_KERNEL);
>>>>  	if (!idata->buf) {
>>>>  		err = -ENOMEM;
>>>>  		goto idata_err;
>>>> --
>>>> 1.9.1
>
>
>
> --
> With Best Regards,
> Andy Shevchenko
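The two positions in this exchange can be illustrated in user space. This is a minimal sketch, assuming `malloc()` and `calloc()` as stand-ins for `kmalloc()` and `kzalloc()`, and `memcpy()` as a stand-in for a successful copy from user space; `struct ioc_model` and both function names are hypothetical and are not the driver's `struct mmc_blk_ioc_data` or its code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Peter Hurley's case: the whole buffer is overwritten right after the
 * allocation, so zeroing it first would be wasted work.
 */
static unsigned char *dup_buffer(const unsigned char *src, size_t len)
{
	unsigned char *buf = malloc(len);	/* contents undefined, like kmalloc() */

	if (!buf)
		return NULL;
	memcpy(buf, src, len);			/* every byte written before any read */
	return buf;
}

/* Hypothetical structure with fields that are not all set at allocation. */
struct ioc_model {
	int opcode;
	int flags;
	size_t buf_bytes;
};

/*
 * Andy Shevchenko's caution: when only some fields are assigned, a zeroing
 * allocator guarantees the rest start out as 0. Dropping the zeroing here
 * would only be safe if every field were assigned before it is read.
 */
static struct ioc_model *new_ioc_zeroed(int opcode)
{
	struct ioc_model *ic = calloc(1, sizeof(*ic));	/* zeroed, like kzalloc() */

	if (!ic)
		return NULL;
	ic->opcode = opcode;	/* flags and buf_bytes stay 0 */
	return ic;
}
```

The design question in the thread is which of these two situations each allocation in `mmc_blk_ioctl_copy_from_user()` is in; the sketch does not answer that for the real driver.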
[PATCH V4] mm: fix kernel crash in khugepaged thread
trace NULL page. Signed-off-by: yalin wang --- include/trace/events/huge_memory.h | 20 mm/huge_memory.c | 6 +++--- 2 files changed, 15 insertions(+), 11 deletions(-) diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h index 11c59ca..727647b 100644 --- a/include/trace/events/huge_memory.h +++ b/include/trace/events/huge_memory.h @@ -45,12 +45,14 @@ SCAN_STATUS #define EM(a, b) {a, b}, #define EMe(a, b) {a, b} -TRACE_EVENT(mm_khugepaged_scan_pmd, +TRACE_EVENT_CONDITION(mm_khugepaged_scan_pmd, - TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable, + TP_PROTO(struct mm_struct *mm, struct page *page, bool writable, bool referenced, int none_or_zero, int status, int unmapped), - TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped), + TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped), + + TP_CONDITION(page), TP_STRUCT__entry( __field(struct mm_struct *, mm) @@ -64,7 +66,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd, TP_fast_assign( __entry->mm = mm; - __entry->pfn = pfn; + __entry->pfn = page_to_pfn(page); __entry->writable = writable; __entry->referenced = referenced; __entry->none_or_zero = none_or_zero; @@ -106,12 +108,14 @@ TRACE_EVENT(mm_collapse_huge_page, __print_symbolic(__entry->status, SCAN_STATUS)) ); -TRACE_EVENT(mm_collapse_huge_page_isolate, +TRACE_EVENT_CONDITION(mm_collapse_huge_page_isolate, - TP_PROTO(unsigned long pfn, int none_or_zero, + TP_PROTO(struct page *page, int none_or_zero, bool referenced, bool writable, int status), - TP_ARGS(pfn, none_or_zero, referenced, writable, status), + TP_ARGS(page, none_or_zero, referenced, writable, status), + + TP_CONDITION(page), TP_STRUCT__entry( __field(unsigned long, pfn) @@ -122,7 +126,7 @@ TRACE_EVENT(mm_collapse_huge_page_isolate, ), TP_fast_assign( - __entry->pfn = pfn; + __entry->pfn = page_to_pfn(page); __entry->none_or_zero = none_or_zero; __entry->referenced = referenced; __entry->writable = writable; diff --git 
a/mm/huge_memory.c b/mm/huge_memory.c index 67b00a1..fb3c4f8 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1977,7 +1977,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, if (likely(writable)) { if (likely(referenced)) { result = SCAN_SUCCEED; - trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero, + trace_mm_collapse_huge_page_isolate(page, none_or_zero, referenced, writable, result); return 1; } @@ -1987,7 +1987,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, out: release_pte_pages(pte, _pte); - trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero, + trace_mm_collapse_huge_page_isolate(page, none_or_zero, referenced, writable, result); return 0; } @@ -2530,7 +2530,7 @@ out_unmap: collapse_huge_page(mm, address, hpage, vma, node); } out: - trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced, + trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced, none_or_zero, result, unmapped); return ret; } -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm: change mm_vmscan_lru_shrink_inactive() proto types
Move the node_id, zone_idx and shrink flags computation into the trace function, so that we don't need to calculate these arguments if the trace is disabled, and so that the function takes fewer arguments.

Signed-off-by: yalin wang
---
 include/trace/events/vmscan.h | 14 +++---
 mm/vmscan.c | 7 ++-
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index dae7836..f8d6b34 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -352,11 +352,11 @@ TRACE_EVENT(mm_vmscan_writepage,

 TRACE_EVENT(mm_vmscan_lru_shrink_inactive,

-	TP_PROTO(int nid, int zid,
-		unsigned long nr_scanned, unsigned long nr_reclaimed,
-		int priority, int reclaim_flags),
+	TP_PROTO(struct zone *zone,
+		unsigned long nr_scanned, unsigned long nr_reclaimed,
+		int priority, int file),

-	TP_ARGS(nid, zid, nr_scanned, nr_reclaimed, priority, reclaim_flags),
+	TP_ARGS(zone, nr_scanned, nr_reclaimed, priority, file),

 	TP_STRUCT__entry(
 		__field(int, nid)
@@ -368,12 +368,12 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 	),

 	TP_fast_assign(
-		__entry->nid = nid;
-		__entry->zid = zid;
+		__entry->nid = zone->zone_pgdat->node_id;
+		__entry->zid = zone_idx(zone);
 		__entry->nr_scanned = nr_scanned;
 		__entry->nr_reclaimed = nr_reclaimed;
 		__entry->priority = priority;
-		__entry->reclaim_flags = reclaim_flags;
+		__entry->reclaim_flags = trace_shrink_flags(file);
 	),

 	TP_printk("nid=%d zid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 83cea53..bd2918e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1691,11 +1691,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 			current_may_throttle())
 		wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);

-	trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id,
-		zone_idx(zone),
-		nr_scanned, nr_reclaimed,
-		sc->priority,
-		trace_shrink_flags(file));
+	trace_mm_vmscan_lru_shrink_inactive(zone, nr_scanned, nr_reclaimed,
+		sc->priority, file);

 	return nr_reclaimed;
 }
--
1.9.1
[PATCH] mm: change trace_mm_vmscan_writepage() proto type
Move the trace_reclaim_flags() call into the trace function, so that we don't need to calculate these flags if the trace is disabled.

Signed-off-by: yalin wang
---
 include/trace/events/vmscan.h | 7 +++
 mm/vmscan.c | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index f66476b..dae7836 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -330,10 +330,9 @@ DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_memcg_isolate,

 TRACE_EVENT(mm_vmscan_writepage,

-	TP_PROTO(struct page *page,
-		int reclaim_flags),
+	TP_PROTO(struct page *page),

-	TP_ARGS(page, reclaim_flags),
+	TP_ARGS(page),

 	TP_STRUCT__entry(
 		__field(unsigned long, pfn)
@@ -342,7 +341,7 @@ TRACE_EVENT(mm_vmscan_writepage,

 	TP_fast_assign(
 		__entry->pfn = page_to_pfn(page);
-		__entry->reclaim_flags = reclaim_flags;
+		__entry->reclaim_flags = trace_reclaim_flags(page);
 	),

 	TP_printk("page=%p pfn=%lu flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a4507ec..83cea53 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -594,7 +594,7 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
 			/* synchronous write or broken a_ops? */
 			ClearPageReclaim(page);
 		}
-		trace_mm_vmscan_writepage(page, trace_reclaim_flags(page));
+		trace_mm_vmscan_writepage(page);
 		inc_zone_page_state(page, NR_VMSCAN_WRITE);
 		return PAGE_SUCCESS;
 	}
--
1.9.1
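The argument made in this commit message can be modeled with a small user-space sketch. Everything in it is an assumption for illustration: a plain `bool` stands in for the tracepoint's enable check, `compute_flags()` stands in for `trace_reclaim_flags()`, and the two `trace_writepage_*()` functions are invented names for the old and new calling conventions. It says nothing about what the compiler does with the real kernel tracepoint code.

```c
#include <stdbool.h>

static bool trace_enabled;	/* models whether the tracepoint is switched on */
static int flag_computations;	/* counts how often the helper actually ran */
static int last_flags;		/* models the recorded trace entry */

/* Hypothetical stand-in for trace_reclaim_flags(page). */
static int compute_flags(int page_bits)
{
	flag_computations++;
	return page_bits | 0x10;
}

/* Old convention: the caller computes the flags and passes them in. */
static void trace_writepage_old(int flags)
{
	if (trace_enabled)
		last_flags = flags;
}

/* New convention: the raw input is passed and the flags are derived inside. */
static void trace_writepage_new(int page_bits)
{
	if (trace_enabled)
		last_flags = compute_flags(page_bits);
}
```

In this model, calling `trace_writepage_old(compute_flags(x))` runs the helper even while tracing is off, whereas `trace_writepage_new(x)` runs it only once tracing is on.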
Re: [PATCH V2] mm: fix kernel crash in khugepaged thread
Ok i will send a V3 patch.

> On Nov 5, 2015, at 16:50, Kirill A. Shutemov wrote:
>
> On Thu, Nov 05, 2015 at 09:12:34AM +0100, Vlastimil Babka wrote:
>> On 10/29/2015 01:35 AM, Kirill A. Shutemov wrote:
>>>> @@ -2605,9 +2603,9 @@ out_unmap:
>>>>  		/* collapse_huge_page will return with the mmap_sem released */
>>>>  		collapse_huge_page(mm, address, hpage, vma, node);
>>>>  	}
>>>> -out:
>>>> -	trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
>>>> -				     none_or_zero, result, unmapped);
>>>> +	trace_mm_khugepaged_scan_pmd(mm, pte_present(pteval) ?
>>>> +				     pte_pfn(pteval) : -1, writable, referenced,
>>>> +				     none_or_zero, result, unmapped);
>>>
>>> maybe passing down pte instead of pfn?
>>
>> Maybe just pass the page, and have tracepoint's fast assign check for !NULL
>> and do page_to_pfn itself? That way the complexity and overhead is only in
>> the tracepoint and when enabled.
>
> Agreed.
>
> --
> Kirill A. Shutemov
[PATCH V3] mm: fix kernel crash in khugepaged thread
trace NULL page. Signed-off-by: yalin wang --- include/trace/events/huge_memory.h | 10 ++ mm/huge_memory.c | 2 +- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h index 11c59ca..369d912 100644 --- a/include/trace/events/huge_memory.h +++ b/include/trace/events/huge_memory.h @@ -45,12 +45,14 @@ SCAN_STATUS #define EM(a, b) {a, b}, #define EMe(a, b) {a, b} -TRACE_EVENT(mm_khugepaged_scan_pmd, +TRACE_EVENT_CONDITION(mm_khugepaged_scan_pmd, - TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable, + TP_PROTO(struct mm_struct *mm, struct page *page, bool writable, bool referenced, int none_or_zero, int status, int unmapped), - TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped), + TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped), + + TP_CONDITION(page), TP_STRUCT__entry( __field(struct mm_struct *, mm) @@ -64,7 +66,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd, TP_fast_assign( __entry->mm = mm; - __entry->pfn = pfn; + __entry->pfn = page_to_pfn(page); __entry->writable = writable; __entry->referenced = referenced; __entry->none_or_zero = none_or_zero; diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 67b00a1..ff2b105 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2530,7 +2530,7 @@ out_unmap: collapse_huge_page(mm, address, hpage, vma, node); } out: - trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced, + trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced, none_or_zero, result, unmapped); return ret; } -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm: change mm_vmscan_lru_shrink_inactive() proto types
Move the node_id, zone_idx and shrink flags computation into the trace function, so that we don't need to calculate these args if the trace is disabled; this also gives the function fewer arguments. Signed-off-by: yalin wang <yalin.wang2...@gmail.com> --- include/trace/events/vmscan.h | 14 +++--- mm/vmscan.c | 7 ++- 2 files changed, 9 insertions(+), 12 deletions(-) diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h index dae7836..f8d6b34 100644 --- a/include/trace/events/vmscan.h +++ b/include/trace/events/vmscan.h @@ -352,11 +352,11 @@ TRACE_EVENT(mm_vmscan_writepage, TRACE_EVENT(mm_vmscan_lru_shrink_inactive, - TP_PROTO(int nid, int zid, - unsigned long nr_scanned, unsigned long nr_reclaimed, - int priority, int reclaim_flags), + TP_PROTO(struct zone *zone, + unsigned long nr_scanned, unsigned long nr_reclaimed, + int priority, int file), - TP_ARGS(nid, zid, nr_scanned, nr_reclaimed, priority, reclaim_flags), + TP_ARGS(zone, nr_scanned, nr_reclaimed, priority, file), TP_STRUCT__entry( __field(int, nid) @@ -368,12 +368,12 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive, ), TP_fast_assign( - __entry->nid = nid; - __entry->zid = zid; + __entry->nid = zone->zone_pgdat->node_id; + __entry->zid = zone_idx(zone); __entry->nr_scanned = nr_scanned; __entry->nr_reclaimed = nr_reclaimed; __entry->priority = priority; - __entry->reclaim_flags = reclaim_flags; + __entry->reclaim_flags = trace_shrink_flags(file); ), TP_printk("nid=%d zid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d flags=%s", diff --git a/mm/vmscan.c b/mm/vmscan.c index 83cea53..bd2918e 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1691,11 +1691,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec, current_may_throttle()) wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); - trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id, - zone_idx(zone), - nr_scanned, nr_reclaimed, - sc->priority, - trace_shrink_flags(file)); + trace_mm_vmscan_lru_shrink_inactive(zone, nr_scanned, nr_reclaimed, + sc->priority, file); return nr_reclaimed; } -- 1.9.1
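The motivation of this patch (derived arguments should only be computed when the event is actually recorded) can be shown with a small userspace sketch. Everything here is an assumption made for illustration: `trace_on`, `compute_flags()`, `trace_event_eager()` and `trace_event_lazy()` are hypothetical names, and the real tracepoint machinery is more involved.

```c
#include <stdbool.h>

static bool trace_on;   /* hypothetical "tracepoint enabled" switch */
static int flag_calls;  /* counts how often the helper runs */
static int last_flags;  /* what the "event" recorded */

/* Hypothetical stand-in for a helper such as trace_shrink_flags(). */
static int compute_flags(int file)
{
	flag_calls++;
	return file ? 0x2 : 0x1;
}

/* Old style: the caller computes the flags even if tracing is off. */
static void trace_event_eager(int flags)
{
	if (trace_on)
		last_flags = flags;
}

/* New style: the raw input is passed; the helper runs only if enabled. */
static void trace_event_lazy(int file)
{
	if (trace_on)
		last_flags = compute_flags(file);
}
```

Calling `trace_event_eager(compute_flags(file))` with tracing off still pays for `compute_flags()`, while `trace_event_lazy(file)` does not; that difference is what moving the computation into TP_fast_assign is meant to buy.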
[PATCH] mm: change trace_mm_vmscan_writepage() proto type
Move trace_reclaim_flags() into the trace function, so that we don't need to calculate these flags if the trace is disabled. Signed-off-by: yalin wang <yalin.wang2...@gmail.com> --- include/trace/events/vmscan.h | 7 +++ mm/vmscan.c | 2 +- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h index f66476b..dae7836 100644 --- a/include/trace/events/vmscan.h +++ b/include/trace/events/vmscan.h @@ -330,10 +330,9 @@ DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_memcg_isolate, TRACE_EVENT(mm_vmscan_writepage, - TP_PROTO(struct page *page, - int reclaim_flags), + TP_PROTO(struct page *page), - TP_ARGS(page, reclaim_flags), + TP_ARGS(page), TP_STRUCT__entry( __field(unsigned long, pfn) @@ -342,7 +341,7 @@ TRACE_EVENT(mm_vmscan_writepage, TP_fast_assign( __entry->pfn = page_to_pfn(page); - __entry->reclaim_flags = reclaim_flags; + __entry->reclaim_flags = trace_reclaim_flags(page); ), TP_printk("page=%p pfn=%lu flags=%s", diff --git a/mm/vmscan.c b/mm/vmscan.c index a4507ec..83cea53 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -594,7 +594,7 @@ static pageout_t pageout(struct page *page, struct address_space *mapping, /* synchronous write or broken a_ops? */ ClearPageReclaim(page); } - trace_mm_vmscan_writepage(page, trace_reclaim_flags(page)); + trace_mm_vmscan_writepage(page); inc_zone_page_state(page, NR_VMSCAN_WRITE); return PAGE_SUCCESS; } -- 1.9.1
[PATCH] mmc: change to use kmalloc
Use kmalloc instead of kzalloc; zeroing the memory is not needed. Signed-off-by: yalin wang <yalin.wang2...@gmail.com> --- drivers/mmc/card/block.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c index 23b6c8e..975cd3e 100644 --- a/drivers/mmc/card/block.c +++ b/drivers/mmc/card/block.c @@ -365,7 +365,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user( if (!idata->buf_bytes) return idata; - idata->buf = kzalloc(idata->buf_bytes, GFP_KERNEL); + idata->buf = kmalloc(idata->buf_bytes, GFP_KERNEL); if (!idata->buf) { err = -ENOMEM; goto idata_err; -- 1.9.1
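The reasoning behind this change can be illustrated with the userspace analogue, malloc() versus calloc(). The hunk does not show the code that fills the buffer, so this sketch assumes that every byte is overwritten right after allocation; `dup_buffer()` is a hypothetical helper, not part of the driver.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: duplicate len bytes from src. Every byte of the
 * new buffer is overwritten by memcpy(), so zero-filling it first (as
 * calloc() would) does no useful work.
 */
static unsigned char *dup_buffer(const unsigned char *src, size_t len)
{
	unsigned char *buf = malloc(len);

	if (!buf)
		return NULL;
	memcpy(buf, src, len);
	return buf;
}
```

If only part of the buffer were written before being read or copied back out, the zeroing variant would be the safer choice, so the argument holds only under the full-overwrite assumption.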
Re: [V3 PATCH] arm64: remove redundant FRAME_POINTER kconfig option and force to select it
> On Nov 10, 2015, at 19:35, Catalin Marinas wrote: > > On Tue, Nov 10, 2015 at 07:09:00PM +0800, yalin wang wrote: >>> On Nov 10, 2015, at 18:37, Catalin Marinas wrote: >>> >>> On Mon, Nov 09, 2015 at 10:09:55AM -0800, Yang Shi wrote: >>>> FRAME_POINTER is defined in lib/Kconfig.debug, it is unnecessary to >>>> redefine >>>> it in arch/arm64/Kconfig.debug. Actually, the one defined in arm64 >>>> directory >>>> is never used. >>> >>> That's not true since the arm64 definition seems to take precedence. >>> >>>> This adds a dependency on DEBUG_KERNEL for building with frame pointers. >>> >>> It doesn't because arm64 selects ARCH_WANT_FRAME_POINTERS. >>> >>>> ARM64 depends on frame pointer to get correct stack backtrace and need >>>> FRAME_POINTER kconfig option enabled all the time. >>>> However, currect implementation makes it could be disabled, so force it >>>> to be selected by ARM64. >>>> >>>> Signed-off-by: Yang Shi >>> >>> Patch applied but I changed the commit log slightly. Thanks. >> i have a question, >> why FRAME_POINTER config must be enabled ? >> and i see ARM arch can disable this config . >> if i don’t need stack trace dump and the software release is for >> final product , don’t need debug stack trace log . >> is it possible to disable it for performance reason ? > > If you don't need any stack trace, perf etc., in theory you can disable > the option. However, the aarch64 gcc compiler always generates it (I'm > not sure whether the AAPCS mandates it). Anyway, the performance impact > is very small since there are more general purpose registers available > in AArch64 already. > I just ran a test with -fomit-frame-pointer, and it seems gcc can generate code without a frame pointer, as on the ARM arch. Version: aarch64-linux-gnu-gcc, gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09). Why doesn't AArch64 have frame unwind info, just like the ARM arch?
Thanks
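For readers following the frame pointer discussion: a backtrace based on frame pointers walks a linked chain of per-call records, which is why code built without them cannot be unwound this way. The sketch below models that walk in userspace C. The `struct frame_record` layout and `walk_frames()` are simplified, hypothetical illustrations, not the arm64 kernel's unwinder.

```c
#include <stddef.h>

/*
 * Simplified model of a frame record: a pointer to the caller's record
 * plus a return address. Hypothetical layout, for illustration only.
 */
struct frame_record {
	const struct frame_record *prev;
	unsigned long return_addr;
};

/*
 * Walk the chain starting at fp and store up to max return addresses
 * in out. Returns the number of entries written.
 */
static size_t walk_frames(const struct frame_record *fp,
			  unsigned long *out, size_t max)
{
	size_t n = 0;

	while (fp && n < max) {
		out[n++] = fp->return_addr;
		fp = fp->prev;
	}
	return n;
}
```

If functions do not maintain such records, the chain simply does not exist, and a stack trace needs some other source of unwind information.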
Re: [PATCH] arm64: add HAVE_LATENCYTOP_SUPPORT config
> On Nov 10, 2015, at 19:18, Will Deacon wrote: > > Ha, so it does! Patch below. The only non-trivial part was arch/arm/, > which has a dependency on !SMP which I believe is no longer required > as of d5996b2ff0e2 ("ARM: fix /proc/$PID/stack on SMP"). > > Will > > --->8 > > From 8dfb40e92ac322cbd68bf9f16cbb11fc5e210269 Mon Sep 17 00:00:00 2001 > From: Will Deacon > Date: Tue, 10 Nov 2015 11:10:04 + > Subject: [PATCH] Kconfig: remove HAVE_LATENCYTOP_SUPPORT > > As illustrated by a3afe70b83fd ("[S390] latencytop s390 support."), > HAVE_LATENCYTOP_SUPPORT is defined by an architecture to advertise an > implementation of save_stack_trace_tsk. > > However, as of 9212ddb5eada ("stacktrace: provide save_stack_trace_tsk() > weak alias") a dummy implementation is provided if STACKTRACE=y. > Given that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects > STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether. > > Signed-off-by: Will Deacon > --- > arch/arc/Kconfig| 3 --- > arch/arm/Kconfig| 5 - > arch/metag/Kconfig | 3 --- > arch/microblaze/Kconfig | 3 --- > arch/parisc/Kconfig | 3 --- > arch/powerpc/Kconfig| 3 --- > arch/s390/Kconfig | 3 --- > arch/sh/Kconfig | 3 --- > arch/sparc/Kconfig | 4 > arch/unicore32/Kconfig | 3 --- > arch/x86/Kconfig| 3 --- > lib/Kconfig.debug | 1 - > 12 files changed, 37 deletions(-) > > diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig > index 2c2ac3f3ff80..6dc312fd6480 100644 > --- a/arch/arc/Kconfig > +++ b/arch/arc/Kconfig > @@ -73,9 +73,6 @@ config STACKTRACE_SUPPORT > def_bool y > select STACKTRACE > > -config HAVE_LATENCYTOP_SUPPORT > - def_bool y > - > config HAVE_ARCH_TRANSPARENT_HUGEPAGE > def_bool y > depends on ARC_MMU_V4 > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index 35854e8d97ff..94eff0c6b0f8 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -162,11 +162,6 @@ config STACKTRACE_SUPPORT > bool > default y > > -config HAVE_LATENCYTOP_SUPPORT > - bool > - depends on !SMP > - default y > - > config 
LOCKDEP_SUPPORT > bool > default y > diff --git a/arch/metag/Kconfig b/arch/metag/Kconfig > index 0b389a81c43a..a0fa88da3e31 100644 > --- a/arch/metag/Kconfig > +++ b/arch/metag/Kconfig > @@ -36,9 +36,6 @@ config STACKTRACE_SUPPORT > config LOCKDEP_SUPPORT > def_bool y > > -config HAVE_LATENCYTOP_SUPPORT > - def_bool y > - > config RWSEM_GENERIC_SPINLOCK > def_bool y > > diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig > index 0bce820428fc..5ecd0287a874 100644 > --- a/arch/microblaze/Kconfig > +++ b/arch/microblaze/Kconfig > @@ -67,9 +67,6 @@ config STACKTRACE_SUPPORT > config LOCKDEP_SUPPORT > def_bool y > > -config HAVE_LATENCYTOP_SUPPORT > - def_bool y > - > source "init/Kconfig" > > source "kernel/Kconfig.freezer" > diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig > index c36546959e86..16276d505cd6 100644 > --- a/arch/parisc/Kconfig > +++ b/arch/parisc/Kconfig > @@ -79,9 +79,6 @@ config TIME_LOW_RES > depends on SMP > default y > > -config HAVE_LATENCYTOP_SUPPORT > -def_bool y > - > # unless you want to implement ACPI on PA-RISC ... 
;-) > config PM > bool > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index db49e0d796b1..89210bfdfc7a 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -47,9 +47,6 @@ config STACKTRACE_SUPPORT > bool > default y > > -config HAVE_LATENCYTOP_SUPPORT > - def_bool y > - > config TRACE_IRQFLAGS_SUPPORT > bool > default y > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig > index 3a55f493c7da..69e22b502d09 100644 > --- a/arch/s390/Kconfig > +++ b/arch/s390/Kconfig > @@ -10,9 +10,6 @@ config LOCKDEP_SUPPORT > config STACKTRACE_SUPPORT > def_bool y > > -config HAVE_LATENCYTOP_SUPPORT > - def_bool y > - > config RWSEM_GENERIC_SPINLOCK > bool > > diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig > index d514df7e04dd..6c391a5d3e5c 100644 > --- a/arch/sh/Kconfig > +++ b/arch/sh/Kconfig > @@ -130,9 +130,6 @@ config STACKTRACE_SUPPORT > config LOCKDEP_SUPPORT > def_bool y > > -config HAVE_LATENCYTOP_SUPPORT > - def_bool y > - > config ARCH_HAS_ILOG2_U32 > def_bool n > > diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig > index 56442d2d7bbc..3203e42190dd 100644 > --- a/arch/sparc/Kconfig > +++ b/arch/sparc/Kconfig > @@ -101,10 +101,6 @@ config LOCKDEP_SUPPORT > bool > default y if SPARC64 > > -config HAVE_LATENCYTOP_SUPPORT > - bool > - default y if SPARC64 > - > config ARCH_HIBERNATION_POSSIBLE > def_bool y if SPARC64 > > diff --git a/arch/unicore32/Kconfig b/arch/unicore32/Kconfig > index c9faddc61100..910ed969 100644 > --- a/arch/unicore32/Kconfig > +++ b/arch/unicore32/Kconfig > @@ -33,9 +33,6 @@ config NO_IOPORT_MAP > config STACKTRACE_SUPPORT > def_bool y >
Re: [V3 PATCH] arm64: remove redundant FRAME_POINTER kconfig option and force to select it
> On Nov 10, 2015, at 18:37, Catalin Marinas wrote: > > On Mon, Nov 09, 2015 at 10:09:55AM -0800, Yang Shi wrote: >> FRAME_POINTER is defined in lib/Kconfig.debug, it is unnecessary to redefine >> it in arch/arm64/Kconfig.debug. Actually, the one defined in arm64 directory >> is never used. > > That's not true since the arm64 definition seems to take precedence. > >> This adds a dependency on DEBUG_KERNEL for building with frame pointers. > > It doesn't because arm64 selects ARCH_WANT_FRAME_POINTERS. > >> ARM64 depends on frame pointer to get correct stack backtrace and need >> FRAME_POINTER kconfig option enabled all the time. >> However, currect implementation makes it could be disabled, so force it >> to be selected by ARM64. >> >> Signed-off-by: Yang Shi > > Patch applied but I changed the commit log slightly. Thanks. I have a question: why must the FRAME_POINTER config be enabled? I see that the ARM arch can disable this config. If I don’t need stack trace dumps and the software release is for a final product, with no need for debug stack trace logs, is it possible to disable it for performance reasons? Thanks
Re: [PATCH] arm64: add HAVE_LATENCYTOP_SUPPORT config
I just enabled it on ARM64, and it works. I don’t see any special requirement for enabling this config. > On Nov 7, 2015, at 00:05, Will Deacon wrote: > > On Fri, Nov 06, 2015 at 11:57:58PM +0800, yalin wang wrote: >> Add HAVE_LATENCYTOP_SUPPORT in Kconfig, so that >> we can enable this feature on ARM64 > > Do you know what the prerequisites for HAVE_LATENCYTOP_SUPPORT actually > are (beyond those explicitly listed as dependencies for CONFIG_LATENCYTOP)? > > Will
[PATCH] arm64: add HAVE_LATENCYTOP_SUPPORT config
Add HAVE_LATENCYTOP_SUPPORT in Kconfig, so that we can enable this feature on ARM64 Signed-off-by: yalin wang --- arch/arm64/Kconfig | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 851fe11..782b5bd 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -103,6 +103,10 @@ config ARCH_PHYS_ADDR_T_64BIT config MMU def_bool y +config HAVE_LATENCYTOP_SUPPORT + bool + default y + config NO_IOPORT_MAP def_bool y if !PCI -- 1.9.1
[PATCH] goldfish: fix goldfish_pipe driver BUG
goldfish_pipe_read_write() should pass the buffer's physical address to qemu, so that the host can copy the data correctly. Currently, the driver writes a virtual address into the address register, so the host cannot get the correct data, and the adbd daemon cannot work in the guest. Also, I comment out the access_with_param() function; it seems to be unused, and we don't need to use it in goldfish_pipe_read_write(). Signed-off-by: yalin wang <yalin.wang2...@gmail.com> --- drivers/platform/goldfish/goldfish_pipe.c | 56 +++ 1 file changed, 35 insertions(+), 21 deletions(-) diff --git a/drivers/platform/goldfish/goldfish_pipe.c b/drivers/platform/goldfish/goldfish_pipe.c index 55b6d7c..bdf6f11 100644 --- a/drivers/platform/goldfish/goldfish_pipe.c +++ b/drivers/platform/goldfish/goldfish_pipe.c @@ -112,16 +112,27 @@ #define PIPE_WAKE_READ (1 << 1) /* pipe can now be read from */ #define PIPE_WAKE_WRITE(1 << 2) /* pipe can now be written to */ +#ifdef CONFIG_64BIT struct access_params { - unsigned long channel; - u32 size; - unsigned long address; - u32 cmd; - u32 result; + uint64_t channel; /* 0x00 */ + uint32_t size; /* 0x08 */ + uint64_t address; /* 0x0c */ + uint32_t cmd; /* 0x14 */ + uint32_t result;/* 0x18 */ /* reserved for future extension */ - u32 flags; + uint32_t flags; /* 0x1c */ }; - +#else +struct access_params { + uint32_t channel; /* 0x00 */ + uint32_t size; /* 0x04 */ + uint32_t address; /* 0x08 */ + uint32_t cmd; /* 0x0c */ + uint32_t result;/* 0x10 */ + /* reserved for future extension */ + uint32_t flags; /* 0x14 */ +}; +#endif /* The global driver data. Holds a reference to the i/o page used to * communicate with the emulator, and a wake queue for blocked tasks * waiting to be awoken. 
@@ -237,6 +248,7 @@ static int setup_access_params_addr(struct platform_device *pdev, return -1; } +#if 0 /* A value that will not be set by qemu emulator */ #define INITIAL_BATCH_RESULT (0xdeadbeaf) static int access_with_param(struct goldfish_pipe_dev *dev, const int cmd, @@ -263,6 +275,7 @@ static int access_with_param(struct goldfish_pipe_dev *dev, const int cmd, *status = aps->result; return 0; } +#endif /* This function is used for both reading from and writing to a given * pipe. @@ -304,6 +317,8 @@ static ssize_t goldfish_pipe_read_write(struct file *filp, char __user *buffer, : address_end; unsigned long avail= next - address; int status, wakeBit; + struct page *page; + phys_addr_t phys_addr; /* Ensure that the corresponding page is properly mapped */ /* FIXME: this isn't safe or sufficient - use get_user_pages */ @@ -323,23 +338,22 @@ static ssize_t goldfish_pipe_read_write(struct file *filp, char __user *buffer, break; } } - + if (get_user_pages_unlocked(current, current->active_mm, + address, 1, !is_write, 0, ) != 1) + return -EINVAL; + phys_addr = page_to_phys(page) + offset_in_page(address); /* Now, try to transfer the bytes in the current page */ spin_lock_irqsave(>lock, irq_flags); - if (access_with_param(dev, CMD_WRITE_BUFFER + cmd_offset, - address, avail, pipe, )) { - gf_write_ptr(pipe, dev->base + PIPE_REG_CHANNEL, -dev->base + PIPE_REG_CHANNEL_HIGH); - writel(avail, dev->base + PIPE_REG_SIZE); - gf_write_ptr((void *)address, -dev->base + PIPE_REG_ADDRESS, -dev->base + PIPE_REG_ADDRESS_HIGH); - writel(CMD_WRITE_BUFFER + cmd_offset, - dev->base + PIPE_REG_COMMAND); - status = readl(dev->base + PIPE_REG_STATUS); - } + gf_write_ptr(pipe, dev->base + PIPE_REG_CHANNEL, +dev->base + PIPE_REG_CHANNEL_HIGH); + writel(avail, dev->base + PIPE_REG_SIZE); + gf_write_ptr((void *)phys_addr, dev->base + PIPE_REG_ADDRESS, +dev->base + PIPE_REG_ADDRESS_HIGH); + writel(CMD_WRITE_BUFFER + cmd_offset, + dev->base + PIPE_REG_COMMAND); + status = readl(dev->base + 
PIPE_REG_STATUS); spin_unlock_irqrestore(>lock, irq_flags); - + put_page(page); if (status &
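The offset comments in the 64-bit `struct access_params` of this patch (0x0c for `address`, 0x14 for `cmd`, and so on) can be checked with `offsetof`. The sketch below is a standalone userspace check under stated assumptions: the two struct names are hypothetical, and `__attribute__((packed))` is a GCC/Clang extension that the patch itself does not show. Whether the emulator side relies on these exact offsets is not stated in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same members as the 64-bit struct in the patch, natural alignment. */
struct access_params_natural {
	uint64_t channel;
	uint32_t size;
	uint64_t address;
	uint32_t cmd;
	uint32_t result;
	uint32_t flags;
};

/* Same members, packed (GCC/Clang attribute), so no padding is inserted. */
struct access_params_packed {
	uint64_t channel;
	uint32_t size;
	uint64_t address;
	uint32_t cmd;
	uint32_t result;
	uint32_t flags;
} __attribute__((packed));

/* The packed layout matches the offsets written in the patch comments. */
_Static_assert(offsetof(struct access_params_packed, size) == 0x08, "size");
_Static_assert(offsetof(struct access_params_packed, address) == 0x0c, "address");
_Static_assert(offsetof(struct access_params_packed, cmd) == 0x14, "cmd");
_Static_assert(offsetof(struct access_params_packed, result) == 0x18, "result");
_Static_assert(offsetof(struct access_params_packed, flags) == 0x1c, "flags");
```

On ABIs where `uint64_t` is 8-byte aligned, the unpacked variant places `address` after padding rather than at 0x0c, so the commented offsets only hold for a packed layout or for an ABI with 4-byte alignment of 64-bit members.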