Re: [PATCH] android: lmk: add swap pte pmd in tasksize

2016-03-10 Thread yalin wang

> On Mar 11, 2016, at 15:23, Lu Bing  wrote:
> 
> From: l00215322 
> 
> Many android devices have zram,so we should add "MM_SWAPENTS" in tasksize.
> Refer oom_kill.c,we add pte also.
> 
> Reviewed-by: Chen Feng 
> Reviewed-by: Fu Jun 
> Reviewed-by: Xu YiPing 
> Reviewed-by: Yu DongBin 
> Signed-off-by: Lu Bing 
> ---
> drivers/staging/android/lowmemorykiller.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
> index 8b5a4a8..0817d3b 100644
> --- a/drivers/staging/android/lowmemorykiller.c
> +++ b/drivers/staging/android/lowmemorykiller.c
> @@ -139,7 +139,9 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
>   task_unlock(p);
>   continue;
>   }
> - tasksize = get_mm_rss(p->mm);
> + tasksize = get_mm_rss(p->mm) +
> + get_mm_counter(p->mm, MM_SWAPENTS) +
> + atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);
Why not introduce an mm_nr_ptes() helper function here?
It would be clearer to read.

>   task_unlock(p);
>   if (tasksize <= 0)
>   continue;
> -- 
> 1.8.3.2
> 
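
The helper suggested in the reply might look roughly like the sketch below. It is a userspace mock, not kernel code: struct mm_struct_mock, mm_nr_ptes() and lmk_tasksize() are hypothetical names, and the counters are plain C11 atomics. Only mm_nr_pmds() and MM_SWAPENTS are names taken from the patch itself.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the few mm_struct fields the patch reads. */
struct mm_struct_mock {
	long rss;             /* what get_mm_rss() would return, in pages */
	long swapents;        /* what get_mm_counter(mm, MM_SWAPENTS) would return */
	atomic_long nr_ptes;  /* page-table pages at the PTE level */
	atomic_long nr_pmds;  /* page-table pages at the PMD level */
};

/* Proposed helper (assumed name): hides the atomic read, mirroring mm_nr_pmds(). */
static inline long mm_nr_ptes(struct mm_struct_mock *mm)
{
	return atomic_load(&mm->nr_ptes);
}

static inline long mm_nr_pmds(struct mm_struct_mock *mm)
{
	return atomic_load(&mm->nr_pmds);
}

/* tasksize as computed by the patch, written with both helpers. */
static long lmk_tasksize(struct mm_struct_mock *mm)
{
	return mm->rss + mm->swapents + mm_nr_ptes(mm) + mm_nr_pmds(mm);
}
```

With both counters behind helpers, the tasksize expression reads as a plain sum and the caller no longer needs to know that nr_ptes is an atomic.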



[RFC] arm: change to use generic sign_extend32() function

2015-12-27 Thread yalin wang
Use the generic sign_extend32() helper to calculate branch_displacement.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 arch/arm/probes/decode-arm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm/probes/decode-arm.c b/arch/arm/probes/decode-arm.c
index f72c33a..ff794c0 100644
--- a/arch/arm/probes/decode-arm.c
+++ b/arch/arm/probes/decode-arm.c
@@ -20,13 +20,12 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "decode.h"
 #include "decode-arm.h"
 
-#define sign_extend(x, signbit) ((x) | (0 - ((x) & (1 << (signbit)))))
-
-#define branch_displacement(insn) sign_extend(((insn) & 0xff) << 2, 25)
+#define branch_displacement(insn) sign_extend32(((insn) & 0xff) << 2, 25)
 
 /*
  * To avoid the complications of mimicing single-stepping on a
-- 
1.9.1
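
For reference, a userspace sketch of what the replacement computes. sign_extend32() below follows the semantics of the kernel helper (the second argument is the position of the sign bit). The mask appears as 0xff in the archived patch; this sketch instead assumes the 24-bit immediate of the ARM B/BL encoding, which is what a sign bit at position 25 after the shift implies.

```c
#include <stdint.h>

/*
 * Same semantics as the kernel's sign_extend32(): 'index' is the position
 * of the sign bit. Relies on arithmetic right shift of signed values, as
 * provided by gcc and clang.
 */
static inline int32_t sign_extend32(uint32_t value, int index)
{
	uint8_t shift = 31 - index;

	return (int32_t)(value << shift) >> shift;
}

/* Assumption: 24-bit branch immediate, scaled by 4, so the sign bit is bit 25. */
static inline int32_t branch_displacement(uint32_t insn)
{
	return sign_extend32((insn & 0xffffff) << 2, 25);
}
```

A backward branch such as 0xEAFFFFFE has all-ones in the upper immediate bits, so the result is a small negative byte offset; a forward branch with immediate 2 gives 8.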

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-21 Thread yalin wang

> On Dec 15, 2015, at 19:53, Kirill A. Shutemov 
>  wrote:
> 
> On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
>>> On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
>>> Anyway, I don't think it's possible to gain anything measurable from this
>>> optimization.
>>> 
>> The advantage is that if addr does not belong to any vma, we do not need to
>> loop over all vmas; we can break earlier once we find the closest vma with
>> vm_end > addr.
> 
> Do you have any workload which can demonstrate the advantage?
> 
> — 
I added logging in find_vma() to see the call stacks.
The early break is very effective on the mmap() / munmap() / do_execve() /
get_unmapped_area() /
mem_cgroup_move_task()->walk_page_range()->find_vma() paths.

Most of the time the loop breaks after searching about 7 vmas.
I did not consider the cache pollution problem in this patch.
Yes, this patch checks vm_prev->vm_end on every iteration,
but only when tmp->vm_end > addr.
Without this check, the loop continues on to the next rb node,
which also pollutes the cache.

So the question is which one is better?
I do not have a good method to test this.
Any good ideas on how to test it?

Thanks
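
One rough way to compare the two variants outside the kernel is to count loop iterations in a userspace model. The sketch below is only that: struct vma, build_tree(), find_vma_model() and compare_variants() are hypothetical names, a plain balanced binary tree stands in for the rbtree, and iteration counts say nothing about cache misses, which is the open question in this thread.

```c
#include <stddef.h>

struct vma {
	unsigned long vm_start, vm_end;  /* covers [vm_start, vm_end) */
	struct vma *vm_prev;             /* previous vma in address order */
	struct vma *left, *right;        /* plain BST links standing in for the rbtree */
};

/* Build a balanced tree over the sorted array v[lo..hi) and return its root. */
static struct vma *build_tree(struct vma *v, int lo, int hi)
{
	if (lo >= hi)
		return NULL;
	int mid = lo + (hi - lo) / 2;
	v[mid].vm_prev = mid > 0 ? &v[mid - 1] : NULL;
	v[mid].left = build_tree(v, lo, mid);
	v[mid].right = build_tree(v, mid + 1, hi);
	return &v[mid];
}

/* Descent as in find_vma(); 'early' enables the extra check from the patch. */
static struct vma *find_vma_model(struct vma *root, unsigned long addr,
				  int early, unsigned long *steps)
{
	struct vma *vma = NULL;

	while (root) {
		struct vma *tmp = root;

		(*steps)++;
		if (tmp->vm_end > addr) {
			vma = tmp;
			if (tmp->vm_start <= addr)
				break;
			if (early && (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr))
				break;
			root = tmp->left;
		} else
			root = tmp->right;
	}
	return vma;
}

/*
 * Lay out n one-page vmas with one-page gaps, then look up every half page
 * with both variants. Returns 0 if they always agree; steps[0] holds the
 * iteration total without the check, steps[1] the total with it.
 */
static int compare_variants(struct vma *v, int n, unsigned long steps[2])
{
	for (int i = 0; i < n; i++) {
		v[i].vm_start = 0x1000UL * (2 * i + 1);
		v[i].vm_end = v[i].vm_start + 0x1000UL;
	}
	struct vma *root = build_tree(v, 0, n);

	steps[0] = steps[1] = 0;
	for (unsigned long addr = 0; addr < 0x1000UL * (2 * n + 2); addr += 0x800) {
		struct vma *a = find_vma_model(root, addr, 0, &steps[0]);
		struct vma *b = find_vma_model(root, addr, 1, &steps[1]);

		if (a != b)
			return -1;
	}
	return 0;
}
```

Because the patched loop is the original loop plus one extra exit, its iteration count can never exceed the original's. Whether the extra vm_prev dereference costs more than the saved iterations would still need a real measurement, for example with perf on an mmap-heavy workload.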







Re: [RFC] mm: change find_vma() function

2015-12-14 Thread yalin wang

> On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
> 
> On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
>> On 12/14, Kirill A. Shutemov wrote:
>>> 
>>> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
>>>> change find_vma() to break ealier when found the adderss
>>>> is not in any vma, don't need loop to search all vma.
>>>> 
>>>> Signed-off-by: yalin wang 
>>>> ---
>>>> mm/mmap.c | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>> 
>>>> diff --git a/mm/mmap.c b/mm/mmap.c
>>>> index b513f20..8294c9b 100644
>>>> --- a/mm/mmap.c
>>>> +++ b/mm/mmap.c
>>>> @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct 
>>>> *mm, unsigned long addr)
>>>>vma = tmp;
>>>>if (tmp->vm_start <= addr)
>>>>break;
>>>> +  if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
>>>> +  break;
>>>> +
>>> 
>>> This 'break' would return 'tmp' as found vma.
>> 
>> But this would be right?
> 
> Hm. Right. Sorry for my tone.
> 
> I think the right condition is 'tmp->vm_prev->vm_end < addr', not '<=' as
> vm_end is the first byte after the vma. But it's equivalent in practice
> here.
> 
This should be <= here,
because a vma's effective address range does not include the vm_end address.
So if vm_end <= addr, then addr does not belong to that vma.

> Anyway, I don't think it's possible to gain anything measurable from this
> optimization.
> 
The advantage is that if addr does not belong to any vma, we do not need to
loop over all vmas; we can break earlier once we find the closest vma with
vm_end > addr.
>> 
>> Not that I think this optimization makes sense, I simply do not know,
>> but to me this change looks technically correct at first glance...
>> 
>> But the changelog is wrong or I missed something. This change can stop
>> the main loop earlier; if "tmp" is the first vma,
> 
> For the first vma, we don't get anything comparing to what we have now:
> check for !rb_node on the next iteration would have the same trade off and
> effect as the proposed check.
Yes
> 
>> or if the previous one is below the address.
> 
> Yes, but would it compensate additional check on each 'tmp->vm_end > addr'
> iteration to the point? That's not obvious.
> 
>> Or perhaps I just misread that "not in any vma" note in the changelog.
>> 
>> No?
>> 
>> Oleg.
>> 

I have tested it, and it works fine. :)
Thanks
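
The '<=' versus '<' point comes down to a vma covering a half-open range. A small userspace sketch, where struct vma_range, vma_contains() and prev_is_below() are hypothetical names used only for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

struct vma_range {
	unsigned long vm_start;  /* first byte inside the vma */
	unsigned long vm_end;    /* first byte after the vma */
};

/* A vma covers the half-open range [vm_start, vm_end). */
static bool vma_contains(const struct vma_range *vma, unsigned long addr)
{
	return vma->vm_start <= addr && addr < vma->vm_end;
}

/* The patch's test on the previous vma: true when addr lies at or past its end. */
static bool prev_is_below(const struct vma_range *prev, unsigned long addr)
{
	return !prev || prev->vm_end <= addr;
}
```

For addr == prev->vm_end, the address is already outside the previous vma, so '<=' lets the loop stop there. With '<' the lookup would still return the same vma, only after descending further.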




[RFC] mm: change find_vma() function

2015-12-14 Thread yalin wang
Change find_vma() to break earlier when the address is found
not to be in any vma; there is no need to loop over all vmas.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 mm/mmap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index b513f20..8294c9b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 			vma = tmp;
 			if (tmp->vm_start <= addr)
 				break;
+			if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
+				break;
+
 			rb_node = rb_node->rb_left;
 		} else
 			rb_node = rb_node->rb_right;
-- 
1.9.1


Re: [PATCH v2] clear file privilege bits when mmap writing

2015-12-03 Thread yalin wang

> On Dec 2, 2015, at 16:03, Kees Cook  wrote:
> 
> Normally, when a user can modify a file that has setuid or setgid bits,
> those bits are cleared when they are not the file owner or a member
> of the group. This is enforced when using write and truncate but not
> when writing to a shared mmap on the file. This could allow the file
> writer to gain privileges by changing a binary without losing the
> setuid/setgid/caps bits.
> 
> Changing the bits requires holding inode->i_mutex, so it cannot be done
> during the page fault (due to mmap_sem being held during the fault).
> Instead, clear the bits if PROT_WRITE is being used at mmap time.
> 
> Signed-off-by: Kees Cook 
> Cc: sta...@vger.kernel.org
> ---

Does this mean the mprotect() system call also needs this check?
mprotect() can change the protection to PROT_WRITE, so a mapping created
read-only can be written to again, which is also a security hole here.

Thanks
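
The sequence in question can be shown from userspace. The sketch below only demonstrates that a shared mapping created without PROT_WRITE can be made writable later; it does not involve setuid bits and says nothing about how the kernel fix should look. write_via_mprotect() is a hypothetical name, and the 4096-byte length is an assumption about the mapping size, not the system page size.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map a file read-only with MAP_SHARED, then upgrade the mapping with
 * mprotect() and write through it. Returns 0 if the whole sequence works.
 */
static int write_via_mprotect(void)
{
	char path[] = "/tmp/mprotect-demo-XXXXXX";
	int fd = mkstemp(path);  /* opened read-write, so the upgrade is allowed */
	int ret = -1;

	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, 4096) == 0) {
		char *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);

		if (p != MAP_FAILED) {
			/* No PROT_WRITE was requested at mmap() time. */
			if (mprotect(p, 4096, PROT_READ | PROT_WRITE) == 0) {
				p[0] = 'x';  /* write to the file through the mapping */
				ret = (p[0] == 'x') ? 0 : -1;
			}
			munmap(p, 4096);
		}
	}
	close(fd);
	return ret;
}
```

A check that only runs when mmap() is called with PROT_WRITE would not see this write, because the write permission is added afterwards.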


Re: [PATCH 1/2] mm, printk: introduce new format string for flags

2015-12-03 Thread yalin wang

>> Technically, I think the answer is yes, at least in C99 (and I suppose
>> gcc would accept it in gnu89 mode as well).
>> 
>> printk("%pg\n", &(struct flag_printer){.flags = my_flags, .names = 
>> vmaflags_names});
>> 
>> Not tested, and I still don't think it would be particularly readable
>> even when macroized
>> 
>> printk("%pg\n", PRINTF_VMAFLAGS(my_flags));
> i test on gcc 4.9.3, it can work for this method,
> so the final solution like this:
> printk.h:
> struct flag_fmt_spec {
>   unsigned long flag;
>   struct trace_print_flags *flags;
>   int array_size;
>   char delimiter;
> };
> 
> #define FLAG_FORMAT(val, names, delim) (&(struct flag_fmt_spec){ \
>   .flag = (val), .flags = (names), \
>   .array_size = ARRAY_SIZE(names), .delimiter = (delim) })
> #define VMA_FLAG_FORMAT(flag)  FLAG_FORMAT(flag, vmaflags_names, '|')
A small change:
#define VMA_FLAG_FORMAT(vma)  FLAG_FORMAT((vma)->vm_flags, vmaflags_names, '|')


> source code:
> printk("%pg\n", VMA_FLAG_FORMAT(my_flags)); 
A small change:
printk("%pg\n", VMA_FLAG_FORMAT(vma));

> 
> that’s all, see cpumask_pr_args(masks) macro,
> it also use macro and  %*pb  to print cpu mask .
> i think this method is not very complex to use .
> 
> search source code ,
> there is lots of printk to print flag into hex number :
> $ grep -n -r 'printk.*flag.*%x' .
> it will be great if this flag string print is generic.
> 
> Thanks


Re: [PATCH 1/2] mm, printk: introduce new format string for flags

2015-12-03 Thread yalin wang

> On Dec 3, 2015, at 00:03, Rasmus Villemoes  wrote:
> 
> On Thu, Dec 03 2015, yalin wang  wrote:
> 
>>> On Dec 2, 2015, at 13:04, Vlastimil Babka  wrote:
>>> 
>>> On 12/02/2015 06:40 PM, yalin wang wrote:
>>> 
>>> (please trim your reply next time, no need to quote whole patch here)
>>> 
>>>> i am thinking why not make %pg* to be more generic ?
>>>> not restricted to only GFP / vma flags / page flags .
>>>> so could we change format like this ?
>>>> define a flag spec struct to include flag and trace_print_flags and some other option :
>>>> typedef struct {
>>>> unsigned long flag;
>>>> struct trace_print_flags *flags;
>>>> unsigned long option; } flag_sec;
>>>> flag_sec my_flag;
>>>> in printk we only pass like this :
>>>> printk("%pg\n", &my_flag) ;
>>>> then it can print any flags defined by user .
>>>> more useful for other drivers to use .
>>> 
>>> I don't know, it sounds quite complicated
> 
> Agreed, I think this would be premature generalization. There's also
> some value in having the individual %pgX specifiers, as that allows
> individual tweaks such as the mask_out for page flags.
> 
> given that we had no flags printing
>> 
If we use this generic method, the X in %pgX could still be used to specify
a flag that masks something out. That would be great.

> 
> Compared to printk("%pgv\n", &vma->flag), I know which I'd prefer to read.
> 
>> I am not sure whether the DECLARE_FLAG_PRINTK_FMT and FLAG_PRINTK_FMT macros
>> can be combined into one macro.
>> maybe need some trick here .
>> 
>> is it possible ?
> 
> Technically, I think the answer is yes, at least in C99 (and I suppose
> gcc would accept it in gnu89 mode as well).
> 
> printk("%pg\n", &(struct flag_printer){.flags = my_flags, .names = 
> vmaflags_names});
> 
> Not tested, and I still don't think it would be particularly readable
> even when macroized
> 
> printk("%pg\n", PRINTF_VMAFLAGS(my_flags));
I tested this with gcc 4.9.3 and this method works,
so the final solution looks like this:
printk.h:
struct flag_fmt_spec {
	unsigned long flag;
	struct trace_print_flags *flags;
	int array_size;
	char delimiter;
};

#define FLAG_FORMAT(val, names, delim) (&(struct flag_fmt_spec){ \
	.flag = (val), .flags = (names), \
	.array_size = ARRAY_SIZE(names), .delimiter = (delim) })
#define VMA_FLAG_FORMAT(flag)  FLAG_FORMAT(flag, vmaflags_names, '|')

source code:
printk("%pg\n", VMA_FLAG_FORMAT(my_flags));

That's all. See the cpumask_pr_args(masks) macro:
it also uses a macro together with %*pb to print a cpu mask.
I think this method is not very complex to use.

Searching the source code,
there are lots of printk calls that print flags as hex numbers:
$ grep -n -r 'printk.*flag.*%x' .
It would be great if this flag string printing were generic.

Thanks
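
The compound-literal idea can be tried in userspace. The sketch below is an assumption-laden model, not the proposed kernel change: format_flags() is a hypothetical stand-in for what a generic %pg handler might do, struct trace_print_flags is a simplified copy with only a mask and a name, and the three-entry vmaflags_names table is invented for the example. The macro parameters are deliberately named differently from the struct fields, since a parameter called flag would also replace the .flag designator during expansion.

```c
#include <stdio.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct trace_print_flags {  /* simplified stand-in for the kernel struct */
	unsigned long mask;
	const char *name;
};

struct flag_fmt_spec {
	unsigned long flag;
	const struct trace_print_flags *flags;
	int array_size;
	char delimiter;
};

/* Parameter names differ from the field names so the designators survive expansion. */
#define FLAG_FORMAT(val, names, delim) (&(struct flag_fmt_spec){ \
	.flag = (val), .flags = (names), \
	.array_size = ARRAY_SIZE(names), .delimiter = (delim) })

/* Invented table for the example; the real vmaflags_names is much longer. */
static const struct trace_print_flags vmaflags_names[] = {
	{ 0x1, "read" }, { 0x2, "write" }, { 0x4, "exec" },
};

#define VMA_FLAG_FORMAT(val) FLAG_FORMAT(val, vmaflags_names, '|')

/* Hypothetical stand-in for a generic %pg handler: joins the names of set bits. */
static char *format_flags(char *buf, size_t size, const struct flag_fmt_spec *spec)
{
	char delim[2] = { spec->delimiter, '\0' };
	size_t len = 0;

	if (size == 0)
		return buf;
	buf[0] = '\0';
	for (int i = 0; i < spec->array_size; i++) {
		if (!(spec->flag & spec->flags[i].mask))
			continue;
		int n = snprintf(buf + len, size - len, "%s%s",
				 len ? delim : "", spec->flags[i].name);
		if (n < 0 || (size_t)n >= size - len)
			break;  /* output truncated, stop here */
		len += n;
	}
	return buf;
}
```

A caller passes VMA_FLAG_FORMAT(flags) directly as the argument, so no named struct instance has to be declared at each call site, which was the objection raised against the earlier two-macro version.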










Re: [PATCH 1/2] mm, printk: introduce new format string for flags

2015-12-02 Thread yalin wang

> On Dec 2, 2015, at 13:04, Vlastimil Babka  wrote:
> 
> On 12/02/2015 06:40 PM, yalin wang wrote:
> 
> (please trim your reply next time, no need to quote whole patch here)
> 
>> i am thinking why not make %pg* to be more generic ?
>> not restricted to only GFP / vma flags / page flags .
>> so could we change format like this ?
>> define a flag spec struct to include flag and trace_print_flags and some other option :
>> typedef struct {
>> unsigned long flag;
>> struct trace_print_flags *flags;
>> unsigned long option; } flag_sec;
>> flag_sec my_flag;
>> in printk we only pass like this :
>> printk("%pg\n", &my_flag) ;
>> then it can print any flags defined by user .
>> more useful for other drivers to use .
> 
> I don't know, it sounds quite complicated given that we had no flags printing
> for years and now there's just three kinds of them. The extra struct flag_sec is
> IMHO nuisance. No other printk format needs such thing AFAIK? For example, if I
> were to print page flags from several places, each would have to define the
> struct flag_sec instance, or some header would have to provide it?
This can be avoided by providing a macro in a header file.
We can add a new struct to declare the trace_print_flags,
for example:
#define DECLARE_FLAG_PRINTK_FMT(name, flags_array)   flag_sec name = { .flags = flags_array };
#define FLAG_PRINTK_FMT(name, val) ({ name.flag = (val); &name; })

In source code:
DECLARE_FLAG_PRINTK_FMT(my_flag, vmaflags_names);
printk("%pg\n", FLAG_PRINTK_FMT(my_flag, vma->vm_flags));

I am not sure whether the DECLARE_FLAG_PRINTK_FMT and FLAG_PRINTK_FMT macros
can be combined into one macro.
Maybe some trick is needed here.

Is it possible?


Thanks




Re: [PATCH 1/2] mm, printk: introduce new format string for flags

2015-12-02 Thread yalin wang

> On Nov 30, 2015, at 08:10, Vlastimil Babka  wrote:
> 
> In mm we use several kinds of flags bitfields that are sometimes printed for
> debugging purposes, or exported to userspace via sysfs. To make them easier to
> interpret independently on kernel version and config, we want to dump also the
> symbolic flag names. So far this has been done with repeated calls to
> pr_cont(), which is unreliable on SMP, and not usable for e.g. sysfs export.
> 
> To get a more reliable and universal solution, this patch extends printk()
> format string for pointers to handle the page flags (%pgp), gfp_flags (%pgg)
> and vma flags (%pgv). Existing users of dump_flag_names() are converted and
> simplified.
> 
> It would be possible to pass flags by value instead of pointer, but the %p
> format string for pointers already has extensions for various kernel
> structures, so it's a good fit, and the extra indirection in a non-critical
> path is negligible.
> 
> Signed-off-by: Vlastimil Babka 
> Cc: Rasmus Villemoes 
> ---
> I'm sending it on top of the page_owner series, as it's already in mmotm.
> But to reduce churn (in case this approach is accepted), I can later
> incorporate it and resend it whole.
> 
> Documentation/printk-formats.txt |  14
> include/linux/mmdebug.h          |   5 +-
> lib/vsprintf.c                   |  31
> mm/debug.c                       | 150 ++-
> mm/oom_kill.c                    |   5 +-
> mm/page_alloc.c                  |   5 +-
> mm/page_owner.c                  |   5 +-
> 7 files changed, 140 insertions(+), 75 deletions(-)
> 
> diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt
> index b784c270105f..4b5156e74b09 100644
> --- a/Documentation/printk-formats.txt
> +++ b/Documentation/printk-formats.txt
> @@ -292,6 +292,20 @@ Raw pointer value SHOULD be printed with %p. The kernel 
> supports
> 
>   Passed by reference.
> 
> +Flags bitfields such as page flags, gfp_flags:
> +
> + %pgp  0x1f886c(referenced|uptodate|lru|active|private)
> + %pgg  0x24202c4(GFP_USER|GFP_DMA32|GFP_NOWARN)
> + %pgv  0x875(read|exec|mayread|maywrite|mayexec|denywrite)
> +
> + For printing raw values of flags bitfields together with symbolic
> + strings that would construct the value. The type of flags is given by
> + the third character. Currently supported are [p]age flags, [g]fp_flags
> + and [v]ma_flags. The flag names and print order depends on the
> + particular type.
> +
> + Passed by reference.
> +
> Network device features:
> 
>   %pNF0xc000
> diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
> index 3b77fab7ad28..e6518df259ca 100644
> --- a/include/linux/mmdebug.h
> +++ b/include/linux/mmdebug.h
> @@ -2,6 +2,7 @@
> #define LINUX_MM_DEBUG_H 1
> 
> #include 
> +#include 
> 
> struct page;
> struct vm_area_struct;
> @@ -10,7 +11,9 @@ struct mm_struct;
> extern void dump_page(struct page *page, const char *reason);
> extern void dump_page_badflags(struct page *page, const char *reason,
>  unsigned long badflags);
> -extern void dump_gfpflag_names(unsigned long gfp_flags);
> +extern char *format_page_flags(unsigned long flags, char *buf, char *end);
> +extern char *format_vma_flags(unsigned long flags, char *buf, char *end);
> +extern char *format_gfp_flags(gfp_t gfp_flags, char *buf, char*end);
> void dump_vma(const struct vm_area_struct *vma);
> void dump_mm(const struct mm_struct *mm);
> 
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index f9cee8e1233c..41cd122bd307 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -31,6 +31,7 @@
> #include 
> #include 
> #include 
> +#include 
> 
> #include  /* for PAGE_SIZE */
> #include  /* for dereference_function_descriptor() */
> @@ -1361,6 +1362,29 @@ char *clock(char *buf, char *end, struct clk *clk, 
> struct printf_spec spec,
>   }
> }
> 
> +static noinline_for_stack
> +char *flags_string(char *buf, char *end, void *flags_ptr,
> + struct printf_spec spec, const char *fmt)
> +{
> + unsigned long flags;
> + gfp_t gfp_flags;
> +
> + switch (fmt[1]) {
> + case 'p':
> + flags = *(unsigned long *)flags_ptr;
> + return format_page_flags(flags, buf, end);
> + case 'v':
> + flags = *(unsigned long *)flags_ptr;
> + return format_vma_flags(flags, buf, end);
> + case 'g':
> + gfp_flags = *(gfp_t *)flags_ptr;
> + return format_gfp_flags(gfp_flags, buf, end);
> + default:
> + WARN_ONCE(1, "Unsupported flags modifier: %c\n", fmt[1]);
> + return 0;
> + }
> +}
> +
> int kptr_restrict __read_mostly;
> 
> /*
> @@ -1448,6 +1472,11 @@ int kptr_restrict __read_mostly;
>  * - 'Cn' For a clock, it prints the name (Common Clock Framework) or address
>  *(legacy clock framework) of the clock
>  * - 'Cr' For a clock, it 

Re: [PATCH 1/2] mm, printk: introduce new format string for flags

2015-12-02 Thread yalin wang

> On Dec 2, 2015, at 13:04, Vlastimil Babka <vba...@suse.cz> wrote:
> 
> On 12/02/2015 06:40 PM, yalin wang wrote:
> 
> (please trim your reply next time, no need to quote whole patch here)
> 
>> I am wondering, why not make %pg* more generic,
>> not restricted to only GFP / vma flags / page flags?
>> Could we change the format like this?
>> Define a flag spec struct that includes the flag value, the trace_print_flags
>> table and some other option:
>> typedef struct {
>> unsigned long flag;
>> struct trace_print_flags *flags;
>> unsigned long option; } flag_sec;
>> flag_sec my_flag;
>> In printk we only pass it like this:
>> printk("%pg\n", &my_flag);
>> Then it can print any flags defined by the user,
>> which is more useful for other drivers.
> 
> I don't know, it sounds quite complicated given that we had no flags printing
> for years and now there's just three kinds of them. The extra struct flag_sec
> is IMHO nuisance. No other printk format needs such thing AFAIK? For example,
> if I were to print page flags from several places, each would have to define
> the struct flag_sec instance, or some header would have to provide it?
This can be avoided by providing a macro in a header file.
We can add a new struct to declare the trace_print_flags, for example:

#define DECLARE_FLAG_PRINTK_FMT(name, flags_array) \
	flag_sec name = { .flags = flags_array }
#define FLAG_PRINTK_FMT(name, val) ({ (name).flag = (val); &(name); })

In source code:

DECLARE_FLAG_PRINTK_FMT(my_flag, vmaflags_names);
printk("%pg\n", FLAG_PRINTK_FMT(my_flag, vma->flag));

I am not sure whether DECLARE_FLAG_PRINTK_FMT and FLAG_PRINTK_FMT
can be combined into one macro; that may need some trick.

Is it possible?


Thanks
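
To make the generic idea in this thread concrete, here is a minimal user-space sketch of table-driven flag printing. It is not the code from the patch: struct flag_name, struct flag_desc and format_flags() are hypothetical names invented for illustration, and struct flag_name only stands in for the kernel's trace_print_flags. The output shape follows the "0x875(read|exec|...)" examples quoted above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space stand-in for the kernel's trace_print_flags. */
struct flag_name {
	unsigned long mask;
	const char *name;
};

/* Hypothetical descriptor bundling a flags value with its name table. */
struct flag_desc {
	unsigned long value;
	const struct flag_name *names;	/* terminated by an entry with a NULL name */
};

/* Append s at offset len, truncating if needed; returns the untruncated length. */
static size_t append(char *buf, size_t size, size_t len, const char *s)
{
	size_t n = strlen(s);

	if (len < size) {
		size_t room = size - len - 1;	/* keep space for the NUL */
		size_t copy = n < room ? n : room;

		memcpy(buf + len, s, copy);
		buf[len + copy] = '\0';
	}
	return len + n;
}

/*
 * Write "<raw value>(name|name|...)" into buf. Table entries are tried in
 * order, so multi-bit masks must be listed before the single bits they
 * contain. Returns the length the full string needs, like snprintf().
 */
size_t format_flags(char *buf, size_t size, const struct flag_desc *d)
{
	unsigned long rest = d->value;
	const struct flag_name *n;
	char num[2 + 2 * sizeof(unsigned long) + 1];
	size_t len = 0;
	int first = 1;

	snprintf(num, sizeof(num), "%#lx", d->value);
	len = append(buf, size, len, num);
	len = append(buf, size, len, "(");
	for (n = d->names; n->name; n++) {
		if (!n->mask || (rest & n->mask) != n->mask)
			continue;
		if (!first)
			len = append(buf, size, len, "|");
		len = append(buf, size, len, n->name);
		rest &= ~n->mask;	/* do not report these bits again */
		first = 0;
	}
	return append(buf, size, len, ")");
}
```

A caller fills a struct flag_desc with the value and a name table and passes it by pointer, which is roughly what a generic %pg argument would have to carry. Bits that have no entry in the table only show up in the raw hexadecimal value.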



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 5/5] printk/nmi: Increase the size of the temporary buffer

2015-11-30 Thread yalin wang

> On Nov 27, 2015, at 19:09, Petr Mladek  wrote:
> 
> Testing has shown that the backtrace sometimes does not fit
> into the 4kB temporary buffer that is used in NMI context.
> 
> The warnings are gone when I double the temporary buffer size.
> 
> Note that this problem existed even in the x86-specific
> implementation that was added by the commit a9edc8809328
> ("x86/nmi: Perform a safe NMI stack trace on all CPUs").
> Nobody noticed it because it did not print any warnings.
> 
> Signed-off-by: Petr Mladek 
> ---
> kernel/printk/nmi.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/printk/nmi.c b/kernel/printk/nmi.c
> index 8af1e4016719..6111644d5f01 100644
> --- a/kernel/printk/nmi.c
> +++ b/kernel/printk/nmi.c
> @@ -42,7 +42,7 @@ atomic_t nmi_message_lost;
> struct nmi_seq_buf {
>   atomic_t                len;    /* length of written data */
>   struct irq_work work;   /* IRQ work that flushes the buffer */
> - unsigned char   buffer[PAGE_SIZE - sizeof(atomic_t) -
> + unsigned char   buffer[2 * PAGE_SIZE - sizeof(atomic_t) -
>  sizeof(struct irq_work)];
> };
> 

Why not define it like this:

union {
	struct {
		atomic_t len;
		struct irq_work work;
	};
	unsigned char buffer[PAGE_SIZE * 2];
};

That way we can make sure the union is exactly 2 * PAGE_SIZE.

Thanks
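
For reference, a small user-space sketch of how the two-page total size can be pinned down at build time. It is only an illustration under stated assumptions: SKETCH_PAGE_SIZE, struct sketch_header and struct sketch_seq_buf are hypothetical stand-ins (the header replaces atomic_t and struct irq_work), and C11 _Static_assert plays the role that BUILD_BUG_ON() would play in kernel code. Note that the members of a C union share storage, so in the union form above the byte array overlaps the header fields; the sketch therefore keeps the subtraction from the quoted patch and adds the size check.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 kB pages */

/* Hypothetical stand-in for the atomic_t + struct irq_work header. */
struct sketch_header {
	int len;
	void *work[3];
};

struct sketch_seq_buf {
	struct sketch_header hdr;
	/* Payload fills whatever is left of the two pages. */
	unsigned char buffer[2 * SKETCH_PAGE_SIZE - sizeof(struct sketch_header)];
};

/* Fails the build if the total size drifts away from two pages. */
_Static_assert(sizeof(struct sketch_seq_buf) == 2 * SKETCH_PAGE_SIZE,
	       "sketch_seq_buf must be exactly two pages");

/* Number of payload bytes available after the header. */
size_t sketch_buf_capacity(void)
{
	return sizeof(((struct sketch_seq_buf *)0)->buffer);
}
```

With this layout the payload never aliases the length or the work item, and any later change to the header that breaks the two-page size is caught at compile time.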



Re: no-op delay loops

2015-11-27 Thread yalin wang

> On Nov 27, 2015, at 16:53, Rasmus Villemoes  wrote:
> 
> Hi,
> 
> It seems that gcc happily compiles
> 
> for (i = 0; i < 10; ++i) ;
> 
> into simply
> 
> i = 10;
> 
> (which is then usually eliminated as a dead store). At least at -O2, and
> when i is not declared volatile. So it would seem that the loops at
> 
> arch/mips/pci/pci-rt2880.c:235
> arch/mips/pmcs-msp71xx/msp_setup.c:80
> arch/mips/sni/reset.c:35
> 
> actually don't do anything. (In the middle one, i is 'register', but
> that doesn't change anything.) Is mips compiled with some special flags
> that would make gcc actually emit code for the above?
> 
You can try declaring i as volatile int i;
then gcc should not optimize the loop away.

Thanks
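
A minimal sketch of the volatile suggestion, written as plain C rather than taken from any of the MIPS files mentioned above; spin_iterations() is a hypothetical name. How long such a loop actually takes still depends on the CPU and the compiler, which is why kernel code normally uses helpers such as udelay() or mdelay() for real delays.

```c
/*
 * Busy-wait for a fixed number of loop iterations. Because the counter is
 * volatile, each read and write of it is an access the compiler has to
 * perform, so the loop cannot be folded into a single "i = iterations".
 */
unsigned int spin_iterations(unsigned int iterations)
{
	volatile unsigned int i;

	for (i = 0; i < iterations; ++i)
		;	/* empty body: the counter accesses are the only work */
	return i;
}
```

Returning the counter is only there so a caller can check that the loop ran to completion.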



Re: [PATCH V2] scripts: fix the sys path for gdb scripts

2015-11-27 Thread yalin wang

> On Nov 27, 2015, at 15:04, Jan Kiszka  wrote:
> 
> On 2015-11-27 07:41, yalin wang wrote:
>> We insert __file__'s real path into sys.path,
>> so that no matter whether we import vmlinux-gdb.py from the $OUT folder or
>> from the source code folder, we can always find the linux/ lib folder.
>> We then no longer need to create links to the linux/*.py files,
>> so remove the related makefiles.
> 
> NACK again - I tell you why below.
> 
>> 
>> Signed-off-by: yalin wang 
>> ---
>> scripts/Makefile   |  1 -
>> scripts/gdb/Makefile   |  1 -
>> scripts/gdb/linux/Makefile | 11 ---
>> scripts/gdb/vmlinux-gdb.py |  2 +-
>> 4 files changed, 1 insertion(+), 14 deletions(-)
>> delete mode 100644 scripts/gdb/Makefile
>> delete mode 100644 scripts/gdb/linux/Makefile
>> 
>> diff --git a/scripts/Makefile b/scripts/Makefile
>> index 2016a64..72902b5 100644
>> --- a/scripts/Makefile
>> +++ b/scripts/Makefile
>> @@ -36,7 +36,6 @@ subdir-$(CONFIG_MODVERSIONS) += genksyms
>> subdir-y += mod
>> subdir-$(CONFIG_SECURITY_SELINUX) += selinux
>> subdir-$(CONFIG_DTC) += dtc
>> -subdir-$(CONFIG_GDB_SCRIPTS) += gdb
>> 
>> # Let clean descend into subdirs
>> subdir-  += basic kconfig package
>> diff --git a/scripts/gdb/Makefile b/scripts/gdb/Makefile
>> deleted file mode 100644
>> index 62f5f65..000
>> --- a/scripts/gdb/Makefile
>> +++ /dev/null
>> @@ -1 +0,0 @@
>> -subdir-y := linux
>> diff --git a/scripts/gdb/linux/Makefile b/scripts/gdb/linux/Makefile
>> deleted file mode 100644
>> index 6cf1ecf..000
>> --- a/scripts/gdb/linux/Makefile
>> +++ /dev/null
>> @@ -1,11 +0,0 @@
>> -always := gdb-scripts
>> -
>> -SRCTREE := $(shell cd $(srctree) && /bin/pwd)
>> -
>> -$(obj)/gdb-scripts:
>> -ifneq ($(KBUILD_SRC),)
>> -$(Q)ln -fsn $(SRCTREE)/$(obj)/*.py $(objtree)/$(obj)
>> -endif
>> -@:
>> -
>> -clean-files := *.pyc *.pyo $(if $(KBUILD_SRC),*.py)
> 
> This step I don't understand at all. Why do you want to destroy the
> possibility to automatically load the scripts? Did you read
> Documentation/gdb-kernel-debugging.txt in this regard?
> 
>> diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
>> index ce82bf5..a9029f4 100644
>> --- a/scripts/gdb/vmlinux-gdb.py
>> +++ b/scripts/gdb/vmlinux-gdb.py
>> @@ -13,7 +13,7 @@
>> 
>> import os
>> 
>> -sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
>> +sys.path.insert(0, os.path.dirname(os.path.realpath(__file__)))
> 
> This works only so far as that (if you don't destroy the link) the main
> script will still find its modules. However, *.pyc files are then
> generated in the source tree, no longer in the output dirs. The code is
> designed to prevent this.
> 
> You still don't explain to us why the existing code doesn't work for you
> and how you prefer to use it instead.
> 
> Jan
> 
Thanks for your explanation.
The reason I changed it is that I was doing cross-platform debugging:
debugging an ARM platform on an x86 host.
I only have the source code on the host; I don't build it there.
When I start gdb-arm, I want to load the gdb scripts from the source tree.
That is the usage I need.

I don't want to build the kernel on every host when I just want to debug an
embedded platform occasionally.
Thanks











[PATCH V2] scripts: fix the sys path for gdb scripts

2015-11-26 Thread yalin wang
We insert __file__'s real path into sys.path,
so that no matter whether we import vmlinux-gdb.py from the $OUT folder or
from the source code folder, we can always find the linux/ lib folder.
We then no longer need to create links to the linux/*.py files,
so remove the related makefiles.

Signed-off-by: yalin wang 
---
 scripts/Makefile   |  1 -
 scripts/gdb/Makefile   |  1 -
 scripts/gdb/linux/Makefile | 11 ---
 scripts/gdb/vmlinux-gdb.py |  2 +-
 4 files changed, 1 insertion(+), 14 deletions(-)
 delete mode 100644 scripts/gdb/Makefile
 delete mode 100644 scripts/gdb/linux/Makefile

diff --git a/scripts/Makefile b/scripts/Makefile
index 2016a64..72902b5 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -36,7 +36,6 @@ subdir-$(CONFIG_MODVERSIONS) += genksyms
 subdir-y += mod
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
 subdir-$(CONFIG_DTC) += dtc
-subdir-$(CONFIG_GDB_SCRIPTS) += gdb
 
 # Let clean descend into subdirs
 subdir-  += basic kconfig package
diff --git a/scripts/gdb/Makefile b/scripts/gdb/Makefile
deleted file mode 100644
index 62f5f65..000
--- a/scripts/gdb/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-subdir-y := linux
diff --git a/scripts/gdb/linux/Makefile b/scripts/gdb/linux/Makefile
deleted file mode 100644
index 6cf1ecf..000
--- a/scripts/gdb/linux/Makefile
+++ /dev/null
@@ -1,11 +0,0 @@
-always := gdb-scripts
-
-SRCTREE := $(shell cd $(srctree) && /bin/pwd)
-
-$(obj)/gdb-scripts:
-ifneq ($(KBUILD_SRC),)
-   $(Q)ln -fsn $(SRCTREE)/$(obj)/*.py $(objtree)/$(obj)
-endif
-   @:
-
-clean-files := *.pyc *.pyo $(if $(KBUILD_SRC),*.py)
diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
index ce82bf5..a9029f4 100644
--- a/scripts/gdb/vmlinux-gdb.py
+++ b/scripts/gdb/vmlinux-gdb.py
@@ -13,7 +13,7 @@
 
 import os
 
-sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
+sys.path.insert(0, os.path.dirname(os.path.realpath(__file__)))
 
 try:
 gdb.parse_and_eval("0")
-- 
1.9.1



Re: [PATCH] scripts: fix the sys path for gdb scripts

2015-11-26 Thread yalin wang

> On Nov 25, 2015, at 15:38, Jan Kiszka  wrote:
> 
> On 2015-11-19 11:54, yalin wang wrote:
>> The sys.path should be scripts/gdb,
>> so that we can import linux lib correctly.
>> 
>> Signed-off-by: yalin wang 
>> ---
>> scripts/gdb/vmlinux-gdb.py | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
>> index ce82bf5..5a45d1a 100644
>> --- a/scripts/gdb/vmlinux-gdb.py
>> +++ b/scripts/gdb/vmlinux-gdb.py
>> @@ -13,7 +13,7 @@
>> 
>> import os
>> 
>> -sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
>> +sys.path.insert(0, os.path.dirname(__file__))
>> 
>> try:
>> gdb.parse_and_eval("0")
>> 
> 
> NACK. This patch is assuming that vmlinux-gdb.py is (only) started from
> the scripts/gdb folder. But CONFIG_GDB_SCRIPTS places a link to
> vmlinux-gdb.py aside the vmlinux binary in the top-level folder. That
> way, the script is auto-loaded by gdb.
> 
> If you have a compelling use case for loading the script manually from
> its original folder, we can discuss augmenting the path. But removing
> the existing one is wrong.
> 
> Andrew, please drop the patch from your queue.
> 
OK, I will send a V2 patch for this.





[PATCH] sched: change nr_uninterruptible to be signed

2015-11-26 Thread yalin wang
nr_uninterruptible can become negative at runtime.
This happens when a TASK_UNINTERRUPTIBLE task is dequeued
from rq1, then woken up and queued on rq2:
rq2->nr_uninterruptible-- will then sometimes result in a
negative value, so make the counter signed.

Signed-off-by: yalin wang 
---
 kernel/sched/loadavg.c | 2 +-
 kernel/sched/sched.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index ef71590..39504c6 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -83,7 +83,7 @@ long calc_load_fold_active(struct rq *this_rq)
long nr_active, delta = 0;
 
nr_active = this_rq->nr_running;
-   nr_active += (long)this_rq->nr_uninterruptible;
+   nr_active += this_rq->nr_uninterruptible;
 
if (nr_active != this_rq->calc_load_active) {
delta = nr_active - this_rq->calc_load_active;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 84d4879..7b5f67b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -605,7 +605,7 @@ struct rq {
 * one CPU and if it got migrated afterwards it may decrease
 * it on another CPU. Always updated under the runqueue lock:
 */
-   unsigned long nr_uninterruptible;
+   long nr_uninterruptible;
 
struct task_struct *curr, *idle, *stop;
unsigned long next_balance;
-- 
1.9.1
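
The accounting described in the commit message can be illustrated with a small user-space sketch. The names struct sketch_rq, sketch_sleep_on(), sketch_wake_on() and sketch_total() are hypothetical and only model the counter; they are not scheduler code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-runqueue counter discussed above. */
struct sketch_rq {
	long nr_uninterruptible;
};

/* A task enters uninterruptible sleep while it belongs to this runqueue. */
void sketch_sleep_on(struct sketch_rq *rq)
{
	rq->nr_uninterruptible++;
}

/* A task is woken up after it was moved to this (possibly different) runqueue. */
void sketch_wake_on(struct sketch_rq *rq)
{
	rq->nr_uninterruptible--;
}

/* Only the sum over all runqueues is a meaningful task count. */
long sketch_total(const struct sketch_rq *rqs, size_t n)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += rqs[i].nr_uninterruptible;
	return sum;
}
```

If two tasks go to sleep on the first runqueue and one of them is later woken on the second, the second counter reads -1 while the total is still 1. A signed type states that intermediate value directly instead of relying on a cast at the point of use.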



Re: [PATCH v2 6/9] mm, debug: introduce dump_gfpflag_names() for symbolic printing of gfp_flags

2015-11-26 Thread yalin wang

> On Nov 25, 2015, at 18:28, Vlastimil Babka  wrote:
> 
> On 11/25/2015 09:16 AM, Joonsoo Kim wrote:
>> On Tue, Nov 24, 2015 at 01:36:18PM +0100, Vlastimil Babka wrote:
>>> --- a/include/trace/events/gfpflags.h
>>> +++ b/include/trace/events/gfpflags.h
>>> @@ -8,8 +8,8 @@
>>>  *
>>>  * Thus most bits set go first.
>>>  */
>>> -#define show_gfp_flags(flags)                                          \
>>> -   (flags) ? __print_flags(flags, "|", \
>>> +
>>> +#define __def_gfpflag_names                                            \
>>> {(unsigned long)GFP_TRANSHUGE,  "GFP_TRANSHUGE"},   \
>>> {(unsigned long)GFP_HIGHUSER_MOVABLE,   "GFP_HIGHUSER_MOVABLE"}, \
>>> {(unsigned long)GFP_HIGHUSER,   "GFP_HIGHUSER"},\
>>> @@ -19,9 +19,13 @@
>>> {(unsigned long)GFP_NOFS,   "GFP_NOFS"},\
>>> {(unsigned long)GFP_ATOMIC, "GFP_ATOMIC"},  \
>>> {(unsigned long)GFP_NOIO,   "GFP_NOIO"},\
>>> +   {(unsigned long)GFP_NOWAIT, "GFP_NOWAIT"},  \
>>> +   {(unsigned long)__GFP_DMA,  "GFP_DMA"}, \
>>> +   {(unsigned long)__GFP_DMA32,"GFP_DMA32"},   \
>>> {(unsigned long)__GFP_HIGH, "GFP_HIGH"},\
>>> {(unsigned long)__GFP_ATOMIC,   "GFP_ATOMIC"},  \
>>> {(unsigned long)__GFP_IO,   "GFP_IO"},  \
>>> +   {(unsigned long)__GFP_FS,   "GFP_FS"},  \
>>> {(unsigned long)__GFP_COLD, "GFP_COLD"},\
>>> {(unsigned long)__GFP_NOWARN,   "GFP_NOWARN"},  \
>>> {(unsigned long)__GFP_REPEAT,   "GFP_REPEAT"},  \
>>> @@ -36,8 +40,12 @@
>>> {(unsigned long)__GFP_RECLAIMABLE,  "GFP_RECLAIMABLE"}, \
>>> {(unsigned long)__GFP_MOVABLE,  "GFP_MOVABLE"}, \
>>> {(unsigned long)__GFP_NOTRACK,  "GFP_NOTRACK"}, \
>>> +   {(unsigned long)__GFP_WRITE,"GFP_WRITE"},   \
>>> {(unsigned long)__GFP_DIRECT_RECLAIM,   "GFP_DIRECT_RECLAIM"},  \
>>> {(unsigned long)__GFP_KSWAPD_RECLAIM,   "GFP_KSWAPD_RECLAIM"},  \
>>> {(unsigned long)__GFP_OTHER_NODE,   "GFP_OTHER_NODE"}   \
>>> -   ) : "GFP_NOWAIT"
>>> 
>>> +#define show_gfp_flags(flags)                                          \
>>> +   (flags) ? __print_flags(flags, "|", \
>>> +   __def_gfpflag_names \
>>> +   ) : "none"
>> 
>> How about moving this to gfp.h or something?
>> Now, we use it in out of tracepoints so there is no need to keep it
>> in include/trace/events/xxx.
> 
> Hm I didn't want to pollute such widely included header with such defines. And
> show_gfp_flags shouldn't be there definitely as it depends on __print_flags.
> What do others think?
How about adding this to the standard printk() formats?
It would be like the cpumask printing in printk(), which uses %*pb[l]
and defines a macro, cpumask_pr_args, to pass the cpumask.

We could also define a new format such as %pG, meaning "print flags".
It would then be useful for other code too, for example to dump vma / mm flags.

Thanks







Re: [PATCH] scripts: fix the sys path for gdb scripts

2015-11-26 Thread yalin wang

> On Nov 25, 2015, at 15:38, Jan Kiszka <jan.kis...@siemens.com> wrote:
> 
> On 2015-11-19 11:54, yalin wang wrote:
>> The sys.path should be scripts/gdb,
>> so that we can import linux lib correctly.
>> 
>> Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
>> ---
>> scripts/gdb/vmlinux-gdb.py | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
>> index ce82bf5..5a45d1a 100644
>> --- a/scripts/gdb/vmlinux-gdb.py
>> +++ b/scripts/gdb/vmlinux-gdb.py
>> @@ -13,7 +13,7 @@
>> 
>> import os
>> 
>> -sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
>> +sys.path.insert(0, os.path.dirname(__file__))
>> 
>> try:
>> gdb.parse_and_eval("0")
>> 
> 
> NACK. This patch is assuming that vmlinux-gdb.py is (only) started from
> the scripts/gdb folder. But CONFIG_GDB_SCRIPTS places a link to
> vmlinux-gdb.py aside the vmlinux binary in the top-level folder. That
> way, the script is auto-loaded by gdb.
> 
> If you have a compelling use case for loading the script manually from
> its original folder, we can discuss augmenting the path. But removing
> the existing one is wrong.
> 
> Andrew, please drop the patch from your queue.
> 
OK, I will send a V2 patch for this.





[PATCH] sched: change nr_uninterruptible to be signed

2015-11-26 Thread yalin wang
nr_uninterruptible can become negative at run time.
This happens when a TASK_UNINTERRUPTIBLE task is dequeued
from rq1 and is later woken up and queued on rq2:
the rq2->nr_uninterruptible-- can then result in a negative value.
Declare the field as signed to match this usage.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 kernel/sched/loadavg.c | 2 +-
 kernel/sched/sched.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index ef71590..39504c6 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -83,7 +83,7 @@ long calc_load_fold_active(struct rq *this_rq)
long nr_active, delta = 0;
 
nr_active = this_rq->nr_running;
-   nr_active += (long)this_rq->nr_uninterruptible;
+   nr_active += this_rq->nr_uninterruptible;
 
if (nr_active != this_rq->calc_load_active) {
delta = nr_active - this_rq->calc_load_active;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 84d4879..7b5f67b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -605,7 +605,7 @@ struct rq {
 * one CPU and if it got migrated afterwards it may decrease
 * it on another CPU. Always updated under the runqueue lock:
 */
-   unsigned long nr_uninterruptible;
+   long nr_uninterruptible;
 
struct task_struct *curr, *idle, *stop;
unsigned long next_balance;
-- 
1.9.1
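
The per-runqueue accounting described above can be checked with a small userspace sketch. It assumes a 64-bit unsigned long and uses two made-up runqueues; to_signed() is a hypothetical helper that mimics the (long) cast removed by the patch. Only the sum over all runqueues is meaningful, and it comes out right whether the individual counters are stored signed or unsigned.

```python
MASK = (1 << 64) - 1  # assumption: unsigned long is 64 bits wide


def to_signed(value):
    """Reinterpret a 64-bit unsigned value as signed, like a (long) cast."""
    value &= MASK
    return value - (1 << 64) if value >> 63 else value


# Two hypothetical runqueues, each with its own counter stored as unsigned.
rq = {"rq1": 0, "rq2": 0}

# A task goes into uninterruptible sleep on rq1 ...
rq["rq1"] = (rq["rq1"] + 1) & MASK
# ... and is later woken up and queued on rq2, which does the decrement.
rq["rq2"] = (rq["rq2"] - 1) & MASK

print(rq["rq2"])             # 18446744073709551615 when stored unsigned
print(to_signed(rq["rq2"]))  # -1 when read as signed

total = to_signed(sum(rq.values()))
print(total)                 # 0
```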



[PATCH V2] scripts: fix the sys path for gdb scripts

2015-11-26 Thread yalin wang
Insert the real path of __file__ into sys.path, so that no matter
whether vmlinux-gdb.py is loaded from the $OUT folder or from the
source code folder, we can always find the linux/ lib folder.
We then no longer need to create links to the linux/*.py files,
so remove the related Makefiles.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 scripts/Makefile   |  1 -
 scripts/gdb/Makefile   |  1 -
 scripts/gdb/linux/Makefile | 11 ---
 scripts/gdb/vmlinux-gdb.py |  2 +-
 4 files changed, 1 insertion(+), 14 deletions(-)
 delete mode 100644 scripts/gdb/Makefile
 delete mode 100644 scripts/gdb/linux/Makefile

diff --git a/scripts/Makefile b/scripts/Makefile
index 2016a64..72902b5 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -36,7 +36,6 @@ subdir-$(CONFIG_MODVERSIONS) += genksyms
 subdir-y += mod
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
 subdir-$(CONFIG_DTC) += dtc
-subdir-$(CONFIG_GDB_SCRIPTS) += gdb
 
 # Let clean descend into subdirs
 subdir-+= basic kconfig package
diff --git a/scripts/gdb/Makefile b/scripts/gdb/Makefile
deleted file mode 100644
index 62f5f65..000
--- a/scripts/gdb/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-subdir-y := linux
diff --git a/scripts/gdb/linux/Makefile b/scripts/gdb/linux/Makefile
deleted file mode 100644
index 6cf1ecf..000
--- a/scripts/gdb/linux/Makefile
+++ /dev/null
@@ -1,11 +0,0 @@
-always := gdb-scripts
-
-SRCTREE := $(shell cd $(srctree) && /bin/pwd)
-
-$(obj)/gdb-scripts:
-ifneq ($(KBUILD_SRC),)
-   $(Q)ln -fsn $(SRCTREE)/$(obj)/*.py $(objtree)/$(obj)
-endif
-   @:
-
-clean-files := *.pyc *.pyo $(if $(KBUILD_SRC),*.py)
diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
index ce82bf5..a9029f4 100644
--- a/scripts/gdb/vmlinux-gdb.py
+++ b/scripts/gdb/vmlinux-gdb.py
@@ -13,7 +13,7 @@
 
 import os
 
-sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
+sys.path.insert(0, os.path.dirname(os.path.realpath(__file__)))
 
 try:
 gdb.parse_and_eval("0")
-- 
1.9.1
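
The difference between the two sys.path expressions can be reproduced outside of gdb. The sketch below builds a throwaway tree in a temporary directory with a symlink next to where vmlinux would be, as in the CONFIG_GDB_SCRIPTS setup described in the review of V1. The directory layout is only an assumed stand-in for a real build tree, and creating symlinks may need extra privileges on some platforms.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Resolve the temp dir itself so later comparisons are not confused
    # by a symlinked temp location.
    top = os.path.realpath(top)

    # Stand-in for <srctree>/scripts/gdb/vmlinux-gdb.py
    script_dir = os.path.join(top, "scripts", "gdb")
    os.makedirs(script_dir)
    script = os.path.join(script_dir, "vmlinux-gdb.py")
    open(script, "w").close()

    # Stand-in for the link placed beside the vmlinux binary.
    link = os.path.join(top, "vmlinux-gdb.py")
    os.symlink(script, link)

    # What __file__ would give when the script is loaded through the link.
    via_link = os.path.dirname(link)
    # What the V2 patch computes.
    via_realpath = os.path.dirname(os.path.realpath(link))

    link_dir_is_top = via_link == top
    realpath_is_script_dir = via_realpath == script_dir

print(link_dir_is_top, realpath_is_script_dir)  # True True
```

With the link, dirname(__file__) alone points at the top-level folder, while resolving the link first lands in scripts/gdb, where the linux/ package lives.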



[RFC] block: change blk_check_merge_flags() implementation

2015-11-20 Thread yalin wang
Use XOR to check whether the relevant flags are the same in flags1
and flags2. This is much faster on some platforms.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 include/linux/blkdev.h | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c401ecd..3d0f053 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -655,16 +655,7 @@ static inline bool rq_mergeable(struct request *rq)
 static inline bool blk_check_merge_flags(unsigned int flags1,
 unsigned int flags2)
 {
-   if ((flags1 & REQ_DISCARD) != (flags2 & REQ_DISCARD))
-   return false;
-
-   if ((flags1 & REQ_SECURE) != (flags2 & REQ_SECURE))
-   return false;
-
-   if ((flags1 & REQ_WRITE_SAME) != (flags2 & REQ_WRITE_SAME))
-   return false;
-
-   return true;
+   return !((flags1 ^ flags2) & (REQ_DISCARD | REQ_SECURE | REQ_WRITE_SAME));
 }
 
 static inline bool blk_write_same_mergeable(struct bio *a, struct bio *b)
-- 
1.9.1
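
That the single XOR test gives the same answer as the three per-flag comparisons can be confirmed exhaustively in userspace. The bit positions below are placeholders chosen for the example (the real REQ_* values differ), and REQ_OTHER is a hypothetical unrelated flag that must not influence the result.

```python
from itertools import product

# Placeholder bit positions; the kernel's REQ_* values are different.
REQ_DISCARD = 1 << 0
REQ_SECURE = 1 << 1
REQ_WRITE_SAME = 1 << 2
REQ_OTHER = 1 << 3  # hypothetical flag that the check must ignore


def check_per_flag(flags1, flags2):
    """Old form: compare each relevant flag separately."""
    for flag in (REQ_DISCARD, REQ_SECURE, REQ_WRITE_SAME):
        if (flags1 & flag) != (flags2 & flag):
            return False
    return True


def check_xor(flags1, flags2):
    """New form: XOR leaves a 1 in every bit where the two words differ."""
    mask = REQ_DISCARD | REQ_SECURE | REQ_WRITE_SAME
    return not ((flags1 ^ flags2) & mask)


# Try every combination of the four bits on both sides.
mismatches = [
    (a, b)
    for a, b in product(range(16), repeat=2)
    if check_per_flag(a, b) != check_xor(a, b)
]
print(mismatches)  # []
```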



Re: [PATCH v3] arm64: Add support for PTE contiguous bit.

2015-11-20 Thread yalin wang

> On Nov 20, 2015, at 00:57, David Woods  wrote:
> 
> The arm64 MMU supports a Contiguous bit which is a hint that the TTE
> is one of a set of contiguous entries which can be cached in a single
> TLB entry.  Supporting this bit adds new intermediate huge page sizes.
> 
> The set of huge page sizes available depends on the base page size.
> Without using contiguous pages the huge page sizes are as follows.
> 
> 4KB:   2MB  1GB
> 64KB: 512MB
> 
> With a 4KB granule, the contiguous bit groups together sets of 16 pages
> and with a 64KB granule it groups sets of 32 pages.  This enables two new
> huge page sizes in each case, so that the full set of available sizes
> is as follows.
> 
> 4KB:  64KB   2MB  32MB  1GB
> 64KB:   2MB 512MB  16GB
> 
> If a 16KB granule is used then the contiguous bit groups 128 pages
> at the PTE level and 32 pages at the PMD level.
> 
> If the base page size is set to 64KB then 2MB pages are enabled by
> default.  It is possible in the future to make 2MB the default huge
> page size for both 4KB and 64KB granules.
> 
> Signed-off-by: David Woods 
> Reviewed-by: Chris Metcalf 
> ---
> 
> This patch should resolve the comments on v2 and is now based on the 
> arm64 next tree which includes 16K granule support.  I've added definitions 
> which should enable 2M and 1G huge page sizes with a 16K granule.  
> Unfortunately, the A53 model we have does not support 16K so I don't 
> have a way to test this.
> 
> arch/arm64/Kconfig |   3 -
> arch/arm64/include/asm/hugetlb.h   |  44 ++
> arch/arm64/include/asm/pgtable-hwdef.h |  18 ++-
> arch/arm64/include/asm/pgtable.h   |  10 +-
> arch/arm64/mm/hugetlbpage.c| 267 -
> include/linux/hugetlb.h|   2 -
> 6 files changed, 306 insertions(+), 38 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 40e1151..077bb7c 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -480,9 +480,6 @@ config HW_PERF_EVENTS
> config SYS_SUPPORTS_HUGETLBFS
>   def_bool y
> 
> -config ARCH_WANT_GENERAL_HUGETLB
> - def_bool y
> -
> config ARCH_WANT_HUGE_PMD_SHARE
>   def_bool y if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
> 
> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
> index bb4052e..bbc1e35 100644
> --- a/arch/arm64/include/asm/hugetlb.h
> +++ b/arch/arm64/include/asm/hugetlb.h
> @@ -26,36 +26,7 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
>   return *ptep;
> }
> 
> -static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
> -pte_t *ptep, pte_t pte)
> -{
> - set_pte_at(mm, addr, ptep, pte);
> -}
> -
> -static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
> -  unsigned long addr, pte_t *ptep)
> -{
> - ptep_clear_flush(vma, addr, ptep);
> -}
> -
> -static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
> -unsigned long addr, pte_t *ptep)
> -{
> - ptep_set_wrprotect(mm, addr, ptep);
> -}
> 
> -static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
> - unsigned long addr, pte_t *ptep)
> -{
> - return ptep_get_and_clear(mm, addr, ptep);
> -}
> -
> -static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
> -  unsigned long addr, pte_t *ptep,
> -  pte_t pte, int dirty)
> -{
> - return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
> -}
> 
> static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
> unsigned long addr, unsigned long end,
> @@ -97,4 +68,19 @@ static inline void arch_clear_hugepage_flags(struct page *page)
>   clear_bit(PG_dcache_clean, &page->flags);
> }
> 
> +extern pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
> + struct page *page, int writable);
> +#define arch_make_huge_pte arch_make_huge_pte
> +extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
> + pte_t *ptep, pte_t pte);
> +extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
> +   unsigned long addr, pte_t *ptep,
> +   pte_t pte, int dirty);
> +extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
> +  unsigned long addr, pte_t *ptep);
> +extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
> + unsigned long addr, pte_t *ptep);
> +extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
> +   unsigned long addr, pte_t *ptep);
> +
> #endif /* __ASM_HUGETLB_H */
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
> 
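
The new huge page sizes listed in the quoted description for the 4KB and 64KB granules follow from multiplying the number of entries grouped by the contiguous bit with the page size at each level. A short sketch of that arithmetic, using only the numbers given in the description (human() is a helper made up for this example):

```python
KB, MB, GB = 2**10, 2**20, 2**30


def human(size):
    """Format a byte count using the largest unit that divides it evenly."""
    for unit, name in ((GB, "GB"), (MB, "MB"), (KB, "KB")):
        if size >= unit and size % unit == 0:
            return f"{size // unit}{name}"
    return f"{size}B"


# granule -> (base page size, existing PMD-level huge page size,
#             number of entries grouped by the contiguous bit)
granules = {
    "4KB": (4 * KB, 2 * MB, 16),
    "64KB": (64 * KB, 512 * MB, 32),
}

# Contiguous PTEs and contiguous PMDs each add one new size per granule.
new_sizes = {
    name: [human(group * page), human(group * block)]
    for name, (page, block, group) in granules.items()
}
print(new_sizes)  # {'4KB': ['64KB', '32MB'], '64KB': ['2MB', '16GB']}
```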


[PATCH] scripts: fix the sys path for gdb scripts

2015-11-19 Thread yalin wang
The sys.path should be scripts/gdb,
so that we can import linux lib correctly.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 scripts/gdb/vmlinux-gdb.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
index ce82bf5..5a45d1a 100644
--- a/scripts/gdb/vmlinux-gdb.py
+++ b/scripts/gdb/vmlinux-gdb.py
@@ -13,7 +13,7 @@
 
 import os
 
-sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
+sys.path.insert(0, os.path.dirname(__file__))
 
 try:
 gdb.parse_and_eval("0")
-- 
1.9.1



Re: kernel oops on mmotm-2015-10-15-15-20

2015-11-19 Thread yalin wang

> On Nov 19, 2015, at 14:58, Kirill A. Shutemov  wrote:
> 
> uncharged
I also hit this crash.

I also hit a crash like this in qemu:


[2.703436] [] do_execveat_common.isra.36+0x4f0/0x630
[2.703624] [] do_execve+0x24/0x30
[2.703767] [] SyS_execve+0x1c/0x2c
[2.703923] BUG: Bad page map in process init  pte:604837ebd3 pmd:b29e7003
[2.704140] page:ffc07f00af80 count:2 mapcount:-1 mapping:  (null) index:0x1
[2.704414] flags: 0x4014(referenced|dirty)
[2.704563] page dumped because: bad pte
[2.704666] addr:007fafb7e000 vm_flags:00100073 anon_vma:ffc0729bdb90 mapping:  (null) index:7fafb7e
[2.704906] file:  (null) fault:  (null) mmap:  (null) readpage:  (null)
[2.705117] CPU: 0 PID: 84 Comm: init Tainted: GB   4.2.0ajb-5-g11a9bf3 #80
[2.705315] Hardware name: ranchu (DT)
[2.705408] Call trace:
[2.705488] [] dump_backtrace+0x0/0x124
[2.705657] [] show_stack+0x10/0x1c
[2.705797] [] dump_stack+0x78/0x98
[2.705971] [] print_bad_pte+0x154/0x1f0
[2.706102] [] unmap_single_vma+0x574/0x704
[2.706236] [] unmap_vmas+0x54/0x70
[2.706354] [] exit_mmap+0x88/0xfc
[2.706473] [] mmput+0x48/0xe8
[2.706584] [] flush_old_exec+0x30c/0x79c
[2.706719] [] load_elf_binary+0x21c/0x1098
[2.706856] [] search_binary_handler+0xa8/0x224
[2.706995] [] do_execveat_common.isra.36+0x4f0/0x630
[2.707144] [] do_execve+0x24/0x30
[2.707263] [] SyS_execve+0x1c/0x2c
[2.707392] BUG: Bad page map in process init  pte:604837fbd3 pmd:b29e7003
[2.707752] page:ffc07f00afc0 count:2 mapcount:-1 mapping:  (null) index:0x1
[2.708167] flags: 0x4014(referenced|dirty)
[2.708333] page dumped because: bad pte
[2.708501] addr:007fafb7f000 vm_flags:00100073 anon_vma:ffc0729bdb90 mapping:  (null) index:7fafb7f
[2.709084] file:  (null) fault:  (null) mmap:  (null) readpage:  (null)
[2.709306] CPU: 0 PID: 84 Comm: init Tainted: GB   4.2.0ajb-5-g11a9bf3 #80
[2.709494] Hardware name: ranchu (DT)

It seems the page mapcount is not correct.
My build is based on mmotm-2015-10-21-14-41.

Thanks





Re: [PATCH v2 2/3] lib: Introduce 2 bit ops api: all_is_bit_{one,zero}

2015-11-19 Thread yalin wang

> On Nov 19, 2015, at 14:48, Jia He  wrote:
> 
> 
Why not use memcmp() to compare against an all-zeros (0x000...) or all-ones (0xfff...) buffer?
memcmp() has better performance on some platforms.
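
A minimal userspace sketch of the comparison idea, assuming the region is a whole number of bytes (the helpers proposed in the patch work on bit counts, which this does not model). The function names are made up for the example.

```python
def all_bits_zero(buf: bytes) -> bool:
    """True if every bit in buf is 0: compare against an all-zero buffer."""
    return buf == bytes(len(buf))


def all_bits_one(buf: bytes) -> bool:
    """True if every bit in buf is 1: compare against an all-0xff buffer."""
    return buf == b"\xff" * len(buf)


print(all_bits_zero(bytes(8)))      # True
print(all_bits_zero(b"\x00\x01"))   # False
print(all_bits_one(b"\xff" * 4))    # True
print(all_bits_one(b"\xff\x7f"))    # False
```

One cost of this approach is that a reference buffer of the same length has to exist, so it trades memory for a simpler comparison.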

Thanks



Re: read physical address space

2015-11-17 Thread yalin wang
You should access it like this:
printk("%d\n", *(int *)kmap(phys_to_page(phys_addr)));

The physical address must be mapped to a virtual address before it is accessed.
> On Nov 17, 2015, at 23:21, alan hopes  wrote:
> 
> phys_addr



Re: [RFC] block: change to use atomic_inc_return_release()

2015-11-16 Thread yalin wang

> On Nov 17, 2015, at 11:38, Jens Axboe  wrote:
> 
> On 11/16/2015 08:24 PM, yalin wang wrote:
>> Some arch define this atomic_inc_return_release() OP.
> 
> That is a very vague commit message, you'll need a whole lot more than 
> that... A commit message is supposed to describe the reason for the change. 
> You provide no reason for the change.
> 
>> diff --git a/block/bio.c b/block/bio.c
>> index fbc558b..b251857 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -310,8 +310,7 @@ static void bio_chain_endio(struct bio *bio, int error)
>>  static inline void bio_inc_remaining(struct bio *bio)
>>  {
>>  bio->bi_flags |= (1 << BIO_CHAIN);
>> -smp_mb__before_atomic();
>> -atomic_inc(&bio->__bi_remaining);
>> +atomic_inc_return_release(&bio->__bi_remaining);
> 
> Are these equivalent? Where's the documentation for this primitive? The 
> previous code ensured that we ordered the dec of the remaining count with the 
> update of the flags.
> 
I just had a look at the ARM64 implementation of this new atomic op,
but I could not find any documentation for it in memory-barriers.txt, so I sent
this RFC to get some feedback.
atomic_inc_return_release() should have store-release class memory barrier semantics.
In this example, isn't a store-release barrier enough?
We only need to make sure that the bi_flags update can be seen by other cores
before the atomic counter is updated.
The atomic_inc_return_{release,acquire,relaxed} ops seem to be newly added to the kernel,
but I don't see many users in the code.
Can they be used to replace many of the smp_mb__before_atomic() calls?

Thanks



Re: [PATCH V4] mm: fix kernel crash in khugepaged thread

2015-11-16 Thread yalin wang

> On Nov 17, 2015, at 10:43, Steven Rostedt  wrote:
> 
> On Tue, 17 Nov 2015 10:21:47 +0800
> yalin wang  wrote:
> 
> 
>> i have not tried ,
>> just a question,
>> if you print a %s , but don’t call trace_define_field() do define this 
>> string in
>> __entry ,  how does user space perf tool to get this string info and print 
>> it ?
>> i am curious ..
>> i can try this when i have time.  and report to you .
> 
> Because the print_fmt has nothing to do with the fields. You can have
> as your print_fmt as:
> 
>   TP_printk("Message = %s", "hello dolly!")
> 
> And both userspace and the kernel will process that correctly (if I got
> string processing working in userspace, which I believe I do). The
> string is processed, it's not dependent on TP_STRUCT__entry() unless it
> references a field there. Which can also be used too:
> 
>   TP_printk("Message = %s", __entry->musical ? "Hello dolly!" :
>   "Death Trap!")
> 
> userspace will see in the entry:
> 
> print_fmt: "Message = %s", REC->musical ? "Hello dolly!" : "Death Trap!"
> 
> as long as the field "musical" exists, all is well.
> 
> -- Steve
Aha, I see.
Thanks very much for your explanation.
A better print fmt would be:
TP_printk("mm=%p, scan_pfn=%s, writable=%d, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
          __entry->mm,
          __entry->pfn == (-1UL) ? "(null)" : itoa(buff, __entry->pfn, 10), ...)

Is this possible?

Thanks










[RFC] block: change to use atomic_inc_return_release()

2015-11-16 Thread yalin wang
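Some architectures define the atomic_inc_return_release() op. Use it in
bio_inc_remaining() in place of smp_mb__before_atomic() + atomic_inc().

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>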
---
 block/bio.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fbc558b..b251857 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -310,8 +310,7 @@ static void bio_chain_endio(struct bio *bio, int error)
 static inline void bio_inc_remaining(struct bio *bio)
 {
bio->bi_flags |= (1 << BIO_CHAIN);
-   smp_mb__before_atomic();
-   atomic_inc(&bio->__bi_remaining);
+   atomic_inc_return_release(&bio->__bi_remaining);
 }
 
 /**
-- 
1.9.1



Re: [PATCH V4] mm: fix kernel crash in khugepaged thread

2015-11-16 Thread yalin wang

> On Nov 16, 2015, at 22:25, Steven Rostedt  wrote:
> 
> On Mon, 16 Nov 2015 11:16:22 +0100
> Vlastimil Babka  wrote:
>> 
>>>> -- Steve
>>> it is not easy to print for perf tools in userspace ,
>>> if you use this format ,
>>> for user space perf tool, it print the entry by look up the member in entry 
>>> struct by offset ,
>>> you print a dynamic string which user space perf tool don’t know how to 
>>> print this string .  
>> 
>> Does it work through trace-cmd?
> 
> The two use the same code. If it works in one, it will work in the
> other.
> 
> -- Steve
> 
I have not tried it yet; this is just a question.
If you print a %s but don't call trace_define_field() to define this string in
__entry, how does the userspace perf tool get this string and print it?
I am curious.
I can try this when I have time and report back to you.

Thanks


Re: [PATCH V4] mm: fix kernel crash in khugepaged thread

2015-11-16 Thread yalin wang

> On Nov 17, 2015, at 10:43, Steven Rostedt <rost...@goodmis.org> wrote:
> 
> On Tue, 17 Nov 2015 10:21:47 +0800
> yalin wang <yalin.wang2...@gmail.com> wrote:
> 
> 
>> i have not tried ,
>> just a question,
>> if you print a %s , but don’t call trace_define_field() do define this 
>> string in
>> __entry ,  how does user space perf tool to get this string info and print 
>> it ?
>> i am curious ..
>> i can try this when i have time.  and report to you .
> 
> Because the print_fmt has nothing to do with the fields. You can have
> as your print_fmt as:
> 
>   TP_printk("Message = %s", "hello dolly!")
> 
> And both userspace and the kernel with process that correctly (if I got
> string processing working in userspace, which I believe I do). The
> string is processed, it's not dependent on TP_STRUCT__entry() unless it
> references a field there. Which can also be used too:
> 
>   TP_printk("Message = %s", __entry->musical ? "Hello dolly!" :
>   "Death Trap!")
> 
> userspace will see in the entry:
> 
> print_fmt: "Message = %s", REC->musical ? "Hello dolly!" : "Death Trap!"
> 
> as long as the field "musical" exists, all is well.
> 
> -- Steve
Aha, I see.
Thanks very much for your explanation.
A better print format would be:
TP_printk("mm=%p, scan_pfn=%s, writable=%d, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
   __entry->mm,
   __entry->pfn == (-1UL) ? "(null)" : itoa(buff, __entry->pfn, 10), …..)

Is this possible?

Thanks
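
For illustration only: itoa() is not part of standard C, so the call above
reads best as pseudocode. A minimal user-space sketch of the same intent,
assuming snprintf() as the formatter, could look like the following.
format_scan_pfn() and PFN_NONE are hypothetical names introduced here, not
kernel or perf API, and the sketch says nothing about whether a user-space
trace parser could evaluate such a call inside print_fmt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PFN_NONE (-1UL)	/* hypothetical sentinel meaning "no page" */

/*
 * Format a pfn for display: "(null)" for the sentinel, hex otherwise.
 * Returns buf so the call can be used directly as a "%s" argument.
 */
static const char *format_scan_pfn(char *buf, size_t len, unsigned long pfn)
{
	assert(buf != NULL && len > 0);
	if (pfn == PFN_NONE)
		snprintf(buf, len, "(null)");
	else
		snprintf(buf, len, "0x%lx", pfn);
	return buf;
}
```

A caller would pass a small stack buffer, for example
printf("scan_pfn=%s", format_scan_pfn(buf, sizeof(buf), pfn)).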










Re: [RFC] block: change to use atomic_inc_return_release()

2015-11-16 Thread yalin wang

> On Nov 17, 2015, at 11:38, Jens Axboe <ax...@kernel.dk> wrote:
> 
> On 11/16/2015 08:24 PM, yalin wang wrote:
>> Some arch define this atomic_inc_return_release() OP.
> 
> That is a very vague commit message, you'll need a whole lot more than 
> that... A commit message is supposed to describe the reason for the change. 
> You provide no reason for the change.
> 
>> diff --git a/block/bio.c b/block/bio.c
>> index fbc558b..b251857 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -310,8 +310,7 @@ static void bio_chain_endio(struct bio *bio, int error)
>>  static inline void bio_inc_remaining(struct bio *bio)
>>  {
>>  bio->bi_flags |= (1 << BIO_CHAIN);
>> -smp_mb__before_atomic();
>> -atomic_inc(&bio->__bi_remaining);
>> +atomic_inc_return_release(&bio->__bi_remaining);
> 
> Are these equivalent? Where's the documentation for this primitive? The 
> previous code ensured that we ordered the dec of the remaining count with the 
> update of the flags.
> 
I just had a look at the ARM64 implementation of this new atomic op,
but I can't find documentation for it in memory-barriers.txt, so I sent this
RFC to get some feedback.
atomic_inc_return_release() should have store-release class memory barrier
semantics.
In this example, isn't a store-release barrier enough?
It only needs to make sure the bi_flags update can be seen by other cores
before the atomic counter update.
The atomic_inc_return_{release,acquire,relaxed} ops seem to be newly added to
the kernel, but I don't see many users in the code.
Can they be used to replace many of the smp_mb__before_atomic() calls?

Thanks
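
To make the question concrete, here is a user-space analogy written with C11
atomics. It is only a sketch under stated assumptions: C11 <stdatomic.h> is
not the kernel's atomic API and the two memory models differ, and struct
fake_bio, FAKE_BIO_CHAIN and the helper names are hypothetical. It shows the
pattern being asked about: a plain flag store published by a release
increment and observed through an acquire load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FAKE_BIO_CHAIN (1u << 0)	/* hypothetical bit, not the kernel's BIO_CHAIN */

struct fake_bio {
	unsigned int flags;	/* plain, non-atomic field, like bi_flags */
	atomic_int remaining;	/* stands in for __bi_remaining */
};

static void fake_bio_init(struct fake_bio *bio)
{
	assert(bio != NULL);
	bio->flags = 0;
	atomic_init(&bio->remaining, 1);
}

/* Writer: set the flag, then publish it with a release increment.
 * Returns the new counter value. */
static int fake_bio_inc_remaining(struct fake_bio *bio)
{
	bio->flags |= FAKE_BIO_CHAIN;
	return atomic_fetch_add_explicit(&bio->remaining, 1,
					 memory_order_release) + 1;
}

/* Reader: an acquire load that observes the incremented counter also
 * observes the earlier flag store. Returns 1 or 0 for the flag, or -1
 * if the increment has not been observed yet. */
static int fake_bio_chain_visible(struct fake_bio *bio)
{
	if (atomic_load_explicit(&bio->remaining, memory_order_acquire) < 2)
		return -1;
	return (bio->flags & FAKE_BIO_CHAIN) ? 1 : 0;
}
```

A release operation is generally weaker than a full barrier, so the two forms
are not automatically interchangeable; whether release ordering is enough
depends on what the reader side relies on.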



[RFC] block: change to use atomic_inc_return_release()

2015-11-16 Thread yalin wang
Some arch define this atomic_inc_return_release() OP.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 block/bio.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fbc558b..b251857 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -310,8 +310,7 @@ static void bio_chain_endio(struct bio *bio, int error)
 static inline void bio_inc_remaining(struct bio *bio)
 {
bio->bi_flags |= (1 << BIO_CHAIN);
-   smp_mb__before_atomic();
-   atomic_inc(&bio->__bi_remaining);
+   atomic_inc_return_release(&bio->__bi_remaining);
 }
 
 /**
-- 
1.9.1




Re: [PATCH v3 01/17] mm: support madvise(MADV_FREE)

2015-11-15 Thread yalin wang

> On Nov 16, 2015, at 10:13, Minchan Kim  wrote:
> 
> On Fri, Nov 13, 2015 at 11:46:07AM -0800, Andy Lutomirski wrote:
>> On Fri, Nov 13, 2015 at 12:13 AM, Daniel Micay  wrote:
>>> On 13/11/15 02:03 AM, Minchan Kim wrote:
>>>> On Fri, Nov 13, 2015 at 01:45:52AM -0500, Daniel Micay wrote:
>>>>>> And now I am thinking if we use access bit, we could implement
>>>>>> MADV_FREE_UNDO
>>>>>> easily when we need it. Maybe, that's what you want. Right?
>>>>> 
>>>>> Yes, but why the access bit instead of the dirty bit for that? It could
>>>>> always be made more strict (i.e. access bit) in the future, while going
>>>>> the other way won't be possible. So I think the dirty bit is really the
>>>>> more conservative choice since if it turns out to be a mistake it can be
>>>>> fixed without a backwards incompatible change.
>>>> 
>>>> Absolutely true. That's why I insist on dirty bit until now although
>>>> I didn't tell the reason. But I thought you wanted to change for using
>>>> access bit for the future, too. It seems MADV_FREE start to bloat
>>>> over and over again before knowing real problems and usecases.
>>>> It's almost same situation with volatile ranges so I really want to
>>>> stop at proper point which maintainer should decide, I hope.
>>>> Without it, we will make the feature a lot heavy by just brain storming
>>>> and then causes lots of churn in MM code without real benefit
>>>> It would be very painful for us.
>>> 
>>> Well, I don't think you need more than a good API and an implementation
>>> with no known bugs, kernel security concerns or backwards compatibility
>>> issues. Configuration and API extensions are something for later (i.e.
>>> land a baseline, then submit stuff like sysctl tunables). Just my take
>>> on it though...
>>> 
>> 
>> As long as it's anonymous MAP_PRIVATE only, then the security aspects
>> should be okay.  MADV_DONTNEED seems to work on pretty much any VMA,
>> and there's been long history of interesting bugs there.
>> 
>> As for dirty vs accessed, an argument in favor of going straight to
>> accessed is that it means that users can write code like this without
>> worrying about whether they have a kernel that uses the dirty bit:
>> 
>> x = mmap(...);
>> *x = 1;  /* mark it present */
>> 
>> /* i'm done with it */
>> *x = 1;
>> madvise(MADV_FREE, x, ...);
>> 
>> wait a while;
>> 
>> /* is it still there? */
>> if (*x == 1) {
>>  /* use whatever was cached there */
>> } else {
>> /* reinitialize it */
>> *x = 1;
>> }
>> 
>> With the dirty bit, this will look like it works, but on occasion
>> users will lose the race where they probe *x to see if the data was
>> lost and then the data gets lost before the next write comes in.
>> 
>> Sure, that load from *x could be changed to RMW or users could do a
>> dummy write (e.g. x[1] = 1; if (*x == 1) ...), but people might forget
>> to do that, and the caching implications are a little bit worse.
> 
> I think your example is the case what people abuse MADV_FREE.
> What happens if the object(ie, x) spans multiple pages?
> User should know object's memory align and investigate all of pages
> which span the object. Hmm, I don't think it's good for API.
> 
>> 
>> Note that switching to RMW is really really dangerous.  Doing:
>> 
>> *x &= 1;
>> if (*x == 1) ...;
>> 
>> is safe on x86 if the compiler generates:
>> 
>> andl $1, (%[x]);
>> cmpl $1, (%[x]);
>> 
>> but is unsafe if the compiler generates:
>> 
>> movl (%[x]), %eax;
>> andl $1, %eax;
>> movl %eax, (%[x]);
>> cmpl $1, %eax;
>> 
>> and even worse if the write is omitted when "provably" unnecessary.
>> 
>> OTOH, if switching to the accessed bit is too much of a mess, then
>> using the dirty bit at first isn't so bad.
> 
> Thanks! I want to use dirty bit first.
> 
> About access bit, I don't want to say it to mess but I guess it would
> change a lot subtle thing for all architectures. Because we have used
> access bit as just *hint* for aging while dirty bit is really
> *critical marker* for system integrity. A example in x86, we don't
> keep accuracy of access bit for reducing TLB flush IPI. I don't know
> what technique other arches have used but they might have.
> 
> Thanks.
> 
I think using the access bit is not easy to implement for anonymous pages in
the kernel.
We know an anonymous page is always PageDirty() if it is !PageSwapCache(),
unless it is a MADV_FREE page.
But with the access bit, how do we distinguish a normal anonymous page from a
MADV_FREE page?
It can be implemented with the access bit, but it is not easy and needs more
code changes.

Thanks
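
For reference, the dirty-bit usage pattern discussed in the quoted text can
be sketched as a small Linux user-space function. This is an illustration
under assumptions, not code from the patch series: madv_free_probe() is a
hypothetical name, the sketch uses a single page, and it simply ignores a
failing madvise() (for example on a kernel or libc without MADV_FREE). It
performs the dummy write before the read, as suggested above.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and MADV_FREE on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if the old contents survived, 0 if the page had been
 * reclaimed and was reinitialized, -1 on setup failure.
 */
static int madv_free_probe(void)
{
	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -1;
	size_t len = (size_t)page;

	volatile int *x = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (x == MAP_FAILED)
		return -1;

	x[0] = 1;			/* the cached value */
#ifdef MADV_FREE
	/* Argument order is address, length, advice. A failure is
	 * ignored in this sketch. */
	(void)madvise((void *)x, len, MADV_FREE);
#endif

	x[1] = 1;			/* dummy write first: redirties the page */
	int cached = (x[0] == 1);	/* read only after the write */
	if (!cached)
		x[0] = 1;		/* contents were dropped: reinitialize */

	munmap((void *)x, len);
	return cached;
}
```

Both return values 0 and 1 are valid outcomes; which one is seen depends on
whether the kernel reclaimed the page before the dummy write.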



 






[PATCH V2] mm: change mm_vmscan_lru_shrink_inactive() proto types

2015-11-15 Thread yalin wang
Move the node_id, zone_idx and shrink flags calculation into the trace function,
so that we don't need to calculate these args if the trace is disabled,
and so that this function takes fewer arguments.

Signed-off-by: yalin wang 
---
 include/trace/events/vmscan.h | 14 +++---
 mm/vmscan.c   |  7 ++-
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index dae7836..31763dd 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -352,11 +352,11 @@ TRACE_EVENT(mm_vmscan_writepage,
 
 TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 
-   TP_PROTO(int nid, int zid,
-   unsigned long nr_scanned, unsigned long nr_reclaimed,
-   int priority, int reclaim_flags),
+   TP_PROTO(struct zone *zone,
+   unsigned long nr_scanned, unsigned long nr_reclaimed,
+   int priority, int file),
 
-   TP_ARGS(nid, zid, nr_scanned, nr_reclaimed, priority, reclaim_flags),
+   TP_ARGS(zone, nr_scanned, nr_reclaimed, priority, file),
 
TP_STRUCT__entry(
__field(int, nid)
@@ -368,12 +368,12 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
),
 
TP_fast_assign(
-   __entry->nid = nid;
-   __entry->zid = zid;
+   __entry->nid = zone_to_nid(zone);
+   __entry->zid = zone_idx(zone);
__entry->nr_scanned = nr_scanned;
__entry->nr_reclaimed = nr_reclaimed;
__entry->priority = priority;
-   __entry->reclaim_flags = reclaim_flags;
+   __entry->reclaim_flags = trace_shrink_flags(file);
),
 
TP_printk("nid=%d zid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d 
flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 69ca1f5..f8fc8c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1691,11 +1691,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct 
lruvec *lruvec,
current_may_throttle())
wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);
 
-   trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id,
-   zone_idx(zone),
-   nr_scanned, nr_reclaimed,
-   sc->priority,
-   trace_shrink_flags(file));
+   trace_mm_vmscan_lru_shrink_inactive(zone, nr_scanned, nr_reclaimed,
+   sc->priority, file);
return nr_reclaimed;
 }
 
-- 
1.9.1
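
The idea behind the change can be shown with a small user-space analogy. This
is a sketch under assumptions, not the kernel tracing machinery: struct
fake_zone, trace_enabled, derive_calls and the function names are all
hypothetical. It only demonstrates that when the derived values are computed
inside the trace function, after the enabled check, a disabled trace costs no
derivation work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct fake_zone { int nid; int idx; };	/* hypothetical stand-in for struct zone */

static bool trace_enabled;	/* stands in for the tracepoint's enable state */
static int derive_calls;	/* counts how often derived values are computed */

static int fake_zone_to_nid(const struct fake_zone *zone)
{
	derive_calls++;
	return zone->nid;
}

/* The caller passes the raw zone pointer; nid is derived only when the
 * trace is enabled. Returns the number of characters written, or 0 when
 * tracing is off. */
static int trace_shrink_inactive(const struct fake_zone *zone,
				 unsigned long nr_scanned,
				 char *out, size_t len)
{
	assert(zone != NULL && out != NULL && len > 0);
	if (!trace_enabled)
		return 0;
	return snprintf(out, len, "nid=%d zid=%d nr_scanned=%lu",
			fake_zone_to_nid(zone), zone->idx, nr_scanned);
}
```

With trace_enabled left false, derive_calls stays at zero no matter how often
trace_shrink_inactive() is called.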



Re: [PATCH] mm: change mm_vmscan_lru_shrink_inactive() proto types

2015-11-15 Thread yalin wang

> On Nov 13, 2015, at 21:16, Vlastimil Babka  wrote:
> 
> zone_to_nid
Makes sense.
I will send a V2 patch.


Re: [PATCH] mm: change may_enter_fs check condition

2015-11-15 Thread yalin wang

> On Nov 13, 2015, at 23:36, Michal Hocko  wrote:
> 
> On Fri 13-11-15 13:01:16, Vlastimil Babka wrote:
>> On 11/13/2015 12:47 PM, yalin wang wrote:
>>> Add page_is_file_cache() for __GFP_FS check,
>>> otherwise, a Pageswapcache() && PageDirty() page can always be write
>>> back if the gfp flag is __GFP_FS, this is not the expected behavior.
>> 
>> I'm not sure I understand your point correctly *), but you seem to imply
>> that there would be an allocation that has __GFP_FS but doesn't have
>> __GFP_IO? Are there such allocations and does it make sense?
> 
> No it doesn't. There is a natural layering here and __GFP_FS allocations
> should contain __GFP_IO.
> 
> The patch as is makes only little sense to me. Are you seeing any issue
> which this is trying to fix?
Hmm..
I don't see an actual issue with this part.
I just felt confused when I read this code,
so I made a patch for it.
I am not sure whether __GFP_FS guarantees that the __GFP_IO flag is always set.
If it does, I think we can add a comment here to make that clear. :)

Thanks
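
The layering described in the quoted reply can be written down as a small
user-space sketch. The flag values and helper names below are hypothetical
(they are not the kernel's __GFP_* definitions), and the may_enter_fs
expression is reproduced from memory of mm/vmscan.c at the time, so please
treat it as an assumption rather than a quotation.

```c
#include <assert.h>
#include <stdbool.h>

#define FAKE_GFP_IO 0x1u	/* hypothetical value: may start physical I/O */
#define FAKE_GFP_FS 0x2u	/* hypothetical value: may call into the filesystem */

/* Layering rule: a mask that allows FS must also allow IO. */
static bool gfp_layering_ok(unsigned int gfp_mask)
{
	return !(gfp_mask & FAKE_GFP_FS) || (gfp_mask & FAKE_GFP_IO);
}

/* Swap-cache pages need only IO; everything else needs FS. */
static bool may_enter_fs(unsigned int gfp_mask, bool page_in_swap_cache)
{
	assert(gfp_layering_ok(gfp_mask));
	return (gfp_mask & FAKE_GFP_FS) ||
	       (page_in_swap_cache && (gfp_mask & FAKE_GFP_IO));
}
```

Under this rule a mask with FS set always has IO set too, so for a swap-cache
page both halves of the expression agree whenever FS is present.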



Re: [PATCH V4] mm: fix kernel crash in khugepaged thread

2015-11-15 Thread yalin wang

> On Nov 13, 2015, at 22:01, Steven Rostedt  wrote:
> 
> On Fri, 13 Nov 2015 19:54:11 +0800
> yalin wang  wrote:
> 
>>>>>   TP_fast_assign(
>>>>>   __entry->mm = mm;
>>>>> - __entry->pfn = pfn;
>>>>> + __entry->pfn = page_to_pfn(page);  
>>>> 
>>>> Instead of the condition, we could have:
>>>> 
>>>>__entry->pfn = page ? page_to_pfn(page) : -1;  
>>> 
>>> I agree. Please do it like this.  
> 
> hmm, pfn is defined as an unsigned long, would -1 be the best.
> Or should it be (-1UL).
> 
> Then we could also have:
> 
>TP_printk("mm=%p, scan_pfn=0x%lx%s, writable=%d, referenced=%d, 
> none_or_zero=%d, status=%s, unmapped=%d",
>__entry->mm,
>__entry->pfn == (-1UL) ? 0 : __entry->pfn,
>   __entry->pfn == (-1UL) ? "(null)" : "",
> 
> Note the added %s after %lx I have in the print format.
> 
> -- Steve
This format is not easy for the perf tool to print in user space.
The user-space perf tool prints an entry by looking up each member of the
entry struct by offset,
so if you print a dynamic string, the user-space perf tool doesn't know how to
print it.

Thanks












[PATCH V6] mm: fix kernel crash in khugepaged thread

2015-11-13 Thread yalin wang
This crash is caused by a NULL pointer dereference in the page_to_pfn() macro,
when page == NULL:

[  182.639154 ] Unable to handle kernel NULL pointer dereference at virtual 
address 
[  182.639491 ] pgd = ffc00077a000
[  182.639761 ] [] *pgd=b9422003, *pud=b9422003, 
*pmd=b9423003, *pte=006008000707
[  182.640749 ] Internal error: Oops: 9406 [#1] SMP
[  182.641197 ] Modules linked in:
[  182.641580 ] CPU: 1 PID: 26 Comm: khugepaged Tainted: GW   
4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3
[  182.642077 ] Hardware name: linux,dummy-virt (DT)
[  182.642227 ] task: ffc07957c080 ti: ffc079638000 task.ti: 
ffc079638000
[  182.642598 ] PC is at khugepaged+0x378/0x1af8
[  182.642826 ] LR is at khugepaged+0x418/0x1af8
[  182.643047 ] pc : [] lr : [] pstate: 
6145
[  182.643490 ] sp : ffc07963bca0
[  182.643650 ] x29: ffc07963bca0 x28: ffc00075c000
[  182.644024 ] x27: ffc00f275040 x26: ffc0006c7000
[  182.644334 ] x25: 00e848800f51 x24: 0640
[  182.644687 ] x23: 0002 x22: 
[  182.644972 ] x21:  x20: 
[  182.645446 ] x19:  x18: 007ff86d0990
[  182.645931 ] x17: 007ef9c8 x16: ffc98390
[  182.646236 ] x15:  x14: 
[  182.646649 ] x13: 016a x12: 
[  182.647046 ] x11: ffc07f025020 x10: 
[  182.647395 ] x9 : 0048 x8 : ffc000721e28
[  182.647872 ] x7 :  x6 : ffc07f02d000
[  182.648261 ] x5 : fe00 x4 : ffc00f275040
[  182.648611 ] x3 :  x2 : ffc00f2ad000
[  182.648908 ] x1 :  x0 : ffc000727000
[  182.649147 ]
[  182.649252 ] Process khugepaged (pid: 26, stack limit = 0xffc079638020)
[  182.649724 ] Stack: (0xffc07963bca0 to 0xffc07963c000)
[  182.650141 ] bca0: ffc07963be30 ffcb5044 ffc07961fb80 
ffc00072e630
[  182.650587 ] bcc0: ffc0005d5090  ffc000197d34 

[  182.651009 ] bce0:    

[  182.651446 ] bd00: ffc07963bd90 ffc07f1cbf80 4f3be003 
ffc00f2750a4
[  182.651956 ] bd20: ffc00f3bf000 ffc1 0001 
ffc07f085740
[  182.652520 ] bd40: ffc00f2ad188 ffc0 0620 
ffc00f275040
[  182.652972 ] bd60: ffc0006b1a90 ffc079638000 ffc07963be20 
ffc00f0144d0
[  182.653357 ] bd80: ffc0 0640 ffc00f0144d0 
0a080001
[  182.653793 ] bda0: 1001 ffc1 ffc07f025000 
ffc00f2750a8
[  182.654226 ] bdc0: 000105f8 ffc00075a000 06a0 
ffc000727000
[  182.654522 ] bde0: ffc0006e8478 ffc0 0001 
ffc078fb9000
[  182.654869 ] be00: ffc07963be30 ffc0 ffc07957c080 
ffccfc4c
[  182.655225 ] be20: ffc07963be20 ffc07963be20  
ffc85c50
[  182.655588 ] be40: ffcb4f64 ffc07961fb80  

[  182.656138 ] be60:  ffcbee2c ffcb4f64 

[  182.656609 ] be80:    

[  182.657145 ] bea0: ffc07963bea0 ffc07963bea0  
ffc0
[  182.657475 ] bec0: ffc07963bec0 ffc07963bec0  

[  182.657922 ] bee0:    

[  182.658558 ] bf00:    

[  182.658972 ] bf20:    

[  182.659291 ] bf40:    

[  182.659722 ] bf60:    

[  182.660122 ] bf80:    

[  182.660654 ] bfa0:    

[  182.661064 ] bfc0:    
0005
[  182.661466 ] bfe0:    

[  182.661848 ] Call trace:
[  182.662050 ] [] khugepaged+0x378/0x1af8
[  182.662294 ] [] kthread+0xdc/0xf4
[  182.662605 ] [] ret_from_fork+0xc/0x40
[  182.663046 ] Code: 35001700 f0002c60 aa0703e3 f9009fa0 (f94000e0)
[  182.663901 ] ---[ end trace 637503d8e28ae69e  ]---
[  182.664160 ] Kernel panic - not syncing: Fatal exception
[  182.664571 ] CPU2: stopping
[  182.664794 ] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G  D W   
4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3
[  182.665248 ] Hardware name: linux,dummy-virt (DT)

Signed-off-by: yalin wang 
---
 include/trace

[PATCH V5] mm: fix kernel crash in khugepaged thread

2015-11-13 Thread yalin wang
 trace NULL page.

Signed-off-by: yalin wang 
---
 include/trace/events/huge_memory.h | 12 ++--
 mm/huge_memory.c   |  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/huge_memory.h 
b/include/trace/events/huge_memory.h
index 11c59ca..bfcf4a1 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -47,10 +47,10 @@ SCAN_STATUS
 
 TRACE_EVENT(mm_khugepaged_scan_pmd,
 
-   TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
+   TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
 bool referenced, int none_or_zero, int status, int unmapped),
 
-   TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
+   TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),
 
TP_STRUCT__entry(
__field(struct mm_struct *, mm)
@@ -64,7 +64,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
 
TP_fast_assign(
__entry->mm = mm;
-   __entry->pfn = pfn;
+   __entry->pfn = page ? page_to_pfn(page) : -1;
__entry->writable = writable;
__entry->referenced = referenced;
__entry->none_or_zero = none_or_zero;
@@ -108,10 +108,10 @@ TRACE_EVENT(mm_collapse_huge_page,
 
 TRACE_EVENT(mm_collapse_huge_page_isolate,
 
-   TP_PROTO(unsigned long pfn, int none_or_zero,
+   TP_PROTO(struct page *page, int none_or_zero,
 bool referenced, bool  writable, int status),
 
-   TP_ARGS(pfn, none_or_zero, referenced, writable, status),
+   TP_ARGS(page, none_or_zero, referenced, writable, status),
 
TP_STRUCT__entry(
__field(unsigned long, pfn)
@@ -122,7 +122,7 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
),
 
TP_fast_assign(
-   __entry->pfn = pfn;
+   __entry->pfn = page ? page_to_pfn(page) : -1;
__entry->none_or_zero = none_or_zero;
__entry->referenced = referenced;
__entry->writable = writable;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 67b00a1..fb3c4f8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1977,7 +1977,7 @@ static int __collapse_huge_page_isolate(struct 
vm_area_struct *vma,
if (likely(writable)) {
if (likely(referenced)) {
result = SCAN_SUCCEED;
-   trace_mm_collapse_huge_page_isolate(page_to_pfn(page), 
none_or_zero,
+   trace_mm_collapse_huge_page_isolate(page, none_or_zero,
referenced, 
writable, result);
return 1;
}
@@ -1987,7 +1987,7 @@ static int __collapse_huge_page_isolate(struct 
vm_area_struct *vma,
 
 out:
release_pte_pages(pte, _pte);
-   trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+   trace_mm_collapse_huge_page_isolate(page, none_or_zero,
referenced, writable, result);
return 0;
 }
@@ -2530,7 +2530,7 @@ out_unmap:
collapse_huge_page(mm, address, hpage, vma, node);
}
 out:
-   trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
+   trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
 none_or_zero, result, unmapped);
return ret;
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH V4] mm: fix kernel crash in khugepaged thread

2015-11-13 Thread yalin wang

> On Nov 13, 2015, at 18:47, Vlastimil Babka  wrote:
> 
> On 11/12/2015 03:29 PM, Steven Rostedt wrote:
>> On Thu, 12 Nov 2015 16:21:02 +0800
>> yalin wang  wrote:
>> 
>>> This crash is caused by a NULL pointer dereference in the page_to_pfn() macro
>>> when page == NULL:
>>> 
>>> [  182.639154 ] Unable to handle kernel NULL pointer dereference at virtual address
>> 
>> 
>>> add the trace point with TP_CONDITION(page),
>> 
>> I wonder if we still want to trace even if page is NULL?
> 
> I'd say we want to. There's even a "SCAN_PAGE_NULL" result defined for that 
> case, and otherwise we would only have to guess why collapsing failed, which 
> is the thing that the tracepoint should help us find out in the first place :)
> 
>>> to avoid tracing a NULL page.
>>> 
>>> Signed-off-by: yalin wang 
>>> ---
>>>  include/trace/events/huge_memory.h | 20 ++++++++++++--------
>>>  mm/huge_memory.c                   |  6 +++---
>>>  2 files changed, 15 insertions(+), 11 deletions(-)
>>> 
>>> diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
>>> index 11c59ca..727647b 100644
>>> --- a/include/trace/events/huge_memory.h
>>> +++ b/include/trace/events/huge_memory.h
>>> @@ -45,12 +45,14 @@ SCAN_STATUS
>>>  #define EM(a, b)   {a, b},
>>>  #define EMe(a, b)  {a, b}
>>> 
>>> -TRACE_EVENT(mm_khugepaged_scan_pmd,
>>> +TRACE_EVENT_CONDITION(mm_khugepaged_scan_pmd,
>>> 
>>> -   TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
>>> +   TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
>>>  bool referenced, int none_or_zero, int status, int unmapped),
>>> 
>>> -   TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
>>> +   TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),
>>> +
>>> +   TP_CONDITION(page),
>>> 
>>> TP_STRUCT__entry(
>>> __field(struct mm_struct *, mm)
>>> @@ -64,7 +66,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
>>> 
>>> TP_fast_assign(
>>> __entry->mm = mm;
>>> -   __entry->pfn = pfn;
>>> +   __entry->pfn = page_to_pfn(page);
>> 
>> Instead of the condition, we could have:
>> 
>>  __entry->pfn = page ? page_to_pfn(page) : -1;
> 
> I agree. Please do it like this.
OK, I will send a V5 patch.

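For readers following the thread, here is a minimal userspace sketch of the guarded assignment suggested above. It is not kernel code: struct page, mem_map and this page_to_pfn() are hypothetical stand-ins, and trace_pfn() is a name made up for the illustration. It only shows that the ternary records -1 (converted to unsigned long) for a NULL page instead of handing NULL to page_to_pfn().

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page; illustration only. */
struct page {
	int unused;
};

/* Hypothetical flat memory map, so that a pfn is simply an array index. */
static struct page mem_map[16];

/* Mock of page_to_pfn(): only meaningful for a pointer into mem_map. */
static unsigned long page_to_pfn(const struct page *page)
{
	return (unsigned long)(page - mem_map);
}

/*
 * The guarded form from the review: convert a valid page, and record
 * -1 (all bits set once stored in an unsigned long) when page is NULL.
 */
static unsigned long trace_pfn(const struct page *page)
{
	return page ? page_to_pfn(page) : (unsigned long)-1;
}
```

With this shape the event can still be emitted for the NULL case, and the -1 value marks records that had no page.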

[PATCH] mm: change may_enter_fs check condition

2015-11-13 Thread yalin wang
Add a page_is_file_cache() check to the __GFP_FS test. Otherwise a
PageSwapCache() && PageDirty() page can always be written back when the
gfp mask contains __GFP_FS, which is not the expected behavior.

Signed-off-by: yalin wang 
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bd2918e..f8fc8c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -930,7 +930,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
if (page_mapped(page) || PageSwapCache(page))
sc->nr_scanned++;
 
-   may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
+   may_enter_fs = (page_is_file_cache(page) && (sc->gfp_mask & __GFP_FS)) ||
(PageSwapCache(page) && (sc->gfp_mask & __GFP_IO));
 
/*
-- 
1.9.1

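To make the change in the condition easier to see, here is a small userspace sketch of the two predicates. The flag values GFP_IO_BIT and GFP_FS_BIT are hypothetical (the kernel's real bit values differ), and the page state is reduced to two booleans that stand in for page_is_file_cache() and PageSwapCache(). It is an illustration under those assumptions, not the vmscan code itself.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; not the kernel's real bits. */
#define GFP_IO_BIT 0x1u
#define GFP_FS_BIT 0x2u

/* Before the patch: __GFP_FS alone is sufficient for any page. */
static bool may_enter_fs_old(unsigned int gfp_mask, bool file_cache,
			     bool swap_cache)
{
	(void)file_cache; /* the old condition never looks at this */
	return (gfp_mask & GFP_FS_BIT) ||
	       (swap_cache && (gfp_mask & GFP_IO_BIT));
}

/* After the patch: __GFP_FS only counts for file-cache pages. */
static bool may_enter_fs_new(unsigned int gfp_mask, bool file_cache,
			     bool swap_cache)
{
	return (file_cache && (gfp_mask & GFP_FS_BIT)) ||
	       (swap_cache && (gfp_mask & GFP_IO_BIT));
}
```

The only input where the two differ is a page that is not file cache when the mask has __GFP_FS but not __GFP_IO: the old predicate allows writeback, the new one does not.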


Re: [PATCH V4] mm: fix kernel crash in khugepaged thread

2015-11-13 Thread yalin wang

> On Nov 13, 2015, at 16:41, Hillf Danton  wrote:
> 
>> 
>> Instead of the condition, we could have:
>> 
>>  __entry->pfn = page ? page_to_pfn(page) : -1;
>> 
>> 
>> But if there's no reason to do the tracepoint if page is NULL, then
>> this patch is fine. I'm just throwing out this idea.
>> 
> we trace only if page is valid
> 
> --- linux-next/mm/huge_memory.c   Fri Nov 13 16:00:22 2015
> +++ b/mm/huge_memory.c   Fri Nov 13 16:26:19 2015
> @@ -1987,7 +1987,8 @@ static int __collapse_huge_page_isolate(
> 
> out:
>   release_pte_pages(pte, _pte);
> - trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
> + if (page)
> + trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
>   referenced, writable, result);
>   referenced, writable, result);
>   return 0;
> }
> --
> 
My V4 patch moves the if (!page) check into the trace function, so that
we don't need to call page_to_pfn() when the trace is disabled. That is
more efficient.
Thanks


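As a rough illustration of the point made in this reply, the sketch below mimics the V4 shape in plain userspace C: the caller passes the raw pointer, and both the NULL test and the pfn lookup happen only after the "is tracing enabled" check. Everything here is a hypothetical mock (struct page with a pfn field, trace_enabled, the counters, sample_page); the real tracepoint machinery uses static keys and TP_CONDITION rather than plain variables.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page that simply stores its pfn. */
struct page {
	unsigned long pfn;
};

static struct page sample_page = { 42 };

static bool trace_enabled;      /* stands in for the tracepoint enable switch */
static int pfn_lookups;         /* how often page_to_pfn() actually ran */
static int events_recorded;     /* how many events were written */
static unsigned long last_pfn;  /* pfn stored by the most recent event */

/* Mock conversion; like the real one it must not be handed NULL. */
static unsigned long page_to_pfn(const struct page *page)
{
	pfn_lookups++;
	return page->pfn;
}

/*
 * Shape of the V4 approach: the enabled check comes first, then the
 * condition (the role TP_CONDITION(page) plays), then the conversion.
 */
static void trace_isolate(const struct page *page)
{
	if (!trace_enabled)
		return;
	if (!page)
		return;
	last_pfn = page_to_pfn(page);
	events_recorded++;
}
```

While trace_enabled is false, neither a valid nor a NULL page causes a lookup. Once it is true, a NULL page is skipped, which is exactly the behaviour the reviewers questioned, since the NULL case then never appears in the trace.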


[PATCH V6] mm: fix kernel crash in khugepaged thread

2015-11-13 Thread yalin wang
This crash is caused by a NULL pointer dereference in the page_to_pfn() macro
when page == NULL:

[  182.639154 ] Unable to handle kernel NULL pointer dereference at virtual address
[  182.639491 ] pgd = ffc00077a000
[  182.639761 ] [] *pgd=b9422003, *pud=b9422003, *pmd=b9423003, *pte=006008000707
[  182.640749 ] Internal error: Oops: 9406 [#1] SMP
[  182.641197 ] Modules linked in:
[  182.641580 ] CPU: 1 PID: 26 Comm: khugepaged Tainted: GW   4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3
[  182.642077 ] Hardware name: linux,dummy-virt (DT)
[  182.642227 ] task: ffc07957c080 ti: ffc079638000 task.ti: ffc079638000
[  182.642598 ] PC is at khugepaged+0x378/0x1af8
[  182.642826 ] LR is at khugepaged+0x418/0x1af8
[  182.643047 ] pc : [] lr : [] pstate: 6145
[  182.643490 ] sp : ffc07963bca0
[  182.643650 ] x29: ffc07963bca0 x28: ffc00075c000
[  182.644024 ] x27: ffc00f275040 x26: ffc0006c7000
[  182.644334 ] x25: 00e848800f51 x24: 0640
[  182.644687 ] x23: 0002 x22: 
[  182.644972 ] x21:  x20: 
[  182.645446 ] x19:  x18: 007ff86d0990
[  182.645931 ] x17: 007ef9c8 x16: ffc98390
[  182.646236 ] x15:  x14: 
[  182.646649 ] x13: 016a x12: 
[  182.647046 ] x11: ffc07f025020 x10: 
[  182.647395 ] x9 : 0048 x8 : ffc000721e28
[  182.647872 ] x7 :  x6 : ffc07f02d000
[  182.648261 ] x5 : fe00 x4 : ffc00f275040
[  182.648611 ] x3 :  x2 : ffc00f2ad000
[  182.648908 ] x1 :  x0 : ffc000727000
[  182.649147 ]
[  182.649252 ] Process khugepaged (pid: 26, stack limit = 0xffc079638020)
[  182.649724 ] Stack: (0xffc07963bca0 to 0xffc07963c000)
[  182.650141 ] bca0: ffc07963be30 ffcb5044 ffc07961fb80 ffc00072e630
[  182.650587 ] bcc0: ffc0005d5090  ffc000197d34
[  182.651009 ] bce0:
[  182.651446 ] bd00: ffc07963bd90 ffc07f1cbf80 4f3be003 ffc00f2750a4
[  182.651956 ] bd20: ffc00f3bf000 ffc1 0001 ffc07f085740
[  182.652520 ] bd40: ffc00f2ad188 ffc0 0620 ffc00f275040
[  182.652972 ] bd60: ffc0006b1a90 ffc079638000 ffc07963be20 ffc00f0144d0
[  182.653357 ] bd80: ffc0 0640 ffc00f0144d0 0a080001
[  182.653793 ] bda0: 1001 ffc1 ffc07f025000 ffc00f2750a8
[  182.654226 ] bdc0: 000105f8 ffc00075a000 06a0 ffc000727000
[  182.654522 ] bde0: ffc0006e8478 ffc0 0001 ffc078fb9000
[  182.654869 ] be00: ffc07963be30 ffc0 ffc07957c080 ffccfc4c
[  182.655225 ] be20: ffc07963be20 ffc07963be20  ffc85c50
[  182.655588 ] be40: ffcb4f64 ffc07961fb80
[  182.656138 ] be60:  ffcbee2c ffcb4f64
[  182.656609 ] be80:
[  182.657145 ] bea0: ffc07963bea0 ffc07963bea0  ffc0
[  182.657475 ] bec0: ffc07963bec0 ffc07963bec0
[  182.657922 ] bee0:
[  182.658558 ] bf00:
[  182.658972 ] bf20:
[  182.659291 ] bf40:
[  182.659722 ] bf60:
[  182.660122 ] bf80:
[  182.660654 ] bfa0:
[  182.661064 ] bfc0:    0005
[  182.661466 ] bfe0:
[  182.661848 ] Call trace:
[  182.662050 ] [] khugepaged+0x378/0x1af8
[  182.662294 ] [] kthread+0xdc/0xf4
[  182.662605 ] [] ret_from_fork+0xc/0x40
[  182.663046 ] Code: 35001700 f0002c60 aa0703e3 f9009fa0 (f94000e0)
[  182.663901 ] ---[ end trace 637503d8e28ae69e  ]---
[  182.664160 ] Kernel panic - not syncing: Fatal exception
[  182.664571 ] CPU2: stopping
[  182.664794 ] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G  D W   4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3
[  182.665248 ] Hardware name: linux,dummy-virt (DT)

Signed-off-by: yalin wang <yalin.wa

[PATCH V5] mm: fix kernel crash in khugepaged thread

2015-11-13 Thread yalin wang
 trace NULL page.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 include/trace/events/huge_memory.h | 12 ++++++------
 mm/huge_memory.c                   |  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 11c59ca..bfcf4a1 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -47,10 +47,10 @@ SCAN_STATUS
 
 TRACE_EVENT(mm_khugepaged_scan_pmd,
 
-   TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
+   TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
 bool referenced, int none_or_zero, int status, int unmapped),
 
-   TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
+   TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),
 
TP_STRUCT__entry(
__field(struct mm_struct *, mm)
@@ -64,7 +64,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
 
TP_fast_assign(
__entry->mm = mm;
-   __entry->pfn = pfn;
+   __entry->pfn = page ? page_to_pfn(page) : -1;
__entry->writable = writable;
__entry->referenced = referenced;
__entry->none_or_zero = none_or_zero;
@@ -108,10 +108,10 @@ TRACE_EVENT(mm_collapse_huge_page,
 
 TRACE_EVENT(mm_collapse_huge_page_isolate,
 
-   TP_PROTO(unsigned long pfn, int none_or_zero,
+   TP_PROTO(struct page *page, int none_or_zero,
 bool referenced, bool  writable, int status),
 
-   TP_ARGS(pfn, none_or_zero, referenced, writable, status),
+   TP_ARGS(page, none_or_zero, referenced, writable, status),
 
TP_STRUCT__entry(
__field(unsigned long, pfn)
@@ -122,7 +122,7 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
),
 
TP_fast_assign(
-   __entry->pfn = pfn;
+   __entry->pfn = page ? page_to_pfn(page) : -1;
__entry->none_or_zero = none_or_zero;
__entry->referenced = referenced;
__entry->writable = writable;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 67b00a1..fb3c4f8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1977,7 +1977,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
if (likely(writable)) {
if (likely(referenced)) {
result = SCAN_SUCCEED;
-   trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+   trace_mm_collapse_huge_page_isolate(page, none_or_zero,
referenced, writable, result);
return 1;
}
@@ -1987,7 +1987,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 
 out:
release_pte_pages(pte, _pte);
-   trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+   trace_mm_collapse_huge_page_isolate(page, none_or_zero,
referenced, writable, result);
return 0;
 }
@@ -2530,7 +2530,7 @@ out_unmap:
collapse_huge_page(mm, address, hpage, vma, node);
}
 out:
-   trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
+   trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
 none_or_zero, result, unmapped);
return ret;
 }
-- 
1.9.1



[PATCH V2] mmc: change to use kmalloc

2015-11-12 Thread yalin wang
Use kmalloc() instead of kzalloc(); zeroing the memory is not needed.

Signed-off-by: yalin wang 
---
 drivers/mmc/card/block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index c742cfd..c3fd4c8 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -345,7 +345,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user(
struct mmc_blk_ioc_data *idata;
int err;
 
-   idata = kzalloc(sizeof(*idata), GFP_KERNEL);
+   idata = kmalloc(sizeof(*idata), GFP_KERNEL);
if (!idata) {
err = -ENOMEM;
goto out;
@@ -365,7 +365,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user(
if (!idata->buf_bytes)
return idata;
 
-   idata->buf = kzalloc(idata->buf_bytes, GFP_KERNEL);
+   idata->buf = kmalloc(idata->buf_bytes, GFP_KERNEL);
if (!idata->buf) {
err = -ENOMEM;
goto idata_err;
-- 
1.9.1

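A userspace analogue of the argument for the buffer allocation, as a sketch only: malloc() plays the role of kmalloc(), calloc() would play the role of kzalloc(), and memcpy() stands in for copy_from_user(). The function name copy_in() is made up for the example. When every byte of the buffer is overwritten right after allocation, zeroing it first is wasted work.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Allocate a buffer and fill all of it from src. Because memcpy()
 * writes every one of the len bytes, the allocation does not need to
 * be zeroed beforehand, so malloc() is enough and calloc() would only
 * add a redundant clearing pass.
 */
static unsigned char *copy_in(const unsigned char *src, size_t len)
{
	unsigned char *buf = malloc(len); /* contents are not initialized */

	if (!buf)
		return NULL;
	memcpy(buf, src, len);            /* overwrites the whole buffer */
	return buf;
}
```

The same reasoning holds for a structure only if every field is assigned before it is read. That is the error-prone case raised in review, and it is not what this sketch shows.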


Re: [PATCH] mmc: change to use kmalloc

2015-11-12 Thread yalin wang

> On Nov 12, 2015, at 05:57, Andy Shevchenko  wrote:
> 
> On Wed, Nov 11, 2015 at 11:17 PM, Peter Hurley wrote:
>> On 11/11/2015 12:02 PM, Alim Akhtar wrote:
>>> Hi Yalin,
>>> 
>>> On Wed, Nov 11, 2015 at 9:53 AM, yalin wang wrote:
>>>> Use kmalloc() instead of kzalloc(); zeroing the memory is not needed.
>>>> 
>>> Why do you want to do this? What problem did you face, and how does this
>>> resolve it?
>> 
>> The patch fixes an inefficiency: explicitly zeroing memory that is then
>> immediately overwritten 6 lines below is wasteful.
> 
> It might fix previous kzalloc as well, though better not to do since
> it's error prone.
> 
Yeah, I will send a new patch.

>> 
>> Regards,
>> Peter Hurley
>> 
>>>> Signed-off-by: yalin wang 
>>>> ---
>>>> drivers/mmc/card/block.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>> 
>>>> diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
>>>> index 23b6c8e..975cd3e 100644
>>>> --- a/drivers/mmc/card/block.c
>>>> +++ b/drivers/mmc/card/block.c
>>>> @@ -365,7 +365,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user(
>>>>if (!idata->buf_bytes)
>>>>return idata;
>>>> 
>>>> -   idata->buf = kzalloc(idata->buf_bytes, GFP_KERNEL);
>>>> +   idata->buf = kmalloc(idata->buf_bytes, GFP_KERNEL);
>>>>if (!idata->buf) {
>>>>err = -ENOMEM;
>>>>goto idata_err;
>>>> --
>>>> 1.9.1
>> 
> 
> 
> 
> -- 
> With Best Regards,
> Andy Shevchenko



[PATCH V4] mm: fix kernel crash in khugepaged thread

2015-11-12 Thread yalin wang
 trace NULL page.

Signed-off-by: yalin wang 
---
 include/trace/events/huge_memory.h | 20 ++++++++++++--------
 mm/huge_memory.c                   |  6 +++---
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 11c59ca..727647b 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -45,12 +45,14 @@ SCAN_STATUS
 #define EM(a, b)   {a, b},
 #define EMe(a, b)  {a, b}
 
-TRACE_EVENT(mm_khugepaged_scan_pmd,
+TRACE_EVENT_CONDITION(mm_khugepaged_scan_pmd,
 
-   TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
+   TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
 bool referenced, int none_or_zero, int status, int unmapped),
 
-   TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
+   TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),
+
+   TP_CONDITION(page),
 
TP_STRUCT__entry(
__field(struct mm_struct *, mm)
@@ -64,7 +66,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
 
TP_fast_assign(
__entry->mm = mm;
-   __entry->pfn = pfn;
+   __entry->pfn = page_to_pfn(page);
__entry->writable = writable;
__entry->referenced = referenced;
__entry->none_or_zero = none_or_zero;
@@ -106,12 +108,14 @@ TRACE_EVENT(mm_collapse_huge_page,
__print_symbolic(__entry->status, SCAN_STATUS))
 );
 
-TRACE_EVENT(mm_collapse_huge_page_isolate,
+TRACE_EVENT_CONDITION(mm_collapse_huge_page_isolate,
 
-   TP_PROTO(unsigned long pfn, int none_or_zero,
+   TP_PROTO(struct page *page, int none_or_zero,
 bool referenced, bool  writable, int status),
 
-   TP_ARGS(pfn, none_or_zero, referenced, writable, status),
+   TP_ARGS(page, none_or_zero, referenced, writable, status),
+
+   TP_CONDITION(page),
 
TP_STRUCT__entry(
__field(unsigned long, pfn)
@@ -122,7 +126,7 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
),
 
TP_fast_assign(
-   __entry->pfn = pfn;
+   __entry->pfn = page_to_pfn(page);
__entry->none_or_zero = none_or_zero;
__entry->referenced = referenced;
__entry->writable = writable;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 67b00a1..fb3c4f8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1977,7 +1977,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
if (likely(writable)) {
if (likely(referenced)) {
result = SCAN_SUCCEED;
-   trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+   trace_mm_collapse_huge_page_isolate(page, none_or_zero,
referenced, writable, result);
return 1;
}
@@ -1987,7 +1987,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 
 out:
release_pte_pages(pte, _pte);
-   trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero,
+   trace_mm_collapse_huge_page_isolate(page, none_or_zero,
referenced, writable, result);
return 0;
 }
@@ -2530,7 +2530,7 @@ out_unmap:
collapse_huge_page(mm, address, hpage, vma, node);
}
 out:
-   trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
+   trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
 none_or_zero, result, unmapped);
return ret;
 }
-- 
1.9.1



[PATCH] mm: change mm_vmscan_lru_shrink_inactive() proto types

2015-11-11 Thread yalin wang
Move the node_id, zone_idx and shrink flags computation into the trace
function, so that we don't need to calculate these arguments if the trace
is disabled. This also gives the function fewer arguments.

Signed-off-by: yalin wang 
---
 include/trace/events/vmscan.h | 14 +++++++-------
 mm/vmscan.c                   |  7 ++-----
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index dae7836..f8d6b34 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -352,11 +352,11 @@ TRACE_EVENT(mm_vmscan_writepage,
 
 TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 
-   TP_PROTO(int nid, int zid,
-   unsigned long nr_scanned, unsigned long nr_reclaimed,
-   int priority, int reclaim_flags),
+   TP_PROTO(struct zone *zone,
+   unsigned long nr_scanned, unsigned long nr_reclaimed,
+   int priority, int file),
 
-   TP_ARGS(nid, zid, nr_scanned, nr_reclaimed, priority, reclaim_flags),
+   TP_ARGS(zone, nr_scanned, nr_reclaimed, priority, file),
 
TP_STRUCT__entry(
__field(int, nid)
@@ -368,12 +368,12 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
),
 
TP_fast_assign(
-   __entry->nid = nid;
-   __entry->zid = zid;
+   __entry->nid = zone->zone_pgdat->node_id;
+   __entry->zid = zone_idx(zone);
__entry->nr_scanned = nr_scanned;
__entry->nr_reclaimed = nr_reclaimed;
__entry->priority = priority;
-   __entry->reclaim_flags = reclaim_flags;
+   __entry->reclaim_flags = trace_shrink_flags(file);
),
 
TP_printk("nid=%d zid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 83cea53..bd2918e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1691,11 +1691,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
current_may_throttle())
wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);
 
-   trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id,
-   zone_idx(zone),
-   nr_scanned, nr_reclaimed,
-   sc->priority,
-   trace_shrink_flags(file));
+   trace_mm_vmscan_lru_shrink_inactive(zone, nr_scanned, nr_reclaimed,
+   sc->priority, file);
return nr_reclaimed;
 }
 
-- 
1.9.1
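
The commit message above wants the derived values to be computed only when the event is actually recorded. The user-space sketch below illustrates that pattern under stated assumptions: zone_sketch, trace_enabled, shrink_flags_sketch() and trace_shrink_inactive_sketch() are hypothetical stand-ins, not kernel symbols, and a plain bool replaces whatever mechanism the kernel uses to switch a tracepoint on and off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct zone; not the kernel definition. */
struct zone_sketch {
    int node_id;
    int zone_index;
};

static bool trace_enabled;     /* stands in for the tracepoint on/off switch */
static int flag_computations;  /* counts how often the flags helper ran */
static int events_recorded;    /* counts events that were actually emitted */

/* Plays the role of trace_shrink_flags(file). */
static int shrink_flags_sketch(int file)
{
    flag_computations++;
    return file ? 0x2 : 0x1;
}

/*
 * The caller passes raw inputs (zone, file); node id, zone index and
 * flags are derived here, and only when tracing is enabled.
 */
static void trace_shrink_inactive_sketch(const struct zone_sketch *zone,
                                         unsigned long nr_scanned,
                                         unsigned long nr_reclaimed,
                                         int priority, int file)
{
    if (!trace_enabled)
        return; /* disabled: nothing is derived */

    assert(zone != NULL);
    printf("nid=%d zid=%d nr_scanned=%lu nr_reclaimed=%lu priority=%d flags=%#x\n",
           zone->node_id, zone->zone_index, nr_scanned, nr_reclaimed,
           priority, (unsigned int)shrink_flags_sketch(file));
    events_recorded++;
}
```

With trace_enabled left false, a call to trace_shrink_inactive_sketch() returns before shrink_flags_sketch() runs, so flag_computations stays at zero. The trade-off is that the event now takes a pointer that must be valid whenever tracing is on, instead of two plain integers.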



[PATCH] mm: change trace_mm_vmscan_writepage() proto type

2015-11-11 Thread yalin wang
Move trace_reclaim_flags() into the trace function,
so that we don't need to calculate these flags if the trace is disabled.

Signed-off-by: yalin wang 
---
 include/trace/events/vmscan.h | 7 +++
 mm/vmscan.c   | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index f66476b..dae7836 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -330,10 +330,9 @@ DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_memcg_isolate,
 
 TRACE_EVENT(mm_vmscan_writepage,
 
-   TP_PROTO(struct page *page,
-   int reclaim_flags),
+   TP_PROTO(struct page *page),
 
-   TP_ARGS(page, reclaim_flags),
+   TP_ARGS(page),
 
TP_STRUCT__entry(
__field(unsigned long, pfn)
@@ -342,7 +341,7 @@ TRACE_EVENT(mm_vmscan_writepage,
 
TP_fast_assign(
__entry->pfn = page_to_pfn(page);
-   __entry->reclaim_flags = reclaim_flags;
+   __entry->reclaim_flags = trace_reclaim_flags(page);
),
 
TP_printk("page=%p pfn=%lu flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a4507ec..83cea53 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -594,7 +594,7 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
/* synchronous write or broken a_ops? */
ClearPageReclaim(page);
}
-   trace_mm_vmscan_writepage(page, trace_reclaim_flags(page));
+   trace_mm_vmscan_writepage(page);
inc_zone_page_state(page, NR_VMSCAN_WRITE);
return PAGE_SUCCESS;
}
-- 
1.9.1



Re: [PATCH V2] mm: fix kernel crash in khugepaged thread

2015-11-11 Thread yalin wang
OK,
I will send a V3 patch.
> On Nov 5, 2015, at 16:50, Kirill A. Shutemov  wrote:
> 
> On Thu, Nov 05, 2015 at 09:12:34AM +0100, Vlastimil Babka wrote:
>> On 10/29/2015 01:35 AM, Kirill A. Shutemov wrote:
>>>> @@ -2605,9 +2603,9 @@ out_unmap:
>>>>     /* collapse_huge_page will return with the mmap_sem released */
>>>>     collapse_huge_page(mm, address, hpage, vma, node);
>>>>     }
>>>> -out:
>>>> -  trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
>>>> -   none_or_zero, result, unmapped);
>>>> +  trace_mm_khugepaged_scan_pmd(mm, pte_present(pteval) ?
>>>> +  pte_pfn(pteval) : -1, writable, referenced,
>>>> +  none_or_zero, result, unmapped);
>>> 
>>> maybe passing down pte instead of pfn?
>> 
>> Maybe just pass the page, and have tracepoint's fast assign check for !NULL and
>> do page_to_pfn itself? That way the complexity and overhead is only in the
>> tracepoint and when enabled.
> 
> Agreed.
> 
> -- 
> Kirill A. Shutemov



[PATCH V3] mm: fix kernel crash in khugepaged thread

2015-11-11 Thread yalin wang
 trace NULL page.

Signed-off-by: yalin wang 
---
 include/trace/events/huge_memory.h | 10 ++
 mm/huge_memory.c   |  2 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 11c59ca..369d912 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -45,12 +45,14 @@ SCAN_STATUS
 #define EM(a, b)   {a, b},
 #define EMe(a, b)  {a, b}
 
-TRACE_EVENT(mm_khugepaged_scan_pmd,
+TRACE_EVENT_CONDITION(mm_khugepaged_scan_pmd,
 
-   TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
+   TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
 bool referenced, int none_or_zero, int status, int unmapped),
 
-   TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
+   TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),
+
+   TP_CONDITION(page),
 
TP_STRUCT__entry(
__field(struct mm_struct *, mm)
@@ -64,7 +66,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
 
TP_fast_assign(
__entry->mm = mm;
-   __entry->pfn = pfn;
+   __entry->pfn = page_to_pfn(page);
__entry->writable = writable;
__entry->referenced = referenced;
__entry->none_or_zero = none_or_zero;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 67b00a1..ff2b105 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2530,7 +2530,7 @@ out_unmap:
collapse_huge_page(mm, address, hpage, vma, node);
}
 out:
-   trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
+   trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
 none_or_zero, result, unmapped);
return ret;
 }
-- 
1.9.1
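
The patch above guards the event with TP_CONDITION(page), so the assignment that calls page_to_pfn(page) is never reached with a NULL page. A minimal user-space sketch of that guard follows; page_sketch, page_to_pfn_sketch() and trace_scan_pmd_sketch() are hypothetical names chosen for illustration, and the explicit if stands in for the condition check that the macro generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page. */
struct page_sketch {
    unsigned long pfn;
};

static int events_recorded; /* counts events that were actually emitted */

/* Plays the role of page_to_pfn(); it must never be handed a NULL page. */
static unsigned long page_to_pfn_sketch(const struct page_sketch *page)
{
    assert(page != NULL);
    return page->pfn;
}

/* Mirrors TP_CONDITION(page): the event body runs only for a non-NULL page. */
static void trace_scan_pmd_sketch(const struct page_sketch *page, int status)
{
    if (!page)
        return; /* condition false: the event is dropped */

    printf("scan_pmd: pfn=%lu status=%d\n", page_to_pfn_sketch(page), status);
    events_recorded++;
}
```

One consequence of this design is that a scan which ends without a page produces no event at all, rather than an event carrying a placeholder pfn.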



[PATCH] mmc: change to use kmalloc

2015-11-10 Thread yalin wang
Use kmalloc instead of kzalloc; zeroing the memory is not needed.

Signed-off-by: yalin wang 
---
 drivers/mmc/card/block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 23b6c8e..975cd3e 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -365,7 +365,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user(
if (!idata->buf_bytes)
return idata;
 
-   idata->buf = kzalloc(idata->buf_bytes, GFP_KERNEL);
+   idata->buf = kmalloc(idata->buf_bytes, GFP_KERNEL);
if (!idata->buf) {
err = -ENOMEM;
goto idata_err;
-- 
1.9.1
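
The commit message does not say why zeroing is unnecessary. One plausible reading, which is an assumption here because the hunk above does not show the code that follows the allocation, is that the whole buffer is overwritten before anything reads it. The user-space sketch below shows that situation, with malloc() standing in for kmalloc() and memcpy() standing in for the copy that fills the buffer; copy_request_sketch() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate len bytes without zeroing them, then fill all len bytes
 * from src. Returns NULL if the allocation fails.
 */
static unsigned char *copy_request_sketch(const unsigned char *src, size_t len)
{
    unsigned char *buf = malloc(len); /* contents are indeterminate here */

    if (!buf)
        return NULL;

    memcpy(buf, src, len); /* every byte is overwritten before any read */
    return buf;
}
```

Skipping the zeroing is only safe while the copy covers the full length. If the copy could fill fewer than len bytes, the remaining bytes would hold stale heap contents, and a zeroing allocator would be the safer choice.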



Re: [V3 PATCH] arm64: remove redundant FRAME_POINTER kconfig option and force to select it

2015-11-10 Thread yalin wang

> On Nov 10, 2015, at 19:35, Catalin Marinas  wrote:
> 
> On Tue, Nov 10, 2015 at 07:09:00PM +0800, yalin wang wrote:
>>> On Nov 10, 2015, at 18:37, Catalin Marinas  wrote:
>>> 
>>> On Mon, Nov 09, 2015 at 10:09:55AM -0800, Yang Shi wrote:
>>>> FRAME_POINTER  is defined in lib/Kconfig.debug, it is unnecessary to 
>>>> redefine
>>>> it in arch/arm64/Kconfig.debug. Actually, the one defined in arm64 
>>>> directory
>>>> is never used.
>>> 
>>> That's not true since the arm64 definition seems to take precedence.
>>> 
>>>> This adds a dependency on DEBUG_KERNEL for building with frame pointers.
>>> 
>>> It doesn't because arm64 selects ARCH_WANT_FRAME_POINTERS.
>>> 
>>>> ARM64 depends on frame pointer to get correct stack backtrace and need
>>>> FRAME_POINTER kconfig option enabled all the time.
>>>> However, currect implementation makes it could be disabled, so force it
>>>> to be selected by ARM64.
>>>> 
>>>> Signed-off-by: Yang Shi 
>>> 
>>> Patch applied but I changed the commit log slightly. Thanks.
>> i have a question,
>> why FRAME_POINTER  config must be enabled ?
>> and i see ARM arch can  disable this config .
>> if i don’t need stack trace dump and the software release is for 
>> final product , don’t need debug stack trace log .
>> is it possible to disable it for performance reason ?
> 
> If you don't need any stack trace, perf etc., in theory you can disable
> the option. However, the aarch64 gcc compiler always generates it (I'm
> not sure whether the AAPCS mandates it). Anyway, the performance impact
> is very small since there are more general purpose registers available
> in AArch64 already.
> 
I just ran a test with -fomit-frame-pointer, and it seems gcc can generate code
without a frame pointer, as on the ARM arch.
Version:
aarch64-linux-gnu-gcc
gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09)

Why doesn't AArch64 have frame unwind info just like the ARM arch?

Thanks
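
For readers following the thread's point that ARM64 relies on the frame pointer for backtraces: an AArch64 frame record is a pair of values, the caller's saved frame pointer followed by the saved link register, and a backtrace follows that chain. The sketch below walks a hand-built chain in user space. It does not read a real stack, walk_frames_sketch() is a hypothetical name, and the kernel's actual unwinder does more checking than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of an AArch64 frame record: saved caller FP, then saved LR. */
struct frame_record {
    const struct frame_record *prev_fp;
    uintptr_t lr;
};

/*
 * Walk the chain starting at fp and store up to max return addresses
 * in out. Returns the number of entries written. A NULL prev_fp ends
 * the chain.
 */
static size_t walk_frames_sketch(const struct frame_record *fp,
                                 uintptr_t *out, size_t max)
{
    size_t n = 0;

    while (fp && n < max) {
        out[n++] = fp->lr;
        fp = fp->prev_fp;
    }
    return n;
}
```

If a function is compiled without a frame record, its entry is simply missing from the chain, which is why code built with -fomit-frame-pointer cannot be unwound this way.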










Re: [PATCH] arm64: add HAVE_LATENCYTOP_SUPPORT config

2015-11-10 Thread yalin wang

> On Nov 10, 2015, at 19:18, Will Deacon  wrote:
> 
> Ha, so it does! Patch below. The only non-trivial part was arch/arm/,
> which has a dependency on !SMP which I believe is no longer required
> as of d5996b2ff0e2 ("ARM: fix /proc/$PID/stack on SMP").
> 
> Will
> 
> --->8
> 
> From 8dfb40e92ac322cbd68bf9f16cbb11fc5e210269 Mon Sep 17 00:00:00 2001
> From: Will Deacon 
> Date: Tue, 10 Nov 2015 11:10:04 +
> Subject: [PATCH] Kconfig: remove HAVE_LATENCYTOP_SUPPORT
> 
> As illustrated by a3afe70b83fd ("[S390] latencytop s390 support."),
> HAVE_LATENCYTOP_SUPPORT is defined by an architecture to advertise an
> implementation of save_stack_trace_tsk.
> 
> However, as of 9212ddb5eada ("stacktrace: provide save_stack_trace_tsk()
> weak alias") a dummy implementation is provided if STACKTRACE=y.
> Given that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects
> STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether.
> 
> Signed-off-by: Will Deacon 
> ---
> arch/arc/Kconfig| 3 ---
> arch/arm/Kconfig| 5 -
> arch/metag/Kconfig  | 3 ---
> arch/microblaze/Kconfig | 3 ---
> arch/parisc/Kconfig | 3 ---
> arch/powerpc/Kconfig| 3 ---
> arch/s390/Kconfig   | 3 ---
> arch/sh/Kconfig | 3 ---
> arch/sparc/Kconfig  | 4 
> arch/unicore32/Kconfig  | 3 ---
> arch/x86/Kconfig| 3 ---
> lib/Kconfig.debug   | 1 -
> 12 files changed, 37 deletions(-)
> 
> diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
> index 2c2ac3f3ff80..6dc312fd6480 100644
> --- a/arch/arc/Kconfig
> +++ b/arch/arc/Kconfig
> @@ -73,9 +73,6 @@ config STACKTRACE_SUPPORT
>   def_bool y
>   select STACKTRACE
> 
> -config HAVE_LATENCYTOP_SUPPORT
> - def_bool y
> -
> config HAVE_ARCH_TRANSPARENT_HUGEPAGE
>   def_bool y
>   depends on ARC_MMU_V4
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 35854e8d97ff..94eff0c6b0f8 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -162,11 +162,6 @@ config STACKTRACE_SUPPORT
>   bool
>   default y
> 
> -config HAVE_LATENCYTOP_SUPPORT
> - bool
> - depends on !SMP
> - default y
> -
> config LOCKDEP_SUPPORT
>   bool
>   default y
> diff --git a/arch/metag/Kconfig b/arch/metag/Kconfig
> index 0b389a81c43a..a0fa88da3e31 100644
> --- a/arch/metag/Kconfig
> +++ b/arch/metag/Kconfig
> @@ -36,9 +36,6 @@ config STACKTRACE_SUPPORT
> config LOCKDEP_SUPPORT
>   def_bool y
> 
> -config HAVE_LATENCYTOP_SUPPORT
> - def_bool y
> -
> config RWSEM_GENERIC_SPINLOCK
>   def_bool y
> 
> diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
> index 0bce820428fc..5ecd0287a874 100644
> --- a/arch/microblaze/Kconfig
> +++ b/arch/microblaze/Kconfig
> @@ -67,9 +67,6 @@ config STACKTRACE_SUPPORT
> config LOCKDEP_SUPPORT
>   def_bool y
> 
> -config HAVE_LATENCYTOP_SUPPORT
> - def_bool y
> -
> source "init/Kconfig"
> 
> source "kernel/Kconfig.freezer"
> diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
> index c36546959e86..16276d505cd6 100644
> --- a/arch/parisc/Kconfig
> +++ b/arch/parisc/Kconfig
> @@ -79,9 +79,6 @@ config TIME_LOW_RES
>   depends on SMP
>   default y
> 
> -config HAVE_LATENCYTOP_SUPPORT
> -def_bool y
> -
> # unless you want to implement ACPI on PA-RISC ... ;-)
> config PM
>   bool
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index db49e0d796b1..89210bfdfc7a 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -47,9 +47,6 @@ config STACKTRACE_SUPPORT
>   bool
>   default y
> 
> -config HAVE_LATENCYTOP_SUPPORT
> - def_bool y
> -
> config TRACE_IRQFLAGS_SUPPORT
>   bool
>   default y
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 3a55f493c7da..69e22b502d09 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -10,9 +10,6 @@ config LOCKDEP_SUPPORT
> config STACKTRACE_SUPPORT
>   def_bool y
> 
> -config HAVE_LATENCYTOP_SUPPORT
> - def_bool y
> -
> config RWSEM_GENERIC_SPINLOCK
>   bool
> 
> diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
> index d514df7e04dd..6c391a5d3e5c 100644
> --- a/arch/sh/Kconfig
> +++ b/arch/sh/Kconfig
> @@ -130,9 +130,6 @@ config STACKTRACE_SUPPORT
> config LOCKDEP_SUPPORT
>   def_bool y
> 
> -config HAVE_LATENCYTOP_SUPPORT
> - def_bool y
> -
> config ARCH_HAS_ILOG2_U32
>   def_bool n
> 
> diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
> index 56442d2d7bbc..3203e42190dd 100644
> --- a/arch/sparc/Kconfig
> +++ b/arch/sparc/Kconfig
> @@ -101,10 +101,6 @@ config LOCKDEP_SUPPORT
>   bool
>   default y if SPARC64
> 
> -config HAVE_LATENCYTOP_SUPPORT
> - bool
> - default y if SPARC64
> -
> config ARCH_HIBERNATION_POSSIBLE
>   def_bool y if SPARC64
> 
> diff --git a/arch/unicore32/Kconfig b/arch/unicore32/Kconfig
> index c9faddc61100..910ed969 100644
> --- a/arch/unicore32/Kconfig
> +++ b/arch/unicore32/Kconfig
> @@ -33,9 +33,6 @@ config NO_IOPORT_MAP
> config STACKTRACE_SUPPORT
>   def_bool y
> 

Re: [V3 PATCH] arm64: remove redundant FRAME_POINTER kconfig option and force to select it

2015-11-10 Thread yalin wang

> On Nov 10, 2015, at 18:37, Catalin Marinas  wrote:
> 
> On Mon, Nov 09, 2015 at 10:09:55AM -0800, Yang Shi wrote:
>> FRAME_POINTER  is defined in lib/Kconfig.debug, it is unnecessary to redefine
>> it in arch/arm64/Kconfig.debug. Actually, the one defined in arm64 directory
>> is never used.
> 
> That's not true since the arm64 definition seems to take precedence.
> 
>> This adds a dependency on DEBUG_KERNEL for building with frame pointers.
> 
> It doesn't because arm64 selects ARCH_WANT_FRAME_POINTERS.
> 
>> ARM64 depends on frame pointer to get correct stack backtrace and need
>> FRAME_POINTER kconfig option enabled all the time.
>> However, currect implementation makes it could be disabled, so force it
>> to be selected by ARM64.
>> 
>> Signed-off-by: Yang Shi 
> 
> Patch applied but I changed the commit log slightly. Thanks.
I have a question:
why must the FRAME_POINTER config be enabled?
I see that the ARM arch can disable this config.
If I don't need stack trace dumps and the software release is for a
final product, I don't need debug stack trace logs.
Is it possible to disable it for performance reasons?

Thanks



[PATCH] mmc: change to use kmalloc

2015-11-10 Thread yalin wang
Use kmalloc instead of kzalloc, zero the memory is not needed.

Signed-off-by: yalin wang <yalin.wang2...@gmail.com>
---
 drivers/mmc/card/block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 23b6c8e..975cd3e 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -365,7 +365,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ioctl_copy_from_user(
 	if (!idata->buf_bytes)
 		return idata;
 
-	idata->buf = kzalloc(idata->buf_bytes, GFP_KERNEL);
+	idata->buf = kmalloc(idata->buf_bytes, GFP_KERNEL);
 	if (!idata->buf) {
 		err = -ENOMEM;
 		goto idata_err;
-- 
1.9.1
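
The change is only safe if every byte of the buffer is written before it is read. Below is a user-space sketch of that reasoning. `malloc`/`calloc` stand in for `kmalloc`/`kzalloc`, and `memcpy` stands in for the copy from user space that is assumed to follow the allocation (it is not shown in the hunk above). `alloc_and_fill` is a hypothetical helper.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate len bytes and fill all of them from src.
 * zeroed != 0 uses calloc (kzalloc analogue), otherwise malloc
 * (kmalloc analogue). Returns NULL if len is 0 or allocation fails.
 */
unsigned char *alloc_and_fill(const unsigned char *src, size_t len, int zeroed)
{
	unsigned char *buf;

	if (len == 0)
		return NULL;
	assert(src != NULL);
	buf = zeroed ? calloc(1, len) : malloc(len);
	if (!buf)
		return NULL;
	/* Overwrites every byte, so the previous contents do not matter. */
	memcpy(buf, src, len);
	return buf;
}
```

If the copy could stop short, or an error path exposed the buffer before it was filled, the unzeroed tail would contain stale memory, and the zeroing variant would be the safer choice.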



Re: [PATCH] arm64: add HAVE_LATENCYTOP_SUPPORT config

2015-11-06 Thread yalin wang
I just enabled it on ARM64 and it works;
I don’t see any special requirement for enabling this config.
> On Nov 7, 2015, at 00:05, Will Deacon  wrote:
> 
> On Fri, Nov 06, 2015 at 11:57:58PM +0800, yalin wang wrote:
>> Add HAVE_LATENCYTOP_SUPPORT in Kconfig, so that
>> we can enable this feature on ARM64
> 
> Do you know what the prerequisites for HAVE_LATENCYTOP_SUPPORT actually
> are (beyond those explicitly listed as dependencies for CONFIG_LATENCYTOP)?
> 
> Will



[PATCH] arm64: add HAVE_LATENCYTOP_SUPPORT config

2015-11-06 Thread yalin wang
Add HAVE_LATENCYTOP_SUPPORT to Kconfig so that
LATENCYTOP can be enabled on ARM64.

Signed-off-by: yalin wang 
---
 arch/arm64/Kconfig | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 851fe11..782b5bd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -103,6 +103,10 @@ config ARCH_PHYS_ADDR_T_64BIT
 config MMU
 	def_bool y
 
+config HAVE_LATENCYTOP_SUPPORT
+	bool
+	default y
+
 config NO_IOPORT_MAP
 	def_bool y if !PCI
 
-- 
1.9.1



[PATCH] goldfish: fix goldfish_pipe driver BUG

2015-11-06 Thread yalin wang
goldfish_pipe_read_write() should pass the buffer's physical
address to qemu so that the host can access the data correctly.
Currently the driver writes a virtual address into the address register,
so the host cannot get the correct data and the adbd daemon cannot work
in the guest.
Also disable the access_with_param() function with #if 0; it no longer
seems to be needed in goldfish_pipe_read_write().

Signed-off-by: yalin wang 
---
 drivers/platform/goldfish/goldfish_pipe.c | 56 +++
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/drivers/platform/goldfish/goldfish_pipe.c b/drivers/platform/goldfish/goldfish_pipe.c
index 55b6d7c..bdf6f11 100644
--- a/drivers/platform/goldfish/goldfish_pipe.c
+++ b/drivers/platform/goldfish/goldfish_pipe.c
@@ -112,16 +112,27 @@
 #define PIPE_WAKE_READ (1 << 1)  /* pipe can now be read from */
 #define PIPE_WAKE_WRITE(1 << 2)  /* pipe can now be written to */
 
+#ifdef CONFIG_64BIT
 struct access_params {
-   unsigned long channel;
-   u32 size;
-   unsigned long address;
-   u32 cmd;
-   u32 result;
+   uint64_t channel;   /* 0x00 */
+   uint32_t size;  /* 0x08 */
+   uint64_t address;   /* 0x0c */
+   uint32_t cmd;   /* 0x14 */
+   uint32_t result;/* 0x18 */
/* reserved for future extension */
-   u32 flags;
+   uint32_t flags; /* 0x1c */
 };
-
+#else
+struct access_params {
+   uint32_t channel;   /* 0x00 */
+   uint32_t size;  /* 0x04 */
+   uint32_t address;   /* 0x08 */
+   uint32_t cmd;   /* 0x0c */
+   uint32_t result;/* 0x10 */
+   /* reserved for future extension */
+   uint32_t flags; /* 0x14 */
+};
+#endif
 /* The global driver data. Holds a reference to the i/o page used to
  * communicate with the emulator, and a wake queue for blocked tasks
  * waiting to be awoken.
@@ -237,6 +248,7 @@ static int setup_access_params_addr(struct platform_device *pdev,
return -1;
 }
 
+#if 0
 /* A value that will not be set by qemu emulator */
 #define INITIAL_BATCH_RESULT (0xdeadbeaf)
 static int access_with_param(struct goldfish_pipe_dev *dev, const int cmd,
@@ -263,6 +275,7 @@ static int access_with_param(struct goldfish_pipe_dev *dev, const int cmd,
*status = aps->result;
return 0;
 }
+#endif
 
 /* This function is used for both reading from and writing to a given
  * pipe.
@@ -304,6 +317,8 @@ static ssize_t goldfish_pipe_read_write(struct file *filp, char __user *buffer,
 : address_end;
unsigned long  avail= next - address;
int status, wakeBit;
+   struct page *page;
+   phys_addr_t phys_addr;
 
/* Ensure that the corresponding page is properly mapped */
/* FIXME: this isn't safe or sufficient - use get_user_pages */
@@ -323,23 +338,22 @@ static ssize_t goldfish_pipe_read_write(struct file *filp, char __user *buffer,
break;
}
}
-
+   if (get_user_pages_unlocked(current, current->active_mm,
+   address, 1, !is_write, 0, &page) != 1)
+   return -EINVAL;
+   phys_addr = page_to_phys(page) + offset_in_page(address);
/* Now, try to transfer the bytes in the current page */
spin_lock_irqsave(&dev->lock, irq_flags);
-   if (access_with_param(dev, CMD_WRITE_BUFFER + cmd_offset,
-   address, avail, pipe, &status)) {
-   gf_write_ptr(pipe, dev->base + PIPE_REG_CHANNEL,
-dev->base + PIPE_REG_CHANNEL_HIGH);
-   writel(avail, dev->base + PIPE_REG_SIZE);
-   gf_write_ptr((void *)address,
-dev->base + PIPE_REG_ADDRESS,
-dev->base + PIPE_REG_ADDRESS_HIGH);
-   writel(CMD_WRITE_BUFFER + cmd_offset,
-   dev->base + PIPE_REG_COMMAND);
-   status = readl(dev->base + PIPE_REG_STATUS);
-   }
+   gf_write_ptr(pipe, dev->base + PIPE_REG_CHANNEL,
+dev->base + PIPE_REG_CHANNEL_HIGH);
+   writel(avail, dev->base + PIPE_REG_SIZE);
+   gf_write_ptr((void *)phys_addr, dev->base + PIPE_REG_ADDRESS,
+dev->base + PIPE_REG_ADDRESS_HIGH);
+   writel(CMD_WRITE_BUFFER + cmd_offset,
+   dev->base + PIPE_REG_COMMAND);
+   status = readl(dev->base + PIPE_REG_STATUS);
spin_unlock_irqrestore(&dev->lock, irq_flags);
-
+   put_page(page);
if (status &
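
The address handed to the emulator in this patch is the physical base of the pinned page plus the offset of the user address within that page. Below is a user-space sketch of just that arithmetic, assuming a 4 KiB page size. `SKETCH_PAGE_SIZE`, `sketch_offset_in_page` and `sketch_phys_addr` are hypothetical stand-ins for the kernel's `PAGE_SIZE`, `offset_in_page()` and the `page_to_phys(page) + offset_in_page(address)` expression.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Offset of a virtual address within its page. */
uint64_t sketch_offset_in_page(uint64_t vaddr)
{
	return vaddr & (SKETCH_PAGE_SIZE - 1);
}

/* Physical address = page-aligned physical base + in-page offset. */
uint64_t sketch_phys_addr(uint64_t page_phys_base, uint64_t vaddr)
{
	assert((page_phys_base & (SKETCH_PAGE_SIZE - 1)) == 0);
	return page_phys_base + sketch_offset_in_page(vaddr);
}
```

The loop appears to transfer only the bytes that lie within the current page on each iteration, so pinning one page per iteration is enough for this calculation to cover the whole transfer.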

