Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM

2017-02-13 Thread Marek Szyprowski

Hi Vinod,


On 2017-02-13 16:47, Vinod Koul wrote:

On Mon, Feb 13, 2017 at 04:32:32PM +0100, Ulf Hansson wrote:

[...]


Although, I don't know of other examples, besides the runtime PM use
case, where non-atomic channel prepare/unprepare would make sense. Do
you?

The primary ask for that has been to enable runtime_pm for drivers. It's not
a new ask, but we somehow haven't gotten around to do it.

Okay, I see.


As I said earlier, if we want to solve that problem a better idea is to
actually split the prepare as we discussed in [1]

This way we can get a non atomic descriptor allocate/prepare and release.
Yes we need to redesign the APIs to solve this, but if you guys are up for
it, I think we can do it and avoid any further round abouts :)

Adding/re-designing dma APIs is a viable option to solve the runtime PM case.

Changes would be needed for all related dma client drivers as well,
although if that's what we need to do - let's do it.

Yes, but do bear in mind that some cases do need atomic prepare. The primary
cases for DMA had that in mind and also submitting next transaction from the
callback (tasklet) context, so that won't go away.

It would help in other cases where clients know that they will not be in
atomic context so we provide additional non-atomic "allocation" followed by
prepare, so that drivers can split the work among these and people can do
runtime_pm and other things..

Thanks for sharing the details.

It seems like some dma expert really need to be heavily involved if we
ever are going to complete this work. :-)

Sure, I will help out :)

If anyone of you are in Portland next week, then we can discuss these f2f. I
will try taking a stab at the new API design next week.


I'm not going to Portland, but I hope that you will have a fruitful
discussion there.

[...]

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
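
For readers following the thread, the split discussed above (a non-atomic "allocation" step, then a prepare step that stays safe in atomic context, then a non-atomic release) can be sketched roughly as below. This is only an illustration under stated assumptions: every name here (sketch_chan, sketch_alloc_desc, sketch_prep, sketch_release_desc, in_atomic) is hypothetical, none of it is existing dmaengine API, and it is not the design that Vinod proposes to write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; none of this is dmaengine API. */
struct sketch_chan {
	int pm_usage;		/* stand-in for a runtime PM usage count */
	bool desc_allocated;
	bool desc_prepared;
};

/* Stand-in for "are we in atomic context?"; only the demo toggles it. */
static bool in_atomic;

/* Step 1, non-atomic: may sleep, so it could power up the controller. */
static int sketch_alloc_desc(struct sketch_chan *c)
{
	if (in_atomic)
		return -1;	/* refused: this step is allowed to sleep */
	c->pm_usage++;		/* where a sleeping runtime PM "get" could go */
	c->desc_allocated = true;
	return 0;
}

/* Step 2, atomic-safe: only fills in a descriptor that already exists. */
static int sketch_prep(struct sketch_chan *c)
{
	if (!c->desc_allocated)
		return -1;
	c->desc_prepared = true;
	return 0;
}

/* Step 3, non-atomic: release the descriptor and drop the PM reference. */
static int sketch_release_desc(struct sketch_chan *c)
{
	if (in_atomic || !c->desc_allocated)
		return -1;
	c->desc_allocated = false;
	c->desc_prepared = false;
	c->pm_usage--;
	return 0;
}

/* Walk through the three steps; returns the final PM count (0 if balanced). */
static int demo_split_prepare(void)
{
	struct sketch_chan c = { 0 };

	in_atomic = false;
	if (sketch_alloc_desc(&c))
		return -1;

	in_atomic = true;		/* e.g. inside a completion callback */
	if (sketch_prep(&c))
		return -2;
	if (sketch_alloc_desc(&c) == 0)
		return -3;		/* allocation must be refused here */

	in_atomic = false;
	if (sketch_release_desc(&c))
		return -4;
	return c.pm_usage;
}
```

The point of the sketch is only that clients which must submit from a callback keep an atomic prepare, while clients that know they can sleep get a place to do runtime PM work.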



Re: [PATCH v3 1/9] llist: Provide a safe version for llist_for_each

2017-02-13 Thread Huang, Ying
Byungchul Park  writes:

> Sometimes we have to dereference next field of llist node before entering
> loop because the node might be deleted or the next field might be
> modified within the loop. So this adds the safe version of llist_for_each,
> that is, llist_for_each_safe.
>
> Signed-off-by: Byungchul Park 

Reviewed-by: "Huang, Ying" 

Best Regards,
Huang, Ying

> ---
>  include/linux/llist.h | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/include/linux/llist.h b/include/linux/llist.h
> index fd4ca0b..b90c9f2 100644
> --- a/include/linux/llist.h
> +++ b/include/linux/llist.h
> @@ -105,6 +105,25 @@ static inline void init_llist_head(struct llist_head *list)
>   for ((pos) = (node); pos; (pos) = (pos)->next)
>  
>  /**
> + * llist_for_each_safe - iterate over some deleted entries of a lock-less list
> + *                       safe against removal of list entry
> + * @pos:  the &struct llist_node to use as a loop cursor
> + * @n:    another &struct llist_node to use as temporary storage
> + * @node: the first entry of deleted list entries
> + *
> + * In general, some entries of the lock-less list can be traversed
> + * safely only after being deleted from list, so start with an entry
> + * instead of list head.
> + *
> + * If being used on entries deleted from lock-less list directly, the
> + * traverse order is from the newest to the oldest added entry.  If
> + * you want to traverse from the oldest to the newest, you must
> + * reverse the order by yourself before traversing.
> + */
> +#define llist_for_each_safe(pos, n, node) \
> + for ((pos) = (node); (pos) && ((n) = (pos)->next, true); (pos) = (n))
> +
> +/**
>   * llist_for_each_entry - iterate over some deleted entries of lock-less list of given type
>   * @pos:  the type * to use as a loop cursor.
>   * @node: the fist entry of deleted list entries.


Re: [PATCH v2] tags: honor COMPILED_SOURCE with apart output directory

2017-02-13 Thread Robert Jarzmik
Robert Jarzmik  writes:

> Robert Jarzmik  writes:
>> Signed-off-by: Robert Jarzmik 
>> ---
>> Since v1: amended k expression, Marek's comments
> Hi Marek,
>
> Is this version good for you ?

Marek, could you take a look please ?

--
Robert
>> ---
>>  scripts/tags.sh | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/scripts/tags.sh b/scripts/tags.sh
>> index a2ff3388e5ea..35cb64d5211c 100755
>> --- a/scripts/tags.sh
>> +++ b/scripts/tags.sh
>> @@ -106,7 +106,8 @@ all_compiled_sources()
>>  case "$i" in
>>  *.[cS])
>>  j=${i/\.[cS]/\.o}
>> -if [ -e $j ]; then
>> +k="${j#$tree}"
>> +if [ -e $j -o -e "$k" ]; then
>>  echo $i
>>  fi
>>  ;;


RE: [tip:locking/core] refcount_t: Introduce a special purpose refcount type

2017-02-13 Thread Reshetova, Elena
> Subject: refcount: Out-of-line everything
> From: Peter Zijlstra 
> Date: Fri Feb 10 16:27:52 CET 2017
> 
> Linus asked to please make this real C code.

Perhaps a completely stupid question, but I am going to ask anyway, since
this is the only way I can learn.
What real difference does it make? Or are we talking more about style, etc.?
Is it because of size concerns?
This way it is now certainly done differently from the rest of atomic and kref...

Best Regards,
Elena.


[PATCH v3 8/9] sched: Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park 
---
 kernel/sched/core.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d01f9d0..8938125 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1773,7 +1773,7 @@ void sched_ttwu_pending(void)
 {
struct rq *rq = this_rq();
struct llist_node *llist = llist_del_all(&rq->wake_list);
-   struct task_struct *p;
+   struct task_struct *p, *t;
unsigned long flags;
struct rq_flags rf;
 
@@ -1783,17 +1783,8 @@ void sched_ttwu_pending(void)
raw_spin_lock_irqsave(&rq->lock, flags);
rq_pin_lock(rq, &rf);
 
-   while (llist) {
-   int wake_flags = 0;
-
-   p = llist_entry(llist, struct task_struct, wake_entry);
-   llist = llist_next(llist);
-
-   if (p->sched_remote_wakeup)
-   wake_flags = WF_MIGRATED;
-
-   ttwu_do_activate(rq, p, wake_flags, &rf);
-   }
+   llist_for_each_entry_safe(p, t, llist, wake_entry)
+   ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0, &rf);
 
rq_unpin_lock(rq, &rf);
raw_spin_unlock_irqrestore(&rq->lock, flags);
-- 
1.9.1
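
As an aside for readers, the reason the safe variant matters in a loop like this can be shown in a small userspace sketch. It assumes, purely for illustration, that the per-entry callback may reuse the entry's link field. The names struct task, task_of, activate, wake_all and demo_wake are hypothetical stand-ins, the open-coded loop mirrors what llist_for_each_entry_safe() is meant to do but is not the kernel macro, and the WF_MIGRATED value is only illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct llist_node. */
struct llist_node {
	struct llist_node *next;
};

#define WF_MIGRATED 0x4	/* illustrative value for this sketch */

/* Hypothetical, much reduced stand-in for struct task_struct. */
struct task {
	int id;
	int remote_wakeup;
	int wake_flags_seen;
	struct llist_node wake_entry;
};

/* Map an embedded node back to its containing task (container_of style). */
#define task_of(node) \
	((struct task *)((char *)(node) - offsetof(struct task, wake_entry)))

/* Hypothetical callback: records the flags and clobbers the link field. */
static void activate(struct task *p, int wake_flags)
{
	p->wake_flags_seen = wake_flags;
	p->wake_entry.next = NULL;	/* simulate the entry being reused */
}

/* Walk the detached chain, saving the next node before each callback. */
static int wake_all(struct llist_node *first)
{
	struct llist_node *pos, *n;
	int woken = 0;

	for (pos = first; pos; pos = n) {
		struct task *p = task_of(pos);

		n = pos->next;	/* read before activate() can modify it */
		activate(p, p->remote_wakeup ? WF_MIGRATED : 0);
		woken++;
	}
	return woken;
}

/* Two chained tasks; returns both observed flag values packed in one int. */
static int demo_wake(void)
{
	struct task a = { .id = 1, .remote_wakeup = 1, .wake_flags_seen = -1 };
	struct task b = { .id = 2, .remote_wakeup = 0, .wake_flags_seen = -1 };

	a.wake_entry.next = &b.wake_entry;
	b.wake_entry.next = NULL;

	if (wake_all(&a.wake_entry) != 2)
		return -1;
	return a.wake_flags_seen * 10 + b.wake_flags_seen;
}
```

If the loop read pos->next only after calling activate(), the walk would stop after the first task in this sketch, because the callback has already cleared the link.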



[PATCH v3 7/9] irq_work: Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park 
---
 kernel/irq_work.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index bcf107c..e2ebe8c 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -138,11 +138,7 @@ static void irq_work_run_list(struct llist_head *list)
return;
 
llnode = llist_del_all(list);
-   while (llnode != NULL) {
-   work = llist_entry(llnode, struct irq_work, llnode);
-
-   llnode = llist_next(llnode);
-
+   llist_for_each_entry(work, llnode, llnode) {
/*
 * Clear the PENDING bit, after this point the @work
 * can be re-used.
-- 
1.9.1



[PATCH v3 5/9] fput: Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park 
---
 fs/file_table.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 6d982b5..3209da2 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -231,12 +231,10 @@ static void __fput(struct file *file)
 static void delayed_fput(struct work_struct *unused)
 {
struct llist_node *node = llist_del_all(&delayed_fput_list);
-   struct llist_node *next;
+   struct file *f, *t;
 
-   for (; node; node = next) {
-   next = llist_next(node);
-   __fput(llist_entry(node, struct file, f_u.fu_llist));
-   }
+   llist_for_each_entry_safe(f, t, node, f_u.fu_llist)
+   __fput(f);
 }
 
 static void fput(struct callback_head *work)
@@ -310,7 +308,7 @@ void put_filp(struct file *file)
 }
 
 void __init files_init(void)
-{ 
+{
filp_cachep = kmem_cache_create("filp", sizeof(struct file), 0,
SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
percpu_counter_init(&nr_files, 0, GFP_KERNEL);
@@ -329,4 +327,4 @@ void __init files_maxfiles_init(void)
n = ((totalram_pages - memreserve) * (PAGE_SIZE / 1024)) / 10;
 
files_stat.max_files = max_t(unsigned long, n, NR_FILE);
-} 
+}
-- 
1.9.1



[PATCH v3 9/9] mm: Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park 
---
 mm/vmalloc.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3ca82d4..8c0eb45 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -49,12 +49,10 @@ struct vfree_deferred {
 static void free_work(struct work_struct *w)
 {
struct vfree_deferred *p = container_of(w, struct vfree_deferred, wq);
-   struct llist_node *llnode = llist_del_all(&p->list);
-   while (llnode) {
-   void *p = llnode;
-   llnode = llist_next(llnode);
-   __vunmap(p, 1);
-   }
+   struct llist_node *t, *llnode;
+
+   llist_for_each_safe(llnode, t, llist_del_all(&p->list))
+   __vunmap((void *)llnode, 1);
 }
 
 /*** Page table manipulation functions ***/
-- 
1.9.1
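
A note for readers on why this particular loop needs the safe iterator: as the (void *) cast in the patch suggests, the list node here is the start of the region being released, so the link must be read before the release. The userspace sketch below illustrates that pattern under simplifying assumptions. It is single-threaded, a plain pointer stands in for the lock-less list head, free() stands in for __vunmap(), and defer_free, drain_deferred and demo_deferred_free are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct llist_node. */
struct llist_node {
	struct llist_node *next;
};

/* Same shape as the macro added in patch 1/9. */
#define llist_for_each_safe(pos, n, node) \
	for ((pos) = (node); (pos) && ((n) = (pos)->next, true); (pos) = (n))

/* Hypothetical deferred-free queue; a plain pointer, not a real llist_head. */
static struct llist_node *deferred;

/* Queue a buffer for later release, reusing its first bytes as the link. */
static void defer_free(void *buf)
{
	struct llist_node *node = buf;

	node->next = deferred;
	deferred = node;
}

/* Detach the whole chain, then release every buffer on it. */
static size_t drain_deferred(void)
{
	struct llist_node *pos, *n, *first = deferred;
	size_t released = 0;

	deferred = NULL;	/* plays the role of llist_del_all() */
	llist_for_each_safe(pos, n, first) {
		free(pos);	/* the node is the buffer; n was saved first */
		released++;
	}
	return released;
}

/* Queue three buffers and drain them; returns the number released. */
static size_t demo_deferred_free(void)
{
	for (int i = 0; i < 3; i++) {
		void *buf = malloc(64);

		assert(buf != NULL);
		defer_free(buf);
	}
	return drain_deferred();
}
```

With the plain llist_for_each, the step to the next node would read pos->next from memory that has already been freed.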



[PATCH v3 6/9] namespace.c: Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park 
---
 fs/namespace.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index b5b1259..5cb2229 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1082,12 +1082,10 @@ static void __cleanup_mnt(struct rcu_head *head)
 static void delayed_mntput(struct work_struct *unused)
 {
struct llist_node *node = llist_del_all(&delayed_mntput_list);
-   struct llist_node *next;
+   struct mount *m, *t;
 
-   for (; node; node = next) {
-   next = llist_next(node);
-   cleanup_mnt(llist_entry(node, struct mount, mnt_llist));
-   }
+   llist_for_each_entry_safe(m, t, node, mnt_llist)
+   cleanup_mnt(m);
 }
 static DECLARE_DELAYED_WORK(delayed_mntput_work, delayed_mntput);
 
@@ -1615,7 +1613,7 @@ void __detach_mounts(struct dentry *dentry)
namespace_unlock();
 }
 
-/* 
+/*
  * Is the caller allowed to modify his namespace?
  */
 static inline bool may_mount(void)
@@ -2159,7 +2157,7 @@ static int do_loopback(struct path *path, const char *old_name,
 
err = -EINVAL;
if (mnt_ns_loop(old_path.dentry))
-   goto out; 
+   goto out;
 
mp = lock_mount(path);
err = PTR_ERR(mp);
-- 
1.9.1



[PATCH v3 1/9] llist: Provide a safe version for llist_for_each

2017-02-13 Thread Byungchul Park
Sometimes we have to dereference next field of llist node before entering
loop because the node might be deleted or the next field might be
modified within the loop. So this adds the safe version of llist_for_each,
that is, llist_for_each_safe.

Signed-off-by: Byungchul Park 
---
 include/linux/llist.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/include/linux/llist.h b/include/linux/llist.h
index fd4ca0b..b90c9f2 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -105,6 +105,25 @@ static inline void init_llist_head(struct llist_head *list)
for ((pos) = (node); pos; (pos) = (pos)->next)
 
 /**
+ * llist_for_each_safe - iterate over some deleted entries of a lock-less list
+ *  safe against removal of list entry
+ * @pos:   the &struct llist_node to use as a loop cursor
+ * @n:     another &struct llist_node to use as temporary storage
+ * @node:  the first entry of deleted list entries
+ *
+ * In general, some entries of the lock-less list can be traversed
+ * safely only after being deleted from list, so start with an entry
+ * instead of list head.
+ *
+ * If being used on entries deleted from lock-less list directly, the
+ * traverse order is from the newest to the oldest added entry.  If
+ * you want to traverse from the oldest to the newest, you must
+ * reverse the order by yourself before traversing.
+ */
+#define llist_for_each_safe(pos, n, node)  \
+   for ((pos) = (node); (pos) && ((n) = (pos)->next, true); (pos) = (n))
+
+/**
  * llist_for_each_entry - iterate over some deleted entries of lock-less list of given type
  * @pos:   the type * to use as a loop cursor.
  * @node:  the fist entry of deleted list entries.
-- 
1.9.1
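
For readers who want to try the macro outside the kernel, here is a minimal userspace sketch. The macro body is copied from the patch above; struct llist_node is a stand-in for the kernel type, and make_chain and free_chain are hypothetical helpers written only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct llist_node. */
struct llist_node {
	struct llist_node *next;
};

/* Same shape as the macro added by the patch. */
#define llist_for_each_safe(pos, n, node) \
	for ((pos) = (node); (pos) && ((n) = (pos)->next, true); (pos) = (n))

/* Hypothetical helper: build a chain of count heap-allocated nodes. */
static struct llist_node *make_chain(size_t count)
{
	struct llist_node *head = NULL;

	while (count--) {
		struct llist_node *node = malloc(sizeof(*node));

		assert(node != NULL);
		node->next = head;
		head = node;
	}
	return head;
}

/* Free every node while walking the chain; returns how many were freed. */
static size_t free_chain(struct llist_node *first)
{
	struct llist_node *pos, *n;
	size_t freed = 0;

	llist_for_each_safe(pos, n, first) {
		free(pos);	/* safe: pos->next was already saved in n */
		freed++;
	}
	return freed;
}
```

The comma expression in the loop condition stores the next pointer in n before the body runs, and the trailing true keeps the condition dependent on pos alone, so an empty chain is handled without touching n.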



[PATCH v3 0/9] Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Change from v2
- replace for_each(wake_list) with the safe version in scheduler.
- fix a trivial comment in llist.h

Change from v1
- split one patch to several ones, one for each subsystem.
- replace for_each with the safe version where it's necessary.

Byungchul Park (9):
  llist: Provide a safe version for llist_for_each
  bcache: Don't reinvent the wheel but use existing llist API
  raid5: Don't reinvent the wheel but use existing llist API
  vhost/scsi: Don't reinvent the wheel but use existing llist API
  fput: Don't reinvent the wheel but use existing llist API
  namespace.c: Don't reinvent the wheel but use existing llist API
  irq_work: Don't reinvent the wheel but use existing llist API
  sched: Don't reinvent the wheel but use existing llist API
  mm: Don't reinvent the wheel but use existing llist API

 drivers/md/bcache/closure.c | 17 +++--
 drivers/md/raid5.c  |  6 ++
 drivers/vhost/scsi.c| 11 +++
 fs/file_table.c | 12 +---
 fs/namespace.c  | 12 +---
 include/linux/llist.h   | 19 +++
 kernel/irq_work.c   |  6 +-
 kernel/sched/core.c | 13 ++---
 mm/vmalloc.c| 10 --
 9 files changed, 44 insertions(+), 62 deletions(-)

-- 
1.9.1

*** BLURB HERE ***

Byungchul Park (9):
  llist: Provide a safe version for llist_for_each
  bcache: Don't reinvent the wheel but use existing llist API
  raid5: Don't reinvent the wheel but use existing llist API
  vhost/scsi: Don't reinvent the wheel but use existing llist API
  fput: Don't reinvent the wheel but use existing llist API
  namespace.c: Don't reinvent the wheel but use existing llist API
  irq_work: Don't reinvent the wheel but use existing llist API
  sched: Don't reinvent the wheel but use existing llist API
  mm: Don't reinvent the wheel but use existing llist API

 drivers/md/bcache/closure.c | 17 +++--
 drivers/md/raid5.c  |  6 ++
 drivers/vhost/scsi.c| 11 +++
 fs/file_table.c | 12 +---
 fs/namespace.c  | 12 +---
 include/linux/llist.h   | 19 +++
 kernel/irq_work.c   |  6 +-
 kernel/sched/core.c | 15 +++
 mm/vmalloc.c| 10 --
 9 files changed, 45 insertions(+), 63 deletions(-)

-- 
1.9.1



[PATCH v3 4/9] vhost/scsi: Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park 
---
 drivers/vhost/scsi.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 253310c..a4cb966 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -496,14 +496,12 @@ static void vhost_scsi_evt_work(struct vhost_work *work)
struct vhost_scsi *vs = container_of(work, struct vhost_scsi,
vs_event_work);
struct vhost_virtqueue *vq = &vs->vqs[VHOST_SCSI_VQ_EVT].vq;
-   struct vhost_scsi_evt *evt;
+   struct vhost_scsi_evt *evt, *t;
struct llist_node *llnode;
 
mutex_lock(&vq->mutex);
llnode = llist_del_all(&vs->vs_event_list);
-   while (llnode) {
-   evt = llist_entry(llnode, struct vhost_scsi_evt, list);
-   llnode = llist_next(llnode);
+   llist_for_each_entry_safe(evt, t, llnode, list) {
vhost_scsi_do_evt_work(vs, evt);
vhost_scsi_free_evt(vs, evt);
}
@@ -529,10 +527,7 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
 
bitmap_zero(signal, VHOST_SCSI_MAX_VQ);
llnode = llist_del_all(&vs->vs_completion_list);
-   while (llnode) {
-   cmd = llist_entry(llnode, struct vhost_scsi_cmd,
-tvc_completion_list);
-   llnode = llist_next(llnode);
+   llist_for_each_entry(cmd, llnode, tvc_completion_list) {
se_cmd = &cmd->tvc_se_cmd;
 
pr_debug("%s tv_cmd %p resid %u status %#02x\n", __func__,
-- 
1.9.1



[PATCH v3 3/9] raid5: Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park 
---
 drivers/md/raid5.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 36c13e4..22a0326 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -353,17 +353,15 @@ static void release_inactive_stripe_list(struct r5conf *conf,
 static int release_stripe_list(struct r5conf *conf,
   struct list_head *temp_inactive_list)
 {
-   struct stripe_head *sh;
+   struct stripe_head *sh, *t;
int count = 0;
struct llist_node *head;
 
head = llist_del_all(&conf->released_stripes);
head = llist_reverse_order(head);
-   while (head) {
+   llist_for_each_entry_safe(sh, t, head, release_list) {
int hash;
 
-   sh = llist_entry(head, struct stripe_head, release_list);
-   head = llist_next(head);
/* sh could be readded after STRIPE_ON_RELEASE_LIST is cleard */
smp_mb();
clear_bit(STRIPE_ON_RELEASE_LIST, >state);
-- 
1.9.1



[PATCH v3 2/9] bcache: Don't reinvent the wheel but use existing llist API

2017-02-13 Thread Byungchul Park
Although llist provides the proper APIs, this code open-codes the list
reversal and the list walk instead. Use the existing helpers.

Signed-off-by: Byungchul Park 
---
 drivers/md/bcache/closure.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c
index 864e673..1841d03 100644
--- a/drivers/md/bcache/closure.c
+++ b/drivers/md/bcache/closure.c
@@ -64,27 +64,16 @@ void closure_put(struct closure *cl)
 void __closure_wake_up(struct closure_waitlist *wait_list)
 {
 	struct llist_node *list;
-	struct closure *cl;
+	struct closure *cl, *t;
 	struct llist_node *reverse = NULL;
 
 	list = llist_del_all(&wait_list->list);
 
 	/* We first reverse the list to preserve FIFO ordering and fairness */
-
-	while (list) {
-		struct llist_node *t = list;
-		list = llist_next(list);
-
-		t->next = reverse;
-		reverse = t;
-	}
+	reverse = llist_reverse_order(list);
 
 	/* Then do the wakeups */
-
-	while (reverse) {
-		cl = container_of(reverse, struct closure, list);
-		reverse = llist_next(reverse);
-
+	llist_for_each_entry_safe(cl, t, reverse, list) {
 		closure_set_waiting(cl, 0);
 		closure_sub(cl, CLOSURE_WAITING + 1);
 	}
-- 
1.9.1
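
For readers without a kernel tree at hand, the pattern these patches
switch to can be sketched in plain userspace C. This is only an
illustration: the macros below are simplified stand-ins for the helpers
in <linux/llist.h>, not the kernel's definitions, and struct job,
entry_or_null(), push() and drain_fifo() are hypothetical names made up
for this sketch. It assumes a GNU-compatible compiler for __typeof__.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-in for the kernel's struct llist_node. */
struct llist_node {
	struct llist_node *next;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helper: map a node to its entry, keeping NULL as NULL. */
#define entry_or_null(node, type, member) \
	((node) ? container_of(node, type, member) : NULL)

/*
 * Safe traversal: the successor is loaded into 'n' before the loop body
 * runs, so the body may free or re-queue the current entry.
 */
#define llist_for_each_entry_safe(pos, n, node, member)			\
	for (pos = entry_or_null(node, __typeof__(*pos), member);	\
	     pos && (n = entry_or_null(pos->member.next,		\
				       __typeof__(*n), member), 1);	\
	     pos = n)

/* Reverse a detached list so that the oldest entry comes first. */
static struct llist_node *llist_reverse_order(struct llist_node *head)
{
	struct llist_node *new_head = NULL;

	while (head) {
		struct llist_node *tmp = head;

		head = head->next;
		tmp->next = new_head;
		new_head = tmp;
	}
	return new_head;
}

/* Hypothetical payload type embedding a list node. */
struct job {
	int id;
	struct llist_node list;
};

/* Push at the head, like llist_add() but without the atomics. */
static void push(struct llist_node **head, struct job *j)
{
	j->list.next = *head;
	*head = &j->list;
}

/*
 * Walk a detached list in FIFO order and free every entry. Up to 'cap'
 * ids are recorded in 'out'; the number of entries seen is returned.
 */
static int drain_fifo(struct llist_node *head, int *out, int cap)
{
	struct job *j, *t;
	int count = 0;

	head = llist_reverse_order(head);
	llist_for_each_entry_safe(j, t, head, list) {
		if (count < cap)
			out[count] = j->id;
		count++;
		free(j);	/* fine: 't' already holds the next entry */
	}
	return count;
}
```

Pushing jobs 1, 2 and 3 and then calling drain_fifo() on the head visits
them in the order they were pushed. The _safe variant matters wherever
the loop body frees or re-queues the current entry, which is the case in
the vhost event path and the raid5 and bcache paths above.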



Re: [PATCH] mm: free reserved area's memmap if possiable

2017-02-13 Thread zhouxianrong

If the area reserved by the user is so large that its memmap is also
large, and the kernel never uses that memmap, then the user can free
the reserved area's memmap through the memblock_mark_raw_pfn
interface added by this patch.

On 2017/2/14 14:53, zhouxianr...@huawei.com wrote:

From: zhouxianrong 

Just as we free the memmap of no-map areas (gaps in memblock.memory),
we could also free the memmap of reserved areas (regions in
memblock.reserved), but only when the user of the reserved area
indicates in the driver that this is safe. That is, the user knows how
the reserved area is used, never calls memblock_free or
free_reserved_xxx on it, and treats it as raw pfn memory as far as the
kernel is concerned. This patch gives such users a way to reclaim as
much as possible of the memmap memory that corresponds to raw pfn
reserved areas. They do so through the memblock_mark_raw_pfn
interface, which marks the reserved area as raw pfn and tells
free_unused_memmap that this area's memmap can be freed.

Signed-off-by: zhouxianrong 
---
 arch/arm64/mm/init.c |   14 +-
 include/linux/memblock.h |3 +++
 mm/memblock.c|   24 
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 380ebe7..7e62ef8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -358,7 +358,7 @@ static inline void free_memmap(unsigned long start_pfn, 
unsigned long end_pfn)
  */
 static void __init free_unused_memmap(void)
 {
-   unsigned long start, prev_end = 0;
+   unsigned long start, end, prev_end = 0;
struct memblock_region *reg;

for_each_memblock(memory, reg) {
@@ -391,6 +391,18 @@ static void __init free_unused_memmap(void)
if (!IS_ALIGNED(prev_end, PAGES_PER_SECTION))
free_memmap(prev_end, ALIGN(prev_end, PAGES_PER_SECTION));
 #endif
+
+   for_each_memblock(reserved, reg) {
+   if (!(reg->flags & MEMBLOCK_RAW_PFN))
+   continue;
+
+   start = memblock_region_memory_base_pfn(reg);
+   end = round_down(memblock_region_memory_end_pfn(reg),
+MAX_ORDER_NR_PAGES);
+
+   if (start < end)
+   free_memmap(start, end);
+   }
 }
 #endif /* !CONFIG_SPARSEMEM_VMEMMAP */

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 5b759c9..9f8d277 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -26,6 +26,7 @@ enum {
MEMBLOCK_HOTPLUG= 0x1,  /* hotpluggable region */
MEMBLOCK_MIRROR = 0x2,  /* mirrored region */
MEMBLOCK_NOMAP  = 0x4,  /* don't add to kernel direct mapping */
+   MEMBLOCK_RAW_PFN= 0x8,  /* region whose memmap is never used */
 };

 struct memblock_region {
@@ -92,6 +93,8 @@ bool memblock_overlaps_region(struct memblock_type *type,
 int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
 int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
 int memblock_mark_nomap(phys_addr_t base, phys_addr_t size);
+int memblock_mark_raw_pfn(phys_addr_t base, phys_addr_t size);
+int memblock_clear_raw_pfn(phys_addr_t base, phys_addr_t size);
 ulong choose_memblock_flags(void);

 /* Low level functions */
diff --git a/mm/memblock.c b/mm/memblock.c
index 7608bc3..c103b94 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -814,6 +814,30 @@ int __init_memblock memblock_mark_nomap(phys_addr_t base, 
phys_addr_t size)
 }

 /**
+ * memblock_mark_raw_pfn - Mark raw pfn memory with flag MEMBLOCK_RAW_PFN.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * Return 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_mark_raw_pfn(phys_addr_t base, phys_addr_t size)
+{
+   return memblock_setclr_flag(base, size, 1, MEMBLOCK_RAW_PFN);
+}
+
+/**
+ * memblock_clear_raw_pfn - Clear flag MEMBLOCK_RAW_PFN for a specified region.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * Return 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_clear_raw_pfn(phys_addr_t base, phys_addr_t size)
+{
+   return memblock_setclr_flag(base, size, 0, MEMBLOCK_RAW_PFN);
+}
+
+/**
  * __next_reserved_mem_region - next function for for_each_reserved_region()
  * @idx: pointer to u64 loop variable
  * @out_start: ptr to phys_addr_t for start address of the region, can be %NULL
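
The range selection done by the new loop in free_unused_memmap() can be
modelled in userspace as follows. This is a rough sketch under stated
assumptions, not the kernel code: struct region, round_down_pow2() and
raw_pfn_memmap_range() are hypothetical names, regions are given
directly as page frame numbers, and MAX_ORDER_NR_PAGES is assumed to be
1024 (the real value depends on the kernel configuration). Only the
0x8 flag value is taken from the patch.

```c
#include <stdbool.h>

#define MEMBLOCK_RAW_PFN	0x8UL	/* flag value proposed in the patch */
#define MAX_ORDER_NR_PAGES	1024UL	/* assumed; configuration dependent */

/* Hypothetical model of a reserved region, expressed in pfns. */
struct region {
	unsigned long base_pfn;
	unsigned long end_pfn;
	unsigned long flags;
};

/* Round x down to a multiple of align; align must be a power of two. */
static unsigned long round_down_pow2(unsigned long x, unsigned long align)
{
	return x & ~(align - 1);
}

/*
 * Decide whether a region's memmap would be handed to free_memmap() and
 * compute the pfn range. Regions without MEMBLOCK_RAW_PFN are skipped,
 * and the end is rounded down to a MAX_ORDER_NR_PAGES boundary, so a
 * region that ends before that boundary yields nothing to free.
 */
static bool raw_pfn_memmap_range(const struct region *reg,
				 unsigned long *start, unsigned long *end)
{
	if (!(reg->flags & MEMBLOCK_RAW_PFN))
		return false;

	*start = reg->base_pfn;
	*end = round_down_pow2(reg->end_pfn, MAX_ORDER_NR_PAGES);

	return *start < *end;
}
```

With these assumed values, a flagged region covering pfns 2048 to 5000
gives the range 2048 to 4096, while a flagged region covering 4100 to
5000 gives nothing because the rounded-down end lies below its start.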





Re: [PATCH v7 5/5] usb: doc: add document for USB3 debug port usage

2017-02-13 Thread Lu Baolu
Hi,

On 02/14/2017 02:13 PM, Peter Chen wrote:
>  
>> On 02/14/2017 11:45 AM, Peter Chen wrote:
>>> On Tue, Feb 14, 2017 at 10:27 AM, Lu Baolu  wrote:
>>>
 Add Documentation/usb/usb3-debug-port.rst. This document includes the
 user guide for USB3 debug port.

 Cc: linux-...@vger.kernel.org
 Signed-off-by: Lu Baolu 
 ---
  Documentation/usb/usb3-debug-port.rst | 98
 ++
 +
  1 file changed, 98 insertions(+)
  create mode 100644 Documentation/usb/usb3-debug-port.rst

 diff --git a/Documentation/usb/usb3-debug-port.rst
 b/Documentation/usb/usb3-debug-port.rst
 new file mode 100644
 index 000..9eddb3a
 --- /dev/null
 +++ b/Documentation/usb/usb3-debug-port.rst
 @@ -0,0 +1,98 @@
 +===
 +USB3 debug port
 +===
 +
 +:Author: Lu Baolu 
 +:Date: January 2017
 +
 +GENERAL
 +===
 +
 +This is a HOWTO for using USB3 debug port on x86 systems.
 +
 +Before using any kernel debugging functionality based on USB3 debug
 +port, you need to check 1) whether debug port is supported by the
 +xHCI host; 2) which port is used for debugging purposes (normally
 +the first USB3 root port). You must have a USB 3.0 super-speed
 +A-to-A debugging cable to connect the debug target with a debug
 +host. In this document, "debug target" stands for the system under
 +debugging, and "debug host" stands for a stand-alone system that is
 +able to talk to the debugging target through the USB3 debug port.
 +
 +EARLY PRINTK
 +
 +
 +On the debug target system, you need to customize a debugging kernel
 +with CONFIG_EARLY_PRINTK_USB_XDBC enabled. And, add below kernel
 +boot parameter::
 +
 +   "earlyprintk=xdbc"
 +
 +If there are multiple xHCI controllers in the system, you can append
 +a host controller index to this kernel parameter. This index starts
 +from 0.
 +
 +If you are going to use the "keep" option defined by the early
 +printk framework to keep the boot console alive after early boot,
 +you'd better add below kernel boot parameter::
 +
 +   "usbcore.autosuspend=-1"
 +
 +On the debug host side, you don't need to customize the kernel, but
 +you'd better disable usb subsystem runtime power management by
 +adding below kernel boot parameter::
 +
 +   "usbcore.autosuspend=-1"
 +

>>> Just curious, why autosuspend needs to be disabled for this function?
>> This implementation doesn't support suspend/resume yet.
>  
> Why host side needs to disable it too?

If runtime suspend/resume is enabled on the host side, the host might
ask the debug device to suspend, which would leave the debug device dead.

Suspend/resume support for the debug device depends on the xhci driver.
I will add it in separate patches after more testing. It's on
my TODO list.

Best regards,
Lu Baolu


Re: [PATCH BUGFIX] block: make elevator_get robust against cross blk/blk-mq choice

2017-02-13 Thread Hannes Reinecke
On 02/14/2017 08:07 AM, Omar Sandoval wrote:
> On Tue, Feb 14, 2017 at 07:58:22AM +0100, Hannes Reinecke wrote:
>> While we're at the topic:
>>
>> Can't we use the same names for legacy and mq scheduler?
>> It's quite an unnecessary complication to have
>> 'noop', 'deadline', and 'cfq' for legacy, but 'none' and 'mq-deadline'
>> for mq. If we could use 'noop' and 'deadline' for mq, too, the existing
>> settings or udev rules will continue to work and we wouldn't get any
>> annoying and pointless warnings here...
> 
> I mentioned this to Jens a little while ago but I didn't feel strongly
> enough to push the issue. I also like this idea -- it makes the
> transition to blk-mq a little more transparent.
> 
And saves us _a lot_ of support cases :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


linux-next: Tree for Feb 14

2017-02-13 Thread Stephen Rothwell
Hi all,

Changes since 20170213:

Removed trees: rdma-leon, rdma-leon-test (at owner's request)

The net tree gained a build failure for which I applied a fix patch.

The mfd tree gained a conflict against the input tree.

The kvm tree gained conflicts against the powerpc tree.

The akpm-current tree gained a build failure from an interaction with
the powerpc tree.  I applied a merge fix patch.

Non-merge commits (relative to Linus' tree): 8845
 9758 files changed, 392326 insertions(+), 184216 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 254 trees (counting Linus' and 37 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (7089db84e356 Linux 4.10-rc8)
Merging fixes/master (30066ce675d3 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6)
Merging kbuild-current/rc-fixes (c7858bf16c0b asm-prototypes: Clear any CPP defines before declaring the functions)
Merging arc-current/for-curr (8ba605b607b7 ARC: [plat-*] ARC_HAS_COH_CACHES no longer relevant)
Merging arm-current/fixes (228dbbfb5d77 ARM: 8643/3: arm/ptrace: Preserve previous registers for short regset write)
Merging m68k-current/for-linus (ad595b77c4a8 m68k/atari: Use seq_puts() in atari_get_hardware_list())
Merging metag-fixes/fixes (35d04077ad96 metag: Only define atomic_dec_if_positive conditionally)
Merging powerpc-fixes/fixes (f83e6862047e powerpc/powernv: Properly set "host-ipi" on IPIs)
Merging sparc/master (f9a42e0d58cf Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files)
Merging net/master (0c59d28121b9 MAINTAINERS: Remove old e-mail address)
Applying: bpf: kernel header files need to be copied into the tools directory
Merging ipsec/master (c28a45cb xfrm: policy: init locks early)
Merging netfilter/master (f95d7a46bc57 netfilter: ctnetlink: Fix regression in CTA_HELP processing)
Merging ipvs/master (045169816b31 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6)
Merging wireless-drivers/master (52f5631a4c05 rtlwifi: rtl8192ce: Fix loading of incorrect firmware)
Merging mac80211/master (fd551bac4795 nl80211: Fix mesh HT operation check)
Merging sound-current/for-linus (af677166cf63 ALSA: hda - adding a new NV HDMI/DP codec ID in the driver)
Merging pci-current/for-linus (d98e0929071e Revert "PCI: pciehp: Add runtime PM support for PCIe hotplug ports")
Merging driver-core.current/driver-core-linus (49def1853334 Linux 4.10-rc4)
Merging tty.current/tty-linus (49def1853334 Linux 4.10-rc4)
Merging usb.current/usb-linus (d5adbfcd5f7b Linux 4.10-rc7)
Merging usb-gadget-fixes/fixes (efe357f4633a usb: dwc2: host: fix Wmaybe-uninitialized warning)
Merging usb-serial-fixes/usb-linus (d07830db1bdb USB: serial: pl2303: add ATEN device ID)
Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move the lock initialization to core file)
Merging phy/fixes (7ce7d89f4883 Linux 4.10-rc1)
Merging staging.current/staging-linus (d5adbfcd5f7b Linux 4.10-rc7)
Merging char-misc.current/char-misc-linus (d5adbfcd5f7b Linux 4.10-rc7)
Merging input-current/for-linus (722c5ac708b4 Input: elan_i2c - add ELAN0605 to the ACPI table)
Merging crypto-current/master (7c2cf1c4615c crypto: chcr - Fix key

Re: net/xfrm: stack out-of-bounds in xfrm_flowi_sport

2017-02-13 Thread Steffen Klassert
On Mon, Feb 13, 2017 at 03:46:56PM +0100, Dmitry Vyukov wrote:
> 
> On commit 7089db84e356562f8ba737c29e472cc42d530dbc.
> 
> 
> struct flowi4 fl4_stack allocated on stack in udp_sendmsg is being
> casted to larger struct flowi and then accessed.

Looks like the problem is when using IPv4-mapped IPv6 addresses.
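
To make the mismatch concrete, here is a small userspace model of it.
This is only a sketch: struct flow4, struct flow6, union flow, the FAM_*
constants and flow_sport() are hypothetical stand-ins for the kernel's
flowi4, flowi6, flowi and address family handling, and the field layout
is invented for illustration.

```c
#include <stdint.h>

/* Hypothetical family tags; the values mirror AF_INET and AF_INET6. */
enum { FAM_INET = 2, FAM_INET6 = 10 };

/* Invented layouts: the IPv4 flow is much smaller than the IPv6 one. */
struct flow4 {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct flow6 {
	uint8_t saddr[16], daddr[16];
	uint16_t sport, dport;
};

/* Generic handle that callers pass around, like struct flowi. */
union flow {
	struct flow4 ip4;
	struct flow6 ip6;
};

/*
 * Read the source port from a flow. 'family' must describe how the flow
 * was actually filled in. For a socket using an IPv4-mapped IPv6
 * address, the socket family is FAM_INET6 while the flow was set up as
 * FAM_INET; picking the ip6 member in that case reads fields the caller
 * never wrote, and in the kernel reads past a flowi4 that lives on the
 * caller's stack.
 */
static uint16_t flow_sport(const union flow *fl, int family)
{
	switch (family) {
	case FAM_INET:
		return fl->ip4.sport;
	case FAM_INET6:
		return fl->ip6.sport;
	default:
		return 0;
	}
}
```

In this model the caller hands in the family that belongs to the flow
rather than the one stored in the socket, so the accessor always picks
the member that was actually initialised.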

Does the patch below help?


Subject: [PATCH RFC ipsec] xfrm: Don't use sk_family for socket policy lookups

With IPv4-mapped IPv6 addresses, sk_family is AF_INET6,
but the flow information is created based on AF_INET.
So the routing code sets up a 'struct flowi4', but we try to
access a 'struct flowi6', which leads to an out-of-bounds
access. Fix this by using the family we get with the
dst_entry, as we do for the standard policy lookup.

Reported-by: Dmitry Vyukov 
Signed-off-by: Steffen Klassert 
---
 net/xfrm/xfrm_policy.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index b5e665b..4891b7b 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1216,7 +1216,7 @@ static inline int policy_to_flow_dir(int dir)
 }
 
 static struct xfrm_policy *xfrm_sk_policy_lookup(const struct sock *sk, int dir,
-						 const struct flowi *fl)
+						 const struct flowi *fl, u16 family)
 {
 	struct xfrm_policy *pol;
 	struct net *net = sock_net(sk);
@@ -1225,8 +1225,7 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(const struct sock *sk, int dir,
 	read_lock_bh(&net->xfrm.xfrm_policy_lock);
 	pol = rcu_dereference(sk->sk_policy[dir]);
 	if (pol != NULL) {
-		bool match = xfrm_selector_match(&pol->selector, fl,
-						 sk->sk_family);
+		bool match = xfrm_selector_match(&pol->selector, fl, family);
 		int err = 0;
 
 		if (match) {
@@ -2221,7 +2220,7 @@ struct dst_entry *xfrm_lookup(struct net *net, struct dst_entry *dst_orig,
 		sk = sk_const_to_full_sk(sk);
 		if (sk && sk->sk_policy[XFRM_POLICY_OUT]) {
 			num_pols = 1;
-			pols[0] = xfrm_sk_policy_lookup(sk, XFRM_POLICY_OUT, fl);
+			pols[0] = xfrm_sk_policy_lookup(sk, XFRM_POLICY_OUT, fl, family);
 			err = xfrm_expand_policies(fl, family, pols,
 						   &num_pols, &num_xfrms);
 			if (err < 0)
@@ -2500,7 +2499,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb,
 	pol = NULL;
 	sk = sk_to_full_sk(sk);
 	if (sk && sk->sk_policy[dir]) {
-		pol = xfrm_sk_policy_lookup(sk, dir, &fl);
+		pol = xfrm_sk_policy_lookup(sk, dir, &fl, family);
 		if (IS_ERR(pol)) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINPOLERROR);
 			return 0;
-- 
1.9.1
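
To illustrate why the family used for the selector match matters, here is
a minimal user-space sketch. It is not kernel code: the two structs are
simplified, hypothetical stand-ins for 'struct flowi4' and 'struct flowi6',
and the SKETCH_AF_* constants and helper names are made up for the example.
It only models the size check: a flow allocated as the IPv4 variant is too
small to be read as the IPv6 variant, which is what happens when the match
is keyed on sk_family for an IPv4-mapped IPv6 socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical family constants, used only in this sketch. */
enum { SKETCH_AF_INET = 2, SKETCH_AF_INET6 = 10 };

/* Simplified stand-ins for the kernel's per-family flow keys. */
struct flowi4_sketch {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct flowi6_sketch {
	uint8_t  saddr[16], daddr[16];
	uint16_t sport, dport;
};

/* Bytes a selector match would read for the given address family. */
static size_t flow_bytes_for_family(int family)
{
	switch (family) {
	case SKETCH_AF_INET:
		return sizeof(struct flowi4_sketch);
	case SKETCH_AF_INET6:
		return sizeof(struct flowi6_sketch);
	default:
		return 0;
	}
}

/* Returns 1 when reading the flow as 'family' stays inside the allocation. */
static int flow_access_in_bounds(size_t allocated, int family)
{
	size_t need = flow_bytes_for_family(family);

	return need != 0 && need <= allocated;
}

/*
 * Models the reported case: the flow lives in an IPv4-sized object, and
 * the lookup picks either the socket family or the dst_entry family.
 */
static int lookup_in_bounds(int sk_family, int dst_family, int use_dst_family)
{
	size_t allocated = sizeof(struct flowi4_sketch);
	int family = use_dst_family ? dst_family : sk_family;

	assert(flow_bytes_for_family(dst_family) != 0);
	return flow_access_in_bounds(allocated, family);
}
```

With an AF_INET6 socket and an AF_INET dst, the sketch reports the access
as out of bounds when sk_family is used and as in bounds when the dst
family is used, which is the behaviour the patch is aiming for.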



Re: [PATCH BUGFIX] block: make elevator_get robust against cross blk/blk-mq choice

2017-02-13 Thread Omar Sandoval
On Tue, Feb 14, 2017 at 07:58:22AM +0100, Hannes Reinecke wrote:
> While we're on the topic:
> 
> Can't we use the same names for legacy and mq scheduler?
> It's quite an unnecessary complication to have
> 'noop', 'deadline', and 'cfq' for legacy, but 'none' and 'mq-deadline'
> for mq. If we could use 'noop' and 'deadline' for mq, too, the existing
> settings or udev rules will continue to work and we wouldn't get any
> annoying and pointless warnings here...

I mentioned this to Jens a little while ago but I didn't feel strongly
enough to push the issue. I also like this idea -- it makes the
transition to blk-mq a little more transparent.
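
A minimal sketch of what such name aliasing could look like, purely as an
illustration. The table, the struct and resolve_sched_name() are
hypothetical and are not part of block/elevator.c; only the scheduler
names come from the discussion above. Note that 'cfq' has no blk-mq
counterpart here, so it is passed through unchanged and the later lookup
would still fail for it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from a legacy scheduler name to its blk-mq name. */
struct sched_alias {
	const char *legacy;
	const char *mq;
};

static const struct sched_alias sched_aliases[] = {
	{ "noop",     "none" },
	{ "deadline", "mq-deadline" },
};

/*
 * Translate a user-supplied scheduler name for the queue type in use.
 * For legacy queues the name is returned as-is; for blk-mq queues a
 * known legacy name is replaced by its blk-mq equivalent.
 */
static const char *resolve_sched_name(const char *name, int uses_mq)
{
	size_t i;

	assert(name != NULL);
	if (!uses_mq)
		return name;

	for (i = 0; i < sizeof(sched_aliases) / sizeof(sched_aliases[0]); i++) {
		if (strcmp(name, sched_aliases[i].legacy) == 0)
			return sched_aliases[i].mq;
	}
	return name;
}
```

With something along these lines, an existing udev rule that writes
'deadline' would resolve to 'mq-deadline' on a blk-mq device and stay
'deadline' on a legacy one.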


[PATCH] mm: free reserved area's memmap if possible

2017-02-13 Thread zhouxianrong
From: zhouxianrong 

Just as we free the memmap of no-map areas (the gaps in memblock.memory),
we could also free the memmap of reserved areas (the regions in
memblock.reserved), but only when the user of a reserved area indicates
in its driver that this is safe. That is, the user knows how the area is
used, will never memblock_free or free_reserved_xxx it, and treats it as
raw pfn memory as far as the kernel is concerned.

This patch gives such users a way to reclaim as much as possible of the
memmap memory that corresponds to raw pfn reserved areas. They call the
memblock_mark_raw_pfn interface, which marks the reserved area as raw pfn
and tells free_unused_memmap that the area's memmap can be freed.

Signed-off-by: zhouxianrong 
---
 arch/arm64/mm/init.c     |   14 +++++++++++++-
 include/linux/memblock.h |    3 +++
 mm/memblock.c            |   24 ++++++++++++++++++++++++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 380ebe7..7e62ef8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -358,7 +358,7 @@ static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
  */
 static void __init free_unused_memmap(void)
 {
-	unsigned long start, prev_end = 0;
+	unsigned long start, end, prev_end = 0;
 	struct memblock_region *reg;
 
 	for_each_memblock(memory, reg) {
@@ -391,6 +391,18 @@ static void __init free_unused_memmap(void)
 	if (!IS_ALIGNED(prev_end, PAGES_PER_SECTION))
 		free_memmap(prev_end, ALIGN(prev_end, PAGES_PER_SECTION));
 #endif
+
+	for_each_memblock(reserved, reg) {
+		if (!(reg->flags & MEMBLOCK_RAW_PFN))
+			continue;
+
+		start = memblock_region_memory_base_pfn(reg);
+		end = round_down(memblock_region_memory_end_pfn(reg),
+				 MAX_ORDER_NR_PAGES);
+
+		if (start < end)
+			free_memmap(start, end);
+	}
 }
 #endif /* !CONFIG_SPARSEMEM_VMEMMAP */
 
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 5b759c9..9f8d277 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -26,6 +26,7 @@ enum {
 	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
 	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
 	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
+	MEMBLOCK_RAW_PFN	= 0x8,	/* region whose memmap is never used */
 };
 
 struct memblock_region {
@@ -92,6 +93,8 @@ bool memblock_overlaps_region(struct memblock_type *type,
 int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
 int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
 int memblock_mark_nomap(phys_addr_t base, phys_addr_t size);
+int memblock_mark_raw_pfn(phys_addr_t base, phys_addr_t size);
+int memblock_clear_raw_pfn(phys_addr_t base, phys_addr_t size);
 ulong choose_memblock_flags(void);
 
 /* Low level functions */
diff --git a/mm/memblock.c b/mm/memblock.c
index 7608bc3..c103b94 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -814,6 +814,30 @@ int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
 }
 
 /**
+ * memblock_mark_raw_pfn - Mark raw pfn memory with flag MEMBLOCK_RAW_PFN.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * Return 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_mark_raw_pfn(phys_addr_t base, phys_addr_t size)
+{
+	return memblock_setclr_flag(base, size, 1, MEMBLOCK_RAW_PFN);
+}
+
+/**
+ * memblock_clear_raw_pfn - Clear flag MEMBLOCK_RAW_PFN for a specified region.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * Return 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_clear_raw_pfn(phys_addr_t base, phys_addr_t size)
+{
+	return memblock_setclr_flag(base, size, 0, MEMBLOCK_RAW_PFN);
+}
+
+/**
  * __next_reserved_mem_region - next function for for_each_reserved_region()
  * @idx: pointer to u64 loop variable
  * @out_start: ptr to phys_addr_t for start address of the region, can be %NULL
-- 
1.7.9.5
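
For reviewers, the range computation in the new free_unused_memmap() loop
can be modelled in user space roughly as below. This is only a sketch under
assumptions: struct region_sketch and the helper names are hypothetical,
SKETCH_MAX_ORDER_NR_PAGES is an assumed value of 1024 pages chosen for the
example, and the real code operates on memblock regions rather than on
plain pfn pairs. It shows that a region is skipped unless it carries the
flag, and that the end of the range is rounded down before anything is
freed.

```c
#include <assert.h>

/* Assumed value for illustration only; the kernel derives it from MAX_ORDER. */
#define SKETCH_MAX_ORDER_NR_PAGES 1024UL

/* Mirrors the MEMBLOCK_RAW_PFN bit value introduced by the patch. */
#define SKETCH_RAW_PFN 0x8UL

/* Hypothetical stand-in for a reserved memblock region, in pfn units. */
struct region_sketch {
	unsigned long base_pfn;	/* first page frame of the region */
	unsigned long end_pfn;	/* one past the last page frame */
	unsigned long flags;
};

/* Round x down to a multiple of 'align', which must be a power of two. */
static unsigned long round_down_pow2(unsigned long x, unsigned long align)
{
	return x & ~(align - 1);
}

/*
 * Compute the pfn range [*start, *end) whose memmap would be freed.
 * Returns 1 if the range is non-empty, 0 if the region is not flagged
 * or nothing remains after rounding the end down.
 */
static int raw_pfn_free_range(const struct region_sketch *reg,
			      unsigned long *start, unsigned long *end)
{
	assert(reg && start && end);

	if (!(reg->flags & SKETCH_RAW_PFN))
		return 0;

	*start = reg->base_pfn;
	*end = round_down_pow2(reg->end_pfn, SKETCH_MAX_ORDER_NR_PAGES);
	return *start < *end;
}
```

For example, under the assumed constant a flagged region covering pfns
100 to 5000 yields the range 100 to 4096, while a flagged region covering
pfns 100 to 900 yields nothing because its end rounds down to 0.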



Re: [PATCH v2 1/9] llist: Provide a safe version for llist_for_each

2017-02-13 Thread Huang, Ying
Byungchul Park  writes:

> On Mon, Feb 13, 2017 at 04:58:05PM +0900, Byungchul Park wrote:
>> On Mon, Feb 13, 2017 at 03:52:44PM +0800, Huang, Ying wrote:
>> > Byungchul Park  writes:
>> > 
>> > > On Mon, Feb 13, 2017 at 03:36:33PM +0800, Huang, Ying wrote:
>> > >> Byungchul Park  writes:
>> > >> 
>> > >> > Sometimes we have to dereference the next field of an llist node before
>> > >> > entering the loop body, because the node might be deleted or the next
>> > >> > field might be modified within the loop. So this adds the safe version
>> > >> > of llist_for_each, that is, llist_for_each_safe.
>> > >> >
>> > >> > Signed-off-by: Byungchul Park 
>> > >> > ---
>> > >> >  include/linux/llist.h | 19 +++
>> > >> >  1 file changed, 19 insertions(+)
>> > >> >
>> > >> > diff --git a/include/linux/llist.h b/include/linux/llist.h
>> > >> > index fd4ca0b..4c508a5 100644
>> > >> > --- a/include/linux/llist.h
>> > >> > +++ b/include/linux/llist.h
>> > >> > @@ -105,6 +105,25 @@ static inline void init_llist_head(struct llist_head *list)
>> > >> >   for ((pos) = (node); pos; (pos) = (pos)->next)
>> > >> >  
>> > >> >  /**
>> > >> > + * llist_for_each_safe - iterate over some deleted entries of a lock-less list
>> > >> > + *			 safe against removal of list entry
>> > >> > + * @pos:	the &struct llist_node to use as a loop cursor
>> > >> > + * @n:		another type * to use as temporary storage
>> > >> 
>> > >> s/type */&struct llist_node/
>> > >
>> > > Yes.
>> > >
>> > >> 
>> > >> > + * @node:the first entry of deleted list entries
>> > >> > + *
>> > >> > + * In general, some entries of the lock-less list can be traversed
>> > >> > + * safely only after being deleted from list, so start with an entry
>> > >> > + * instead of list head.
>> > >> > + *
>> > >> > + * If being used on entries deleted from lock-less list directly, the
>> > >> > + * traverse order is from the newest to the oldest added entry.  If
>> > >> > + * you want to traverse from the oldest to the newest, you must
>> > >> > + * reverse the order by yourself before traversing.
>> > >> > + */
>> > >> > +#define llist_for_each_safe(pos, n, node)			\
>> > >> > +	for ((pos) = (node); (pos) && ((n) = (pos)->next, true); (pos) = (n))
>> > >> > +
>> > >> 
>> > >> Following the style of other xxx_for_each_safe,
>> > >> 
>> > >> #define llist_for_each_safe(pos, n, node)   \
>> > >> for (pos = (node), (pos && (n = pos->next)); pos; pos = n, n = pos->next)
>> > >
>> > > Do you think it should be modified? I think mine is simpler. No?
>> > 
>> > Personally I prefer the style of other xxx_for_each_safe().
>> 
>> Yes, I will modify it as you recommend.
>> 
>> Thank you very much.
>
> I wanted to modify it as you recommended, but it has a bug. It should be
> (to fix the bug):
>
>for (pos = (node), (pos && (n = pos->next)); pos; pos = n, (pos && \
>(n = pos->next)))
>
> Don't you think this is too messy? Or am I missing something? I still think
> the following is neater and simpler.
>
>for (pos = node; pos && (n = pos->next, true); pos = n)

OK.  This looks better.

Best Regards,
Huang, Ying

> Or could you recommend another preference?
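
For reference, here is a small user-space sketch of the form that was
agreed on above. struct llist_node is reduced to its 'next' pointer, and
make_chain() and free_chain() are hypothetical helpers that exist only for
this example. The point it demonstrates is that 'n' is loaded only while
'pos' is non-NULL, so each node can be freed inside the loop body and the
last step never dereferences a NULL 'pos'. The first alternative quoted
above, which ends with 'pos = n, n = pos->next', would dereference NULL
once 'n' becomes NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Reduced stand-in for the kernel's struct llist_node. */
struct llist_node {
	struct llist_node *next;
};

/* 'n' is read from 'pos' only after 'pos' has been checked for NULL. */
#define llist_for_each_safe(pos, n, node) \
	for ((pos) = (node); (pos) && ((n) = (pos)->next, true); (pos) = (n))

/* Hypothetical helper: build a chain of 'count' heap-allocated nodes. */
static struct llist_node *make_chain(size_t count)
{
	struct llist_node *head = NULL;

	while (count--) {
		struct llist_node *node = malloc(sizeof(*node));

		if (!node)
			abort();
		node->next = head;
		head = node;
	}
	return head;
}

/* Free every node while iterating; returns the number of nodes visited. */
static size_t free_chain(struct llist_node *first)
{
	struct llist_node *pos, *n;
	size_t visited = 0;

	llist_for_each_safe(pos, n, first) {
		free(pos);	/* safe: the successor is already saved in n */
		visited++;
	}
	assert(pos == NULL);	/* the loop only ends at the end of the chain */
	return visited;
}
```

Calling free_chain(make_chain(3)) walks and frees all three nodes, and
free_chain(NULL) simply visits nothing.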


Re: [PATCH BUGFIX] block: make elevator_get robust against cross blk/blk-mq choice

2017-02-13 Thread Hannes Reinecke
On 02/13/2017 11:28 PM, Jens Axboe wrote:
> On 02/13/2017 03:09 PM, Omar Sandoval wrote:
>> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
>>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
>>> or, vice versa, a blk-mq scheduler is chosen for a device using blk, then
>>> that scheduler is set and initialized without any check, driving the
>>> system into an inconsistent state. This commit addresses this issue by
>>> letting elevator_get fail for these wrong cross choices.
>>>
>>> Signed-off-by: Paolo Valente 
>>> ---
>>>  block/elevator.c | 26 ++
>>>  1 file changed, 18 insertions(+), 8 deletions(-)
>>
>> Hey, Paolo,
>>
>> How exactly are you triggering this? In __elevator_change(), we do check
>> for mq or not mq:
>>
>>  if (!e->uses_mq && q->mq_ops) {
>>  elevator_put(e);
>>  return -EINVAL;
>>  }
>>  if (e->uses_mq && !q->mq_ops) {
>>  elevator_put(e);
>>  return -EINVAL;
>>  }
>>
>> We don't ever appear to call elevator_init() with a specific scheduler
>> name, and for the default we switch off of q->mq_ops and use the
>> defaults from Kconfig:
>>
>>  if (q->mq_ops && q->nr_hw_queues == 1)
>>  e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
>>  else if (q->mq_ops)
>>  e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
>>  else
>>  e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
>>
>>  if (!e) {
>>  printk(KERN_ERR
>>  "Default I/O scheduler not found. " \
>>  "Using noop/none.\n");
>>  e = elevator_get("noop", false);
>>  }
>>
>> So I guess this could happen if someone manually changed those Kconfig
>> options, but I don't see what other case would make this happen, could
>> you please explain?
> 
> Was wondering the same - is it using the 'elevator=' boot parameter?
> Didn't look at that path just now, but that's the only one I could
> think of. If it is, I'd much prefer only using 'chosen_elevator' for
> the non-mq stuff, and the fix should be just that instead.
> 
[ .. ]
While we're on the topic:

Can't we use the same names for legacy and mq scheduler?
It's quite an unnecessary complication to have
'noop', 'deadline', and 'cfq' for legacy, but 'none' and 'mq-deadline'
for mq. If we could use 'noop' and 'deadline' for mq, too, the existing
settings or udev rules will continue to work and we wouldn't get any
annoying and pointless warnings here...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
h...@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH V2 0/3] Define coherent device memory node

2017-02-13 Thread Anshuman Khandual
On 02/13/2017 09:04 PM, Vlastimil Babka wrote:
> On 02/10/2017 11:06 AM, Anshuman Khandual wrote:
>>  This three patches define CDM node with HugeTLB & Buddy allocation
>> isolation. Please refer to the last RFC posting mentioned here for details.
>> The series has been split for easier review process. The next part of the
>> work like VM flags, auto NUMA and KSM interactions with tagged VMAs will
>> follow later.
> 
> Hi,
> 
> I'm not sure if the splitting to smaller series and focusing on partial
> implementations is helpful at this point, until there's some consensus
> about the whole approach from a big picture perspective.

I have been trying to get that consensus through the RFCs on CDM, but
there was not enough feedback from the larger MM community. Hence I decided
to split up the series and ask for smaller chunks of code to be
reviewed and debated, thinking this would be a better approach. These
three patches are functionally complete in themselves. VMA flags,
auto NUMA and KSM are additional feature improvements on top of
this core set of patches.

RFC V2: https://lkml.org/lkml/2017/1/29/198  (zonelist and cpuset)
RFC V1: https://lkml.org/lkml/2016/10/24/19  (zonelist method)
RFC v2: https://lkml.org/lkml/2016/11/22/339 (cpuset method)

> 
> Note that it's also confusing that v1 of this partial patchset mentioned
> some alternative implementations, but only as git branches, and the
> discussion about their differences is linked elsewhere. That further
> makes meaningful review harder IMHO.

I had posted two alternate approaches, besides the GFP flag buddy method,
in my last RFC. There was not much discussion on them except for some
generic points about top cpuset characteristics. The currently posted
nodemask-based isolation method is the most minimal and least intrusive,
and needs very little code change without affecting much of the common MM
code IMHO. But yes, if required I can go ahead and post all the other
alternate methods on this thread, if looking into them helps with
comparison and review.

> 
> Going back to the bigger picture, I've read the comments on previous
> postings and I think Jerome makes many good points in this subthread [1]
> against the idea of representing the device memory as generic memory
> nodes and expecting userspace to mbind() to them. So if I make a program
> that uses mbind() to back some mmapped area with memory of "devices like
> accelerators, GPU cards, network cards, FPGA cards, PLD cards etc which
> might contain on board memory", then it will get such memory... and then
> what? How will it benefit from it? I will also need to tell some driver
> to make the device do some operations with this memory, right? And that
> most likely won't be a generic operation. In that case I can also ask
> the driver to give me that memory in the first place, and it can apply
> whatever policies are best for the device in question? And it's also the
> driver that can detect if the device memory is being wasted by a process
> that isn't currently performing the interesting operations, while
> another process that does them had to fallback its allocations to system
> memory and thus runs slower. I expect the NUMA balancing can't catch
> that for device memory (and you also disable it anyway?) So I don't
> really see how a generic solution would work, without having a full
> concrete example, and thus it's really hard to say that this approach is
> the right way to go and should be merged.

Okay, let me attempt to explain this.

* User space using mbind() to get CDM memory is an additional benefit
  we get by making the CDM plug in as a node and be part of the buddy
  allocator. But the overall idea from the user space point of view
  is that the application can allocate any generic buffer and use
  it either from the CPU side or from the device without
  knowing where the buffer is really mapped physically. That
  gives a seamless and transparent view to user space, where CPU
  compute and possible device-based compute can work together. This
  is not possible through a driver-allocated buffer.

* The buffer's memory can be placed in system memory
  when the CPU faults while accessing it. But a driver can manage the
  migration between system RAM and CDM memory once the buffer is being
  used from the CPU and the device interchangeably. As you have mentioned,
  the driver will have more information about which part of the buffer
  should be placed where at any point in time, and it can make that happen
  with migration. So both allocation and placement are decided by the
  driver at runtime. CDM provides the framework for this kind of
  device-assisted compute and driver-managed memory placement.

* If an application has CDM memory backing its buffer but has not used it
  for a long time, and another application is forced to fall back on system
  RAM when what it really wanted was CDM, the driver can detect this kind
  of situation through memory access patterns on the device HW and
  

[PATCH v4 3/4] dmaengine: Add Broadcom SBA RAID driver

2017-02-13 Thread Anup Patel
The Broadcom stream buffer accelerator (SBA) provides offloading
capabilities for RAID operations. This SBA offload engine is
accessible via Broadcom SoC specific ring manager.

This patch adds Broadcom SBA RAID driver which provides one
DMA device with RAID capabilities using one or more Broadcom
SoC specific ring manager channels. The SBA RAID driver in its
current shape implements memcpy, xor, and pq operations.

Signed-off-by: Anup Patel 
Reviewed-by: Ray Jui 
---
 drivers/dma/Kconfig        |   13 +
 drivers/dma/Makefile       |    1 +
 drivers/dma/bcm-sba-raid.c | 1694 
 3 files changed, 1708 insertions(+)
 create mode 100644 drivers/dma/bcm-sba-raid.c

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 263495d..bf8fb84 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -99,6 +99,19 @@ config AXI_DMAC
 	  controller is often used in Analog Device's reference designs for FPGA
 	  platforms.
 
+config BCM_SBA_RAID
+	tristate "Broadcom SBA RAID engine support"
+	depends on (ARM64 && MAILBOX && RAID6_PQ) || COMPILE_TEST
+	select DMA_ENGINE
+	select DMA_ENGINE_RAID
+	select ASYNC_TX_ENABLE_CHANNEL_SWITCH
+	default ARCH_BCM_IPROC
+	help
+	  Enable support for Broadcom SBA RAID Engine. The SBA RAID
+	  engine is available on most of the Broadcom iProc SoCs. It
+	  has the capability to offload memcpy, xor and pq computation
+	  for raid5/6.
+
 config COH901318
 	bool "ST-Ericsson COH901318 DMA support"
 	select DMA_ENGINE
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index a4fa336..ba96bdd 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_AMCC_PPC440SPE_ADMA) += ppc4xx/
 obj-$(CONFIG_AT_HDMAC) += at_hdmac.o
 obj-$(CONFIG_AT_XDMAC) += at_xdmac.o
 obj-$(CONFIG_AXI_DMAC) += dma-axi-dmac.o
+obj-$(CONFIG_BCM_SBA_RAID) += bcm-sba-raid.o
 obj-$(CONFIG_COH901318) += coh901318.o coh901318_lli.o
 obj-$(CONFIG_DMA_BCM2835) += bcm2835-dma.o
 obj-$(CONFIG_DMA_JZ4740) += dma-jz4740.o
diff --git a/drivers/dma/bcm-sba-raid.c b/drivers/dma/bcm-sba-raid.c
new file mode 100644
index 000..279e5e2
--- /dev/null
+++ b/drivers/dma/bcm-sba-raid.c
@@ -0,0 +1,1694 @@
+/*
+ * Copyright (C) 2017 Broadcom
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/*
+ * Broadcom SBA RAID Driver
+ *
+ * The Broadcom stream buffer accelerator (SBA) provides offloading
+ * capabilities for RAID operations. The SBA offload engine is accessible
+ * via the Broadcom SoC specific ring manager. Two or more offload engines
+ * can share the same Broadcom SoC specific ring manager. For this reason,
+ * the Broadcom SoC specific ring manager driver is implemented as a
+ * mailbox controller driver and the offload engine drivers are
+ * implemented as mailbox clients.
+ *
+ * Typically, the Broadcom SoC specific ring manager will implement a
+ * larger number of hardware rings over one or more SBA hardware devices.
+ * By design, the internal buffer size of an SBA hardware device is
+ * limited, but all offload operations supported by SBA can be broken
+ * down into multiple small requests and executed in parallel on
+ * multiple SBA hardware devices to achieve high throughput.
+ *
+ * The Broadcom SBA RAID driver does not require any register programming
+ * except submitting requests to the SBA hardware device via mailbox
+ * channels. This driver implements a DMA device with one DMA channel
+ * using a set of mailbox channels provided by the Broadcom SoC specific
+ * ring manager driver. To exploit parallelism (as described above), all
+ * DMA requests coming to the SBA RAID DMA channel are broken down into
+ * smaller requests and submitted to multiple mailbox channels in
+ * round-robin fashion. To have more SBA DMA channels, we can create more
+ * SBA device nodes in the Broadcom SoC specific DTS based on the number
+ * of hardware rings supported by the Broadcom SoC ring manager.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "dmaengine.h"
+
+/* SBA command related defines */
+#define SBA_TYPE_SHIFT 48
+#define SBA_TYPE_MASK  GENMASK(1, 0)
+#define SBA_TYPE_A 0x0
+#define SBA_TYPE_B 0x2
+#define SBA_TYPE_C 0x3
+#define SBA_USER_DEF_SHIFT 32
+#define SBA_USER_DEF_MASK  GENMASK(15, 0)
+#define SBA_R_MDATA_SHIFT  24
+#define SBA_R_MDATA_MASK   GENMASK(7, 0)
+#define SBA_C_MDATA_MS_SHIFT   18
+#define SBA_C_MDATA_MS_MASK   GENMASK(1, 0)
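
For illustration only, the request splitting described in the driver's
header comment can be sketched in plain userspace C as below. This is not
code from the patch: split_request(), struct chunk, and the req_size and
num_mchans parameters are hypothetical names, and the real driver builds
SBA commands and mailbox messages rather than simple offset/length pairs.

```c
#include <stddef.h>

/* One piece of a larger DMA request (hypothetical type, for illustration). */
struct chunk {
	size_t off;		/* byte offset within the original request */
	size_t len;		/* bytes covered by this piece */
	unsigned int mchan;	/* index of the mailbox channel it is sent on */
};

/*
 * Split a request of 'len' bytes into pieces of at most 'req_size' bytes
 * and hand them to 'num_mchans' mailbox channels in round-robin order.
 * '*next_mchan' carries the round-robin position across calls.
 * Returns the number of chunks written to 'out' (at most 'max').
 */
static size_t split_request(size_t len, size_t req_size,
			    unsigned int num_mchans,
			    unsigned int *next_mchan,
			    struct chunk *out, size_t max)
{
	size_t n = 0, off = 0;

	if (req_size == 0 || num_mchans == 0)
		return 0;

	while (len > 0 && n < max) {
		size_t this_len = len < req_size ? len : req_size;

		out[n].off = off;
		out[n].len = this_len;
		out[n].mchan = *next_mchan;

		/* Advance to the next mailbox channel, wrapping around. */
		*next_mchan = (*next_mchan + 1) % num_mchans;

		off += this_len;
		len -= this_len;
		n++;
	}
	return n;
}
```

With a 10000 byte request, a req_size of 4096 and 8 channels, this yields
three chunks of 4096, 4096 and 1808 bytes on channels 0, 1 and 2.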

[PATCH v4 4/4] dt-bindings: Add DT bindings document for Broadcom SBA RAID driver

2017-02-13 Thread Anup Patel
This patch adds the DT bindings document for newly added Broadcom
SBA RAID driver.

Signed-off-by: Anup Patel 
Reviewed-by: Ray Jui 
Reviewed-by: Scott Branden 
---
 .../devicetree/bindings/dma/brcm,iproc-sba.txt | 29 ++
 1 file changed, 29 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt

diff --git a/Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt b/Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt
new file mode 100644
index 000..092913a
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt
@@ -0,0 +1,29 @@
+* Broadcom SBA RAID engine
+
+Required properties:
+- compatible: Should be one of the following
+ "brcm,iproc-sba"
+ "brcm,iproc-sba-v2"
+  The "brcm,iproc-sba" has support for only 6 PQ coefficients
+  The "brcm,iproc-sba-v2" has support for only 30 PQ coefficients
+- mboxes: List of phandle and mailbox channel specifiers
+
+Example:
+
+raid_mbox: mbox@6740 {
+   ...
+   #mbox-cells = <3>;
+   ...
+};
+
+raid0 {
+   compatible = "brcm,iproc-sba-v2";
+   mboxes = <&raid_mbox 0 0x1 0x>,
+        <&raid_mbox 1 0x1 0x>,
+        <&raid_mbox 2 0x1 0x>,
+        <&raid_mbox 3 0x1 0x>,
+        <&raid_mbox 4 0x1 0x>,
+        <&raid_mbox 5 0x1 0x>,
+        <&raid_mbox 6 0x1 0x>,
+        <&raid_mbox 7 0x1 0x>;
+};
-- 
2.7.4



[PATCH v4 0/4] Broadcom SBA RAID support

2017-02-13 Thread Anup Patel
The Broadcom SBA RAID is a stream-based device which provides
RAID5/6 offload.

It requires a SoC specific ring manager (such as Broadcom FlexRM
ring manager) to provide ring-based programming interface. Due to
this, the Broadcom SBA RAID driver (mailbox client) implements
DMA device having one DMA channel using a set of mailbox channels
provided by Broadcom SoC specific ring manager driver (mailbox
controller).

The Broadcom SBA RAID hardware requires the PQ disk position instead
of the PQ disk coefficient. To address this, we have added the
raid6_gflog table, which helps the driver convert a PQ disk
coefficient to a PQ disk position.

This patchset is based on Linux-4.10-rc2 and depends on patchset
"[PATCH v4 0/2] Broadcom FlexRM ring manager support"

It is also available at sba-raid-v4 branch of
https://github.com/Broadcom/arm64-linux.git

Changes since v3:
 - Replaced SBA_ENC() with sba_cmd_enc() inline function
 - Use list_first_entry_or_null() wherever possible
 - Remove unwanted braces around loops wherever possible
 - Use lockdep_assert_held() where required

Changes since v2:
 - Dropped patch to handle DMA devices having support for fewer
   PQ coefficients in Linux Async Tx
 - Added work-around in bcm-sba-raid driver to handle unsupported
   PQ coefficients using multiple SBA requests

Changes since v1:
 - Dropped patch to add mbox_channel_device() API
 - Used GENMASK and BIT macros wherever possible in bcm-sba-raid driver
 - Replaced C_MDATA macros with static inline functions in
   bcm-sba-raid driver
 - Removed sba_alloc_chan_resources() callback in bcm-sba-raid driver
 - Used dev_err() instead of dev_info() wherever applicable
 - Removed call to sba_issue_pending() from sba_tx_submit() in
   bcm-sba-raid driver
 - Implemented SBA request chaining for handling (len > sba->req_size)
   in bcm-sba-raid driver
 - Implemented device_terminate_all() callback in bcm-sba-raid driver

Anup Patel (4):
  lib/raid6: Add log-of-2 table for RAID6 HW requiring disk position
  async_tx: Fix DMA_PREP_FENCE usage in do_async_gen_syndrome()
  dmaengine: Add Broadcom SBA RAID driver
  dt-bindings: Add DT bindings document for Broadcom SBA RAID driver

 .../devicetree/bindings/dma/brcm,iproc-sba.txt |   29 +
 crypto/async_tx/async_pq.c                     |    5 +-
 drivers/dma/Kconfig                            |   13 +
 drivers/dma/Makefile                           |    1 +
 drivers/dma/bcm-sba-raid.c                     | 1694 
 include/linux/raid/pq.h                        |    1 +
 lib/raid6/mktables.c                           |   20 +
 7 files changed, 1760 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt
 create mode 100644 drivers/dma/bcm-sba-raid.c

-- 
2.7.4



[PATCH v4 2/4] async_tx: Fix DMA_PREP_FENCE usage in do_async_gen_syndrome()

2017-02-13 Thread Anup Patel
DMA_PREP_FENCE is to be used when preparing a Tx descriptor if the output
of that descriptor is to be used by the next/dependent Tx descriptor.

DMA_PREP_FENCE will not be set correctly in do_async_gen_syndrome()
when calling dma->device_prep_dma_pq() under the following conditions:
1. ASYNC_TX_FENCE not set in submit->flags
2. DMA_PREP_FENCE not set in dma_flags
3. src_cnt (= (disks - 2)) is greater than dma_maxpq(dma, dma_flags)

This patch fixes DMA_PREP_FENCE usage in do_async_gen_syndrome() taking
inspiration from do_async_xor() implementation.

Signed-off-by: Anup Patel 
Reviewed-by: Ray Jui 
Reviewed-by: Scott Branden 
---
 crypto/async_tx/async_pq.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
index f83de99..56bd612 100644
--- a/crypto/async_tx/async_pq.c
+++ b/crypto/async_tx/async_pq.c
@@ -62,9 +62,6 @@ do_async_gen_syndrome(struct dma_chan *chan,
 	dma_addr_t dma_dest[2];
 	int src_off = 0;
 
-	if (submit->flags & ASYNC_TX_FENCE)
-		dma_flags |= DMA_PREP_FENCE;
-
 	while (src_cnt > 0) {
 		submit->flags = flags_orig;
 		pq_src_cnt = min(src_cnt, dma_maxpq(dma, dma_flags));
@@ -83,6 +80,8 @@ do_async_gen_syndrome(struct dma_chan *chan,
 			if (cb_fn_orig)
 				dma_flags |= DMA_PREP_INTERRUPT;
 		}
+		if (submit->flags & ASYNC_TX_FENCE)
+			dma_flags |= DMA_PREP_FENCE;
 
 		/* Drivers force forward progress in case they can not provide
 		 * a descriptor
-- 
2.7.4
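
To make the effect of moving the fence check concrete, here is a small
userspace model in C of the chunking loop. It is only a sketch under
stated assumptions: the flag values are made up for the example (they are
not the kernel's definitions), plan_pq_chunks() is a hypothetical helper,
and everything except the fence decision is left out. It assumes, as in
the loop touched by the patch, that ASYNC_TX_FENCE is added to the
per-chunk submit flags whenever more source chunks follow.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; NOT the kernel's definitions. */
#define ASYNC_TX_FENCE	(1u << 0)
#define DMA_PREP_FENCE	(1u << 1)

/*
 * Record the dma_flags that each chunk's descriptor would be prepared
 * with. 'fence_check_in_loop' selects where the ASYNC_TX_FENCE test is
 * done: false models the placement before the loop (pre-patch), true
 * models the placement inside the loop (post-patch).
 * Returns the number of chunks recorded in 'out' (at most 'max_out').
 */
static size_t plan_pq_chunks(int src_cnt, int max_pq,
			     unsigned int submit_flags,
			     bool fence_check_in_loop,
			     unsigned int *out, size_t max_out)
{
	/* Carried across iterations, like a variable declared outside the loop. */
	unsigned int dma_flags = 0;
	size_t n = 0;

	/* Pre-patch: decided once, before it is known whether chunking occurs. */
	if (!fence_check_in_loop && (submit_flags & ASYNC_TX_FENCE))
		dma_flags |= DMA_PREP_FENCE;

	while (src_cnt > 0 && n < max_out) {
		unsigned int flags = submit_flags;	/* submit->flags = flags_orig */
		int pq_src_cnt = src_cnt < max_pq ? src_cnt : max_pq;

		/* More sources remain, so a dependent descriptor will follow. */
		if (src_cnt > pq_src_cnt)
			flags |= ASYNC_TX_FENCE;

		/* Post-patch: the test sees the flag that was just added. */
		if (fence_check_in_loop && (flags & ASYNC_TX_FENCE))
			dma_flags |= DMA_PREP_FENCE;

		out[n++] = dma_flags;
		src_cnt -= pq_src_cnt;
	}
	return n;
}
```

With 5 sources, a limit of 2 sources per descriptor and no fence requested
by the caller, the pre-patch placement gives no chunk DMA_PREP_FENCE, while
the post-patch placement sets it on the chunks that have a dependent chunk
after them.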



[PATCH v4 1/4] lib/raid6: Add log-of-2 table for RAID6 HW requiring disk position

2017-02-13 Thread Anup Patel
The raid6_gfexp table represents {2}^n values for 0 <= n < 256. The
Linux async_tx framework passes values from raid6_gfexp as coefficients
for each source to the prep_dma_pq() callback of a DMA channel with PQ
capability. This creates a problem for RAID6 offload engines (such as
Broadcom SBA) which take the disk position (i.e. the log to base {2})
instead of multiplicative coefficients from the raid6_gfexp table.

This patch adds the raid6_gflog table, which holds the log-of-2 value
for any given x such that 0 <= x < 256. For any given disk coefficient
x, the corresponding disk position is given by raid6_gflog[x]. The RAID6
offload engine driver can use this newly added raid6_gflog table to
get the disk position from the multiplicative coefficient.

Signed-off-by: Anup Patel 
Reviewed-by: Scott Branden 
Reviewed-by: Ray Jui 
---
 include/linux/raid/pq.h |  1 +
 lib/raid6/mktables.c    | 20 
 2 files changed, 21 insertions(+)

diff --git a/include/linux/raid/pq.h b/include/linux/raid/pq.h
index 4d57bba..30f9453 100644
--- a/include/linux/raid/pq.h
+++ b/include/linux/raid/pq.h
@@ -142,6 +142,7 @@ int raid6_select_algo(void);
 extern const u8 raid6_gfmul[256][256] __attribute__((aligned(256)));
 extern const u8 raid6_vgfmul[256][32] __attribute__((aligned(256)));
 extern const u8 raid6_gfexp[256]  __attribute__((aligned(256)));
+extern const u8 raid6_gflog[256]  __attribute__((aligned(256)));
 extern const u8 raid6_gfinv[256]  __attribute__((aligned(256)));
 extern const u8 raid6_gfexi[256]  __attribute__((aligned(256)));
 
diff --git a/lib/raid6/mktables.c b/lib/raid6/mktables.c
index 39787db..e824d08 100644
--- a/lib/raid6/mktables.c
+++ b/lib/raid6/mktables.c
@@ -125,6 +125,26 @@ int main(int argc, char *argv[])
 	printf("EXPORT_SYMBOL(raid6_gfexp);\n");
 	printf("#endif\n");
 
+	/* Compute log-of-2 table */
+	printf("\nconst u8 __attribute__((aligned(256)))\n"
+	       "raid6_gflog[256] =\n" "{\n");
+	for (i = 0; i < 256; i += 8) {
+		printf("\t");
+		for (j = 0; j < 8; j++) {
+			v = 255;
+			for (k = 0; k < 256; k++)
+				if (exptbl[k] == (i + j)) {
+					v = k;
+					break;
+				}
+			printf("0x%02x,%c", v, (j == 7) ? '\n' : ' ');
+		}
+	}
+	printf("};\n");
+	printf("#ifdef __KERNEL__\n");
+	printf("EXPORT_SYMBOL(raid6_gflog);\n");
+	printf("#endif\n");
+
 	/* Compute inverse table x^-1 == x^254 */
 	printf("\nconst u8 __attribute__((aligned(256)))\n"
 	       "raid6_gfinv[256] =\n" "{\n");
-- 
2.7.4
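
As an illustration of the coefficient-to-position mapping described in
this patch, here is a minimal userspace sketch in C. It assumes the usual
RAID-6 field GF(2^8) with generator {02} and reducing polynomial 0x11d.
The helper names gf_mul2(), build_tables() and coef_to_pos() are
hypothetical and are not part of the patch or of lib/raid6; the tables
are computed locally rather than taken from the kernel.

```c
#include <stdint.h>

/* Multiply by {02} in GF(2^8), reducing by the RAID-6 polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t a)
{
	return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
}

/*
 * Fill gfexp[n] = {02}^n and gflog[x] = smallest n with {02}^n == x.
 * gflog[0] is set to 255 because zero has no logarithm; the same
 * default is used by the search loop in the patch.
 */
static void build_tables(uint8_t gfexp[256], uint8_t gflog[256])
{
	uint8_t v = 1;
	int n, x, k;

	for (n = 0; n < 256; n++) {
		gfexp[n] = v;
		v = gf_mul2(v);
	}

	for (x = 0; x < 256; x++) {
		gflog[x] = 255;
		for (k = 0; k < 256; k++) {
			if (gfexp[k] == x) {
				gflog[x] = (uint8_t)k;
				break;
			}
		}
	}
}

/* Convert a multiplicative PQ coefficient into a disk position. */
static uint8_t coef_to_pos(const uint8_t gflog[256], uint8_t coef)
{
	return gflog[coef];
}
```

For example, the coefficient passed for the source at position 8 is
{02}^8 = 0x1d, and coef_to_pos() maps 0x1d back to 8.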



Re: [PATCH v2 1/9] llist: Provide a safe version for llist_for_each

2017-02-13 Thread Byungchul Park
On Mon, Feb 13, 2017 at 04:58:05PM +0900, Byungchul Park wrote:
> On Mon, Feb 13, 2017 at 03:52:44PM +0800, Huang, Ying wrote:
> > Byungchul Park  writes:
> > 
> > > On Mon, Feb 13, 2017 at 03:36:33PM +0800, Huang, Ying wrote:
> > >> Byungchul Park  writes:
> > >> 
> > >> > Sometimes we have to dereference next field of llist node before entering
> > >> > loop because the node might be deleted or the next field might be
> > >> > modified within the loop. So this adds the safe version of llist_for_each,
> > >> > that is, llist_for_each_safe.
> > >> >
> > >> > Signed-off-by: Byungchul Park 
> > >> > ---
> > >> >  include/linux/llist.h | 19 +++
> > >> >  1 file changed, 19 insertions(+)
> > >> >
> > >> > diff --git a/include/linux/llist.h b/include/linux/llist.h
> > >> > index fd4ca0b..4c508a5 100644
> > >> > --- a/include/linux/llist.h
> > >> > +++ b/include/linux/llist.h
> > >> > @@ -105,6 +105,25 @@ static inline void init_llist_head(struct llist_head *list)
> > >> >for ((pos) = (node); pos; (pos) = (pos)->next)
> > >> >  
> > >> >  /**
> > >> > + * llist_for_each_safe - iterate over some deleted entries of a lock-less list
> > >> > + *                       safe against removal of list entry
> > >> > + * @pos:  the &struct llist_node to use as a loop cursor
> > >> > + * @n:    another type * to use as temporary storage
> > >> 
> > >> s/type */&struct llist_node/
> > >
> > > Yes.
> > >
> > >> 
> > >> > + * @node: the first entry of deleted list entries
> > >> > + *
> > >> > + * In general, some entries of the lock-less list can be traversed
> > >> > + * safely only after being deleted from list, so start with an entry
> > >> > + * instead of list head.
> > >> > + *
> > >> > + * If being used on entries deleted from lock-less list directly, the
> > >> > + * traverse order is from the newest to the oldest added entry.  If
> > >> > + * you want to traverse from the oldest to the newest, you must
> > >> > + * reverse the order by yourself before traversing.
> > >> > + */
> > >> > +#define llist_for_each_safe(pos, n, node) \
> > >> > +  for ((pos) = (node); (pos) && ((n) = (pos)->next, true); (pos) = (n))
> > >> > +
> > >> 
> > >> Following the style of other xxx_for_each_safe,
> > >> 
> > >> #define llist_for_each_safe(pos, n, node)\
> > >>  for (pos = (node), (pos && (n = pos->next)); pos; pos = n, n = pos->next)
> > >
> > > Do you think it should be modified? I think mine is simpler. No?
> > 
> > Personally I prefer the style of other xxx_for_each_safe().
> 
> Yes, I will modify it as you recommend.
> 
> Thank you very much.

I wanted to modify it as you recommended, but it has a bug: once pos
becomes NULL at the end of the list, the increment expression still
evaluates pos->next. To fix the bug, it should be:

   for (pos = (node), (pos && (n = pos->next)); pos; pos = n, (pos && \
   (n = pos->next)))

Don't you think this is too messy? Or am I missing something? I still think
the following is neater and simpler.

   for (pos = node; pos && (n = pos->next, true); pos = n)

Or could you recommend another approach?
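
For reference, a minimal userspace sketch of the comma-operator variant from
the patch. The macro body is copied from the patch; the stand-in
struct llist_node, struct item, push_item() and drain_and_free() are
hypothetical names used only for this illustration, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct llist_node (assumption). */
struct llist_node {
	struct llist_node *next;
};

/* Macro body as proposed in the patch: n is loaded before the loop body runs. */
#define llist_for_each_safe(pos, n, node) \
	for ((pos) = (node); (pos) && ((n) = (pos)->next, true); (pos) = (n))

/* Hypothetical entry type; node is the first member, so a cast recovers the item. */
struct item {
	struct llist_node node;
	int value;
};

/* Push a new item in front of head and return the new first node. */
static struct llist_node *push_item(struct llist_node *head, int value)
{
	struct item *it = malloc(sizeof(*it));

	assert(it != NULL);	/* keep the sketch short: abort on allocation failure */
	it->value = value;
	it->node.next = head;
	return &it->node;
}

/* Free every entry while iterating; returns the sum of the values seen. */
static int drain_and_free(struct llist_node *first)
{
	struct llist_node *pos, *n;
	int sum = 0;

	llist_for_each_safe(pos, n, first) {
		struct item *it = (struct item *)pos;

		sum += it->value;
		free(it);	/* safe: the next pointer is already held in n */
	}
	return sum;
}
```

Because the condition is only evaluated while pos is non-NULL, pos->next is
never read through a NULL pointer, and an empty list (first == NULL) skips
the body entirely.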



[tip:perf/core] perf tests record: No need to test an array against NULL

2017-02-13 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  9ef6839bcce7ca944c1ace4a7247cf13ca92a28f
Gitweb: http://git.kernel.org/tip/9ef6839bcce7ca944c1ace4a7247cf13ca92a28f
Author: Arnaldo Carvalho de Melo 
AuthorDate: Mon, 13 Feb 2017 17:04:05 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 17:22:34 -0300

perf tests record: No need to test an array against NULL

The comparison will always evaluate to 'false', as clang warns:

CC   /tmp/build/perf/tests/perf-record.o
CC   /tmp/build/perf/tests/evsel-roundtrip-name.o
  tests/perf-record.c:69:24: error: comparison of array 'argv' equal to a null pointer is always false [-Werror,-Wtautological-pointer-compare]
  if (evlist == NULL || argv == NULL) {
                        ^~~~
  1 error generated.

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-o4977g6p9b3peak9ct6ef...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/perf-record.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
index 8f2e1de..541da7a 100644
--- a/tools/perf/tests/perf-record.c
+++ b/tools/perf/tests/perf-record.c
@@ -66,7 +66,7 @@ int test__PERF_RECORD(int subtest __maybe_unused)
 	if (evlist == NULL) /* Fallback for kernels lacking PERF_COUNT_SW_DUMMY */
 		evlist = perf_evlist__new_default();
 
-	if (evlist == NULL || argv == NULL) {
+	if (evlist == NULL) {
 		pr_debug("Not enough memory to create evlist\n");
 		goto out;
 	}
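
The reason the check carried no information can be sketched in isolation.
In the example below, is_null() and inputs_missing() are hypothetical
helpers, and the contents of the local argv array are made up; only the
shape (a local array next to a pointer that may be NULL) follows the test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reports whether a pointer is NULL. */
static int is_null(const void *p)
{
	return p == NULL;
}

/* Hypothetical stand-in for the check: evlist may be NULL, argv cannot. */
static int inputs_missing(const void *evlist)
{
	const char *argv[] = { "sleep", "1", NULL };

	/* argv names storage in this stack frame, so it decays to a non-NULL address. */
	assert(!is_null(argv));

	/* Only the pointer can actually be missing. */
	return is_null(evlist);
}
```

Routing the array through a helper is only there to keep the sketch free of
the very warning quoted above; writing the comparison directly is what clang
rejects.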


[tip:perf/core] perf symbols: dso->name is an array, no need to check it against NULL

2017-02-13 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  5eae7d842510d33d4410c062280eda6c935403dd
Gitweb: http://git.kernel.org/tip/5eae7d842510d33d4410c062280eda6c935403dd
Author: Arnaldo Carvalho de Melo 
AuthorDate: Mon, 13 Feb 2017 17:11:03 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 17:22:35 -0300

perf symbols: dso->name is an array, no need to check it against NULL

As it will always evaluate to 'true', as reported by clang:

  util/map.c:390:36: error: address of array 'map->dso->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
  if (map && map->dso && (map->dso->name || map->dso->long_name)) {
                          ~~~~~~~~~~^~~~ ~~
  util/map.c:393:22: error: address of array 'map->dso->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
  else if (map->dso->name)
       ~~  ~~~~~~~~~~^~~~

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-x8cu007cly40kfp8xnpi9...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/map.c| 4 ++--
 tools/perf/util/scripting-engines/trace-event-perl.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 4f9a71c..0a943e7 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -387,10 +387,10 @@ size_t map__fprintf_dsoname(struct map *map, FILE *fp)
 {
 	const char *dsoname = "[unknown]";
 
-	if (map && map->dso && (map->dso->name || map->dso->long_name)) {
+	if (map && map->dso) {
 		if (symbol_conf.show_kernel_path && map->dso->long_name)
 			dsoname = map->dso->long_name;
-		else if (map->dso->name)
+		else
 			dsoname = map->dso->name;
 	}
 
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 014ecd6..c1555fd 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -309,10 +309,10 @@ static SV *perl_process_callchain(struct perf_sample *sample,
 		if (node->map) {
 			struct map *map = node->map;
 			const char *dsoname = "[unknown]";
-			if (map && map->dso && (map->dso->name || map->dso->long_name)) {
+			if (map && map->dso) {
 				if (symbol_conf.show_kernel_path && map->dso->long_name)
 					dsoname = map->dso->long_name;
-				else if (map->dso->name)
+				else
 					dsoname = map->dso->name;
 			}
 			if (!hv_stores(elem, "dso", newSVpv(dsoname,0))) {
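
The selection logic after this change can be sketched as a standalone
function. The struct dso_sketch and struct map_sketch types, the 32-byte
array size and pick_dsoname() are hypothetical and only mirror the relevant
members (a pointer long_name and an array name); they are not the perf
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced types: only the members the check touches. */
struct dso_sketch {
	const char *long_name;	/* pointer member: may legitimately be NULL */
	char name[32];		/* array member: its address is never NULL */
};

struct map_sketch {
	struct dso_sketch *dso;
};

/* Same branch structure as the patched code. */
static const char *pick_dsoname(const struct map_sketch *map, bool show_kernel_path)
{
	const char *dsoname = "[unknown]";

	if (map && map->dso) {
		if (show_kernel_path && map->dso->long_name)
			dsoname = map->dso->long_name;
		else
			dsoname = map->dso->name;
	}
	return dsoname;
}

/* Usage example covering the three outcomes. */
static void pick_dsoname_example(void)
{
	struct dso_sketch dso = { .long_name = "/lib/modules/example.ko", .name = "[example]" };
	struct map_sketch map = { .dso = &dso };

	assert(pick_dsoname(&map, true) == dso.long_name);
	assert(pick_dsoname(&map, false) == dso.name);
	assert(strcmp(pick_dsoname(NULL, false), "[unknown]") == 0);
}
```

Only long_name keeps its NULL test, since it is the one member that is a
pointer; the name array falls through to the plain else branch.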


[tip:perf/core] perf evsel: Inform how to make a sysctl setting permanent

2017-02-13 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  d6195a6a2c247515d5832debb51c03a74dc3f8f6
Gitweb: http://git.kernel.org/tip/d6195a6a2c247515d5832debb51c03a74dc3f8f6
Author: Arnaldo Carvalho de Melo 
AuthorDate: Mon, 13 Feb 2017 16:45:24 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 17:22:33 -0300

perf evsel: Inform how to make a sysctl setting permanent

When a tool can't open counters due to the kernel.perf_event_paranoid
sysctl setting, we tell the user how to tweak it to allow the operation
to succeed. In addition to that, suggest editing /etc/sysctl.conf to
make the setting permanent.

Suggested-by: Ingo Molnar 
Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-4gwe99k4a6p12d4u8bbyt...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/builtin-stat.c   | 2 +-
 tools/perf/builtin-top.c| 2 +-
 tools/perf/util/evsel.c | 4 +++-
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index ffac8ca..2ddf189 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -418,7 +418,7 @@ static int record__mmap(struct record *rec)
 
 static int record__open(struct record *rec)
 {
-	char msg[512];
+	char msg[BUFSIZ];
 	struct perf_evsel *pos;
 	struct perf_evlist *evlist = rec->evlist;
 	struct perf_session *session = rec->session;
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a02f2e9..f287191 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -533,7 +533,7 @@ static int store_counter_ids(struct perf_evsel *counter)
 static int __run_perf_stat(int argc, const char **argv)
 {
 	int interval = stat_config.interval;
-	char msg[512];
+	char msg[BUFSIZ];
 	unsigned long long t0, t1;
 	struct perf_evsel *counter;
 	struct timespec ts;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d90927f..5a7fd7a 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -859,7 +859,7 @@ static void perf_top__mmap_read(struct perf_top *top)
 
 static int perf_top__start_counters(struct perf_top *top)
 {
-	char msg[512];
+	char msg[BUFSIZ];
 	struct perf_evsel *counter;
 	struct perf_evlist *evlist = top->evlist;
 	struct record_opts *opts = &top->record_opts;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 04e536a..cd2fb42 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2469,7 +2469,9 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 		 "  -1: Allow use of (almost) all events by all users\n"
 		 ">= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK\n"
 		 ">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n"
-		 ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN",
+		 ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n"
+		 "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
+		 "  kernel.perf_event_paranoid = -1\n" ,
 		 target->system_wide ? "system-wide " : "",
 		 perf_event_paranoid());
 	case ENOENT:
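
The patch also grows the msg buffers from 512 bytes to BUFSIZ. The commit
message does not say why, but a longer hint needs more room. A minimal
sketch of formatting such a hint with a truncation check follows;
format_paranoid_hint() and the first line of its text are hypothetical and
not part of perf, only the sysctl.conf lines are taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not the perf implementation: formats a hint similar
 * to the one added by the patch into a caller-supplied buffer.
 * Returns the number of characters written, or -1 if the message did not fit.
 */
static int format_paranoid_hint(char *msg, size_t size, int level)
{
	int len;

	assert(msg != NULL && size > 0);
	len = snprintf(msg, size,
		       "The current value of kernel.perf_event_paranoid is %d.\n\n"
		       "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
		       "  kernel.perf_event_paranoid = -1\n",
		       level);
	if (len < 0 || (size_t)len >= size)
		return -1;	/* output error or truncated message */
	return len;
}
```

A caller would declare the buffer as char msg[BUFSIZ], as the patched
functions do, and pass sizeof(msg) as the size.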


[tip:perf/core] perf symbols: No need to check if sym->name is NULL

2017-02-13 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  a7c3899c06865c75f8887f33d9043f6e8e780e71
Gitweb: http://git.kernel.org/tip/a7c3899c06865c75f8887f33d9043f6e8e780e71
Author: Arnaldo Carvalho de Melo 
AuthorDate: Mon, 13 Feb 2017 16:52:15 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 17:22:34 -0300

perf symbols: No need to check if sym->name is NULL

As it is an array, it will always evaluate to 'true', as reported by
clang:

  builtin-sched.c:2070:19: error: address of array 'sym->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
  if (sym && sym->name) {
          ~~ ~~~~~^~~~
  1 warning generated.

So just ditch all those useless checks.

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-ydpm927col06paixb775j...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-kmem.c| 4 ++--
 tools/perf/builtin-sched.c   | 2 +-
 tools/perf/util/evsel_fprintf.c  | 1 -
 tools/perf/util/machine.c| 2 +-
 tools/perf/util/symbol_fprintf.c | 2 +-
 5 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 29f4751..6da8d08 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1065,7 +1065,7 @@ static void __print_page_alloc_result(struct perf_session *session, int n_lines)
 
 		data = rb_entry(next, struct page_stat, node);
 		sym = machine__find_kernel_function(machine, data->callsite, &map);
-		if (sym && sym->name)
+		if (sym)
 			caller = sym->name;
 		else
 			scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);
@@ -1107,7 +1107,7 @@ static void __print_page_caller_result(struct perf_session *session, int n_lines
 
 		data = rb_entry(next, struct page_stat, node);
 		sym = machine__find_kernel_function(machine, data->callsite, &map);
-		if (sym && sym->name)
+		if (sym)
 			caller = sym->name;
 		else
 			scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index daceb32..270eb2d 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2067,7 +2067,7 @@ static void save_task_callchain(struct perf_sched *sched,
 			break;
 
 		sym = node->sym;
-		if (sym && sym->name) {
+		if (sym) {
 			if (!strcmp(sym->name, "schedule") ||
 			    !strcmp(sym->name, "__schedule") ||
 			    !strcmp(sym->name, "preempt_schedule"))
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index 6b29255..4ef5184 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -168,7 +168,6 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 
 			if (symbol_conf.bt_stop_list &&
 			    node->sym &&
-			    node->sym->name &&
 			    strlist__has_entry(symbol_conf.bt_stop_list,
 					       node->sym->name)) {
 				break;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 747a034..a1043cf 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1565,7 +1565,7 @@ int machine__process_event(struct machine *machine, union perf_event *event,
 
 static bool symbol__match_regex(struct symbol *sym, regex_t *regex)
 {
-	if (sym->name && !regexec(regex, sym->name, 0, NULL, 0))
+	if (!regexec(regex, sym->name, 0, NULL, 0))
 		return 1;
 	return 0;
 }
diff --git a/tools/perf/util/symbol_fprintf.c b/tools/perf/util/symbol_fprintf.c
index 7c6b33e..63694e1 100644
--- a/tools/perf/util/symbol_fprintf.c
+++ b/tools/perf/util/symbol_fprintf.c
@@ -21,7 +21,7 @@ size_t __symbol__fprintf_symname_offs(const struct symbol *sym,
 	unsigned long offset;
 	size_t length;
 
-	if (sym && sym->name) {
+	if (sym) {
 		length = fprintf(fp, "%s", sym->name);
 		if (al && print_offsets) {
 			if (al->addr < sym->end)


Re: linux-next: build failure after merge of the net tree

2017-02-13 Thread Ingo Molnar

* Stephen Rothwell  wrote:

> Hi all,
> 
> After merging the net tree, today's linux-next build (powerpc64le perf)
> failed like this:
> 
> Warning: tools/include/uapi/linux/bpf.h differs from kernel
> bpf.c: In function 'bpf_prog_attach':
> bpf.c:180:6: error: 'union bpf_attr' has no member named 'attach_flags'; did you mean 'map_flags'?
>   attr.attach_flags  = flags;
>   ^
> 
> Caused by commit
> 
>   7f677633379b ("bpf: introduce BPF_F_ALLOW_OVERRIDE flag")
> 
> Unfortunately, the perf header files are kept separate from the kernel
> header files proper and are not automatically copied over :-(

No, that's wrong: the problem is not that headers were not shared. The problem
is that a tooling interdependency was not properly tested *and* that the
dependency was not properly implemented in the build system either.

Note that we had similar build breakages when include headers _were_ shared as
well, so sharing the headers would only have worked around this particular bug
and would have introduced fragility in other places...

The best, most robust solution in this particular case would be to fix the
(tooling) build system to express the dependency; that would have shown the
build failure right when the modification was done.

Thanks,

Ingo


[tip:perf/core] tools lib traceevent plugin function: Initialize 'index' variable

2017-02-13 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  e8c6f437fd12d39e462962eaed2315bac597d34c
Gitweb: http://git.kernel.org/tip/e8c6f437fd12d39e462962eaed2315bac597d34c
Author: Arnaldo Carvalho de Melo 
AuthorDate: Mon, 13 Feb 2017 13:33:57 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 17:22:33 -0300

tools lib traceevent plugin function: Initialize 'index' variable

Detected with clang:

CC   /tmp/build/perf/plugin_function.o
  plugin_function.c:145:6: warning: variable 'index' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
  if (parent && ftrace_indent->set)
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
  plugin_function.c:148:29: note: uninitialized use occurs here
  trace_seq_printf(s, "%*s", index*3, "");
                              ^
  plugin_function.c:145:2: note: remove the 'if' if its condition is always true
  if (parent && ftrace_indent->set)
  ^
  plugin_function.c:145:6: warning: variable 'index' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
  if (parent && ftrace_indent->set)
      ^~~~~~
  plugin_function.c:148:29: note: uninitialized use occurs here
  trace_seq_printf(s, "%*s", index*3, "");
                              ^
  plugin_function.c:145:6: note: remove the '&&' if its condition is always true
  if (parent && ftrace_indent->set)
      ^
  plugin_function.c:133:11: note: initialize the variable 'index' to silence this warning
  int index;
           ^
            = 0
  2 warnings generated.

Reviewed-by: Steven Rostedt (VMware) 
Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-b5wyjocel55gorl2jq2cb...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/plugin_function.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/plugin_function.c b/tools/lib/traceevent/plugin_function.c
index a00ec19..42dbf73 100644
--- a/tools/lib/traceevent/plugin_function.c
+++ b/tools/lib/traceevent/plugin_function.c
@@ -130,7 +130,7 @@ static int function_handler(struct trace_seq *s, struct pevent_record *record,
 	unsigned long long pfunction;
 	const char *func;
 	const char *parent;
-	int index;
+	int index = 0;
 
 	if (pevent_get_field_val(s, event, "ip", record, &function, 1))
 		return trace_seq_putc(s, '!');
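
The pattern clang complains about can be sketched on its own: a variable
assigned on only one branch and then read unconditionally. In the sketch
below, indent_columns() and its parameters are hypothetical stand-ins for
the handler's logic; only the "index * 3" width follows the code quoted in
the warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the indentation logic in the handler. */
static int indent_columns(const char *parent, int indent_enabled, int depth)
{
	int index = 0;	/* without this initializer, the skipped branch leaves index indeterminate */

	assert(depth >= 0);
	if (parent && indent_enabled)
		index = depth;	/* the only assignment, taken on one path only */

	return index * 3;	/* read on every path, like the "%*s" width argument */
}
```

With the initializer in place, the paths that skip the assignment yield an
indentation of zero columns instead of reading an indeterminate value.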


[tip:perf/core] tools lib traceevent: Initialize lenght on OLD_RING_BUFFER_TYPE_TIME_STAMP

2017-02-13 Thread tip-bot for Steven Rostedt (VMware)
Commit-ID:  14e4d7e0abfdefabea2b8796c5a8b2b9c77b5326
Gitweb: http://git.kernel.org/tip/14e4d7e0abfdefabea2b8796c5a8b2b9c77b5326
Author: Steven Rostedt (VMware) 
AuthorDate: Mon, 13 Feb 2017 12:11:44 -0500
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 17:22:32 -0300

tools lib traceevent: Initialize lenght on OLD_RING_BUFFER_TYPE_TIME_STAMP

An undefined value was being used for the OLD_RING_BUFFER_TYPE_TIME_STAMP
case entry, as the 'length' variable was not being initialized, fix it.

Caught by the reporter when building tools/perf/ using clang, which emitted
this warning:

  kbuffer-parse.c:312:7: warning: variable 'length' is used uninitialized 
whenever switch case is taken [-Wsometimes-uninitialized]
  case OLD_RINGBUF_TYPE_TIME_EXTEND:
   ^~~~
  kbuffer-parse.c:339:29: note: uninitialized use occurs here
  kbuf->next = kbuf->index + length;
   ^~
  kbuffer-parse.c:297:21: note: initialize the variable 'length' to silence 
this warning
  unsigned int length;
 ^
  = 0

Reported-by: Arnaldo Carvalho de Melo 
Signed-off-by: Steven Rostedt 
Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/r/20170213121418.47f27...@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/kbuffer-parse.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/traceevent/kbuffer-parse.c 
b/tools/lib/traceevent/kbuffer-parse.c
index 65984f1..c94e364 100644
--- a/tools/lib/traceevent/kbuffer-parse.c
+++ b/tools/lib/traceevent/kbuffer-parse.c
@@ -315,6 +315,7 @@ static unsigned int old_update_pointers(struct kbuffer 
*kbuf)
extend += delta;
delta = extend;
ptr += 4;
+   length = 0;
break;
 
case OLD_RINGBUF_TYPE_TIME_STAMP:
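
The same class of bug can be sketched in isolation. The enum values and
next_offset() below are hypothetical and do not reproduce the kbuffer
parser; they only show a switch in which one case used to reach the common
tail without assigning 'length'.

```c
#include <assert.h>

/* Hypothetical record types, for illustration only. */
enum rec_type { REC_TIME_EXTEND, REC_PADDING, REC_DATA };

/*
 * Every case must leave 'length' defined, because the statement after the
 * switch reads it unconditionally.
 */
static unsigned int next_offset(enum rec_type type, unsigned int index,
				unsigned int payload)
{
	unsigned int length;

	assert(payload < 1024);

	switch (type) {
	case REC_TIME_EXTEND:
		length = 0;	/* the kind of assignment the patch adds */
		break;
	case REC_PADDING:
	case REC_DATA:
	default:
		length = payload;
		break;
	}

	return index + length;	/* analogous to kbuf->next = kbuf->index + length */
}
```

Assigning in the case itself, rather than at the declaration, keeps the
compiler able to warn if a future case forgets to set the variable.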


[tip:perf/core] perf diff: Change default setting to "delta-abs"

2017-02-13 Thread tip-bot for Namhyung Kim
Commit-ID:  be57b3fd218ad4a19725ac4bd53e67b2ede42a9d
Gitweb: http://git.kernel.org/tip/be57b3fd218ad4a19725ac4bd53e67b2ede42a9d
Author: Namhyung Kim 
AuthorDate: Sat, 11 Feb 2017 01:18:56 +0900
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 14:29:38 -0300

perf diff: Change default setting to "delta-abs"

The "delta-abs" compute method shows the most changed entries on top, so
users can easily see how much the two data sets differ.  Note that this
also changes the default of the -o option to 1 so that the compute method
is applied.  To see the original-style output (sorted by baseline), use the
-o 0 option.

Signed-off-by: Namhyung Kim 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20170210161856.18422-1-namhy...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-diff.txt | 4 ++--
 tools/perf/builtin-diff.c  | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-diff.txt 
b/tools/perf/Documentation/perf-diff.txt
index 7391299..66dbe3de 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -87,7 +87,7 @@ OPTIONS
 -c::
 --compute::
 Differential computation selection - delta, ratio, wdiff, delta-abs
-(default is delta).  Default can be changed using diff.compute
+(default is delta-abs).  Default can be changed using diff.compute
 config option.  See COMPARISON METHODS section for more info.
 
 -p::
@@ -101,7 +101,7 @@ OPTIONS
 -o::
 --order::
Specify compute sorting column number.  0 means sorting by baseline
-   overhead (default) and 1 means sorting by computed value of column 1
+   overhead and 1 (default) means sorting by computed value of column 1
(data from the first file other base baseline).  Values more than 1
can be used only if enough data files are provided.
The default value can be set using the diff.order config option.
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index e68cc76..70a2893 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -66,7 +66,7 @@ static bool force;
 static bool show_period;
 static bool show_formula;
 static bool show_baseline_only;
-static unsigned int sort_compute;
+static unsigned int sort_compute = 1;
 
 static s64 compute_wdiff_w1;
 static s64 compute_wdiff_w2;
@@ -86,7 +86,7 @@ const char *compute_names[COMPUTE_MAX] = {
[COMPUTE_WEIGHTED_DIFF] = "wdiff",
 };
 
-static int compute = COMPUTE_DELTA;
+static int compute = COMPUTE_DELTA_ABS;
 
 static int compute_2_hpp[COMPUTE_MAX] = {
[COMPUTE_DELTA] = PERF_HPP_DIFF__DELTA,
@@ -810,7 +810,7 @@ static const struct option options[] = {
OPT_BOOLEAN('b', "baseline-only", &show_baseline_only,
"Show only items with match in baseline"),
OPT_CALLBACK('c', "compute", &compute,
-"delta,delta-abs,ratio,wdiff:w1,w2 (default delta)",
+"delta,delta-abs,ratio,wdiff:w1,w2 (default delta-abs)",
 "Entries differential computation selection",
 setup_compute),
OPT_BOOLEAN('p', "period", &show_period,

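To illustrate what "most changed entries on top" means, here is a small
sketch that orders entries by the absolute value of their delta. struct
entry, its fields and sort_by_delta_abs() are assumptions made for this
example; perf's real sorting works on hist entries and is not shown here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entry: overhead in the baseline file and in the other file. */
struct entry {
	const char *sym;
	double base;
	double other;
};

/* Absolute difference between the two overheads. */
static double delta_abs(const struct entry *e)
{
	double d = e->other - e->base;

	return d < 0 ? -d : d;
}

/* qsort() comparator: larger absolute delta sorts first. */
static int cmp_delta_abs(const void *a, const void *b)
{
	double da = delta_abs(a);
	double db = delta_abs(b);

	return (da < db) - (da > db);
}

static void sort_by_delta_abs(struct entry *entries, size_t n)
{
	assert(entries != NULL);
	qsort(entries, n, sizeof(*entries), cmp_delta_abs);
}
```

With deltas of +1.0, -10.0 and +0.5, the entry with -10.0 ends up first,
whereas a signed sort in descending order would have placed it last.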

[tip:perf/core] perf scripting perl: Fix compile error with some perl5 versions

2017-02-13 Thread tip-bot for Wang YanQing
Commit-ID:  d7dd112ea5cacf91ae72c0714c3b911eb6016fea
Gitweb: http://git.kernel.org/tip/d7dd112ea5cacf91ae72c0714c3b911eb6016fea
Author: Wang YanQing 
AuthorDate: Sun, 12 Feb 2017 10:46:55 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 17:22:32 -0300

perf scripting perl: Fix compile error with some perl5 versions

Fix the compile error below:

  CC   util/scripting-engines/trace-event-perl.o
  In file included from /usr/lib/perl5/5.22.2/i686-linux/CORE/perl.h:5673:0,
   from util/scripting-engines/trace-event-perl.c:31:
  /usr/lib/perl5/5.22.2/i686-linux/CORE/inline.h: In function 
'S__is_utf8_char_slow':
  /usr/lib/perl5/5.22.2/i686-linux/CORE/inline.h:270:5: error: nested extern 
declaration of 'Perl___notused' [-Werror=nested-externs]
  dTHX;   /* The function called below requires thread context */
 ^
  cc1: all warnings being treated as errors

After digging through the perl5 repository, I found that this compile
error occurs with perl versions from v5.21.1 to v5.25.4.

Signed-off-by: Wang YanQing 
Acked-by: Jiri Olsa 
Link: http://lkml.kernel.org/r/20170212024655.GA15997@udknight
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/scripting-engines/Build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/scripting-engines/Build 
b/tools/perf/util/scripting-engines/Build
index 6516e22..82d28c6 100644
--- a/tools/perf/util/scripting-engines/Build
+++ b/tools/perf/util/scripting-engines/Build
@@ -1,6 +1,6 @@
 libperf-$(CONFIG_LIBPERL)   += trace-event-perl.o
 libperf-$(CONFIG_LIBPYTHON) += trace-event-python.o
 
-CFLAGS_trace-event-perl.o += $(PERL_EMBED_CCOPTS) -Wno-redundant-decls 
-Wno-strict-prototypes -Wno-unused-parameter -Wno-shadow -Wno-undef 
-Wno-switch-default
+CFLAGS_trace-event-perl.o += $(PERL_EMBED_CCOPTS) -Wno-redundant-decls 
-Wno-strict-prototypes -Wno-unused-parameter -Wno-shadow -Wno-nested-externs 
-Wno-undef -Wno-switch-default
 
 CFLAGS_trace-event-python.o += $(PYTHON_EMBED_CCOPTS) -Wno-redundant-decls 
-Wno-strict-prototypes -Wno-unused-parameter -Wno-shadow
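
As background on the flag being silenced: -Wnested-externs reports an
extern declaration that appears inside a function body. The sketch below
uses the hypothetical names sketch_counter and read_counter(); it is not
taken from the perl headers.

```c
#include <assert.h>

/* Hypothetical global, defined at file scope. */
int sketch_counter = 41;

static int read_counter(void)
{
	/*
	 * Block-scope extern declaration: valid C, but this is the
	 * construct that -Wnested-externs reports, and with -Werror it
	 * becomes a hard error as in the build log above.
	 */
	extern int sketch_counter;

	assert(sketch_counter >= 0);
	return sketch_counter + 1;
}
```

Since the construct lives in a third-party header here, disabling the
warning for the one object file that includes it is the least invasive
option.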


[tip:perf/core] perf diff: Add diff.compute config option

2017-02-13 Thread tip-bot for Namhyung Kim
Commit-ID:  4b35994abe459f08f58b4b3855abf4ba80308680
Gitweb: http://git.kernel.org/tip/4b35994abe459f08f58b4b3855abf4ba80308680
Author: Namhyung Kim 
AuthorDate: Fri, 10 Feb 2017 16:36:13 +0900
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 14:29:37 -0300

perf diff: Add diff.compute config option

The diff.compute config variable sets the default compute method of the
perf diff command (-c option).  Possible values are 'delta' (default),
'delta-abs', 'ratio' and 'wdiff'.

Signed-off-by: Namhyung Kim 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Cc: Taeung Song 
Link: http://lkml.kernel.org/r/20170210073614.24584-4-namhy...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-config.txt |  5 +
 tools/perf/Documentation/perf-diff.txt   |  5 +++--
 tools/perf/builtin-diff.c| 16 +++-
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-config.txt 
b/tools/perf/Documentation/perf-config.txt
index 49ab79d..5b4fff3 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -505,6 +505,11 @@ diff.*::
Setting it to 1 will sort the result by delta (or other
compute method selected).
 
+   diff.compute::
+   This options sets the method for computing the diff result.
+   Possible values are 'delta', 'delta-abs', 'ratio' and
+   'wdiff'.  Default is 'delta'.
+
 SEE ALSO
 
 linkperf:perf[1]
diff --git a/tools/perf/Documentation/perf-diff.txt 
b/tools/perf/Documentation/perf-diff.txt
index 7c014c9..7391299 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -86,8 +86,9 @@ OPTIONS
 
 -c::
 --compute::
-Differential computation selection - delta,ratio,wdiff,delta-abs 
(default is delta).
-See COMPARISON METHODS section for more info.
+Differential computation selection - delta, ratio, wdiff, delta-abs
+(default is delta).  Default can be changed using diff.compute
+config option.  See COMPARISON METHODS section for more info.
 
 -p::
 --period::
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 181ff99..e68cc76 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -86,7 +86,7 @@ const char *compute_names[COMPUTE_MAX] = {
[COMPUTE_WEIGHTED_DIFF] = "wdiff",
 };
 
-static int compute;
+static int compute = COMPUTE_DELTA;
 
 static int compute_2_hpp[COMPUTE_MAX] = {
[COMPUTE_DELTA] = PERF_HPP_DIFF__DELTA,
@@ -1299,6 +1299,20 @@ static int diff__config(const char *var, const char 
*value,
sort_compute = perf_config_int(var, value);
return 0;
}
+   if (!strcmp(var, "diff.compute")) {
+   if (!strcmp(value, "delta")) {
+   compute = COMPUTE_DELTA;
+   } else if (!strcmp(value, "delta-abs")) {
+   compute = COMPUTE_DELTA_ABS;
+   } else if (!strcmp(value, "ratio")) {
+   compute = COMPUTE_RATIO;
+   } else if (!strcmp(value, "wdiff")) {
+   compute = COMPUTE_WEIGHTED_DIFF;
+   } else {
+   pr_err("Invalid compute method: %s\n", value);
+   return -1;
+   }
+   }
 
return 0;
 }
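
The string-to-method mapping in diff__config() can also be written as a
table lookup. This is only a sketch of an alternative, not what the patch
does: the enum order, sketch_names[] and parse_compute() are assumptions
made for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum; the order in builtin-diff.c may differ. */
enum sketch_compute {
	SKETCH_DELTA,
	SKETCH_DELTA_ABS,
	SKETCH_RATIO,
	SKETCH_WDIFF,
	SKETCH_MAX
};

static const char *const sketch_names[SKETCH_MAX] = {
	[SKETCH_DELTA]     = "delta",
	[SKETCH_DELTA_ABS] = "delta-abs",
	[SKETCH_RATIO]     = "ratio",
	[SKETCH_WDIFF]     = "wdiff",
};

/* Returns the matching method, or -1 for an unknown name. */
static int parse_compute(const char *value)
{
	int i;

	assert(value != NULL);

	for (i = 0; i < SKETCH_MAX; i++) {
		if (!strcmp(value, sketch_names[i]))
			return i;
	}

	return -1;
}
```

A caller would report the invalid value and return an error when
parse_compute() yields -1, as the else branch in the patch does.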


[tip:perf/core] tools include: Introduce linux/compiler-gcc.h

2017-02-13 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  192614010a5052fe92611c7076ef664fd9bb60e8
Gitweb: http://git.kernel.org/tip/192614010a5052fe92611c7076ef664fd9bb60e8
Author: Arnaldo Carvalho de Melo 
AuthorDate: Fri, 10 Feb 2017 11:41:11 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 14:29:29 -0300

tools include: Introduce linux/compiler-gcc.h

Introduce it to match the kernel headers structure; it sets up things that
are specific to gcc or to some specific version of gcc.

It gets included by linux/compiler.h when gcc is the compiler being
used.

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Joe Perches 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-fabcqfq4asodq9t158hcs...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/include/linux/compiler-gcc.h | 14 ++
 tools/include/linux/compiler.h | 10 +-
 tools/perf/MANIFEST|  1 +
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/tools/include/linux/compiler-gcc.h 
b/tools/include/linux/compiler-gcc.h
new file mode 100644
index 0000000..48af2f1
--- /dev/null
+++ b/tools/include/linux/compiler-gcc.h
@@ -0,0 +1,14 @@
+#ifndef _TOOLS_LINUX_COMPILER_H_
+#error "Please don't include <linux/compiler-gcc.h> directly, include <linux/compiler.h> instead."
+#endif
+
+/*
+ * Common definitions for all gcc versions go here.
+ */
+#define GCC_VERSION (__GNUC__ * 10000 \
+                     + __GNUC_MINOR__ * 100 \
+                     + __GNUC_PATCHLEVEL__)
+
+#if GCC_VERSION >= 70000 && !defined(__CHECKER__)
+# define __fallthrough __attribute__ ((fallthrough))
+#endif
diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h
index d94179f..6326ede 100644
--- a/tools/include/linux/compiler.h
+++ b/tools/include/linux/compiler.h
@@ -1,6 +1,10 @@
 #ifndef _TOOLS_LINUX_COMPILER_H_
 #define _TOOLS_LINUX_COMPILER_H_
 
+#ifdef __GNUC__
+#include <linux/compiler-gcc.h>
+#endif
+
 /* Optimization barrier */
 /* The "volatile" is due to gcc bugs */
 #define barrier() __asm__ __volatile__("": : :"memory")
@@ -128,11 +132,7 @@ static __always_inline void __write_once_size(volatile 
void *p, void *res, int s
 
 
 #ifndef __fallthrough
-# if defined(__GNUC__) && __GNUC__ >= 7
-#  define __fallthrough __attribute__ ((fallthrough))
-# else
-#  define __fallthrough
-# endif
+# define __fallthrough
 #endif
 
 #endif /* _TOOLS_LINUX_COMPILER_H */
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index a511e5f..8672f83 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -61,6 +61,7 @@ tools/include/asm-generic/bitops.h
 tools/include/linux/atomic.h
 tools/include/linux/bitops.h
 tools/include/linux/compiler.h
+tools/include/linux/compiler-gcc.h
 tools/include/linux/coresight-pmu.h
 tools/include/linux/filter.h
 tools/include/linux/hash.h
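
A self-contained sketch of the version check and of how such a fallthrough
marker is used in a switch. SKETCH_VERSION, sketch_fallthrough and
count_steps() are names invented for this example; the macro encodes
major/minor/patch as major * 10000 + minor * 100 + patch, so gcc 7.0.0
becomes 70000.

```c
#include <assert.h>

/* Encode a version triple as one comparable integer. */
#define SKETCH_VERSION(major, minor, patch) \
	((major) * 10000 + (minor) * 100 + (patch))

#if defined(__GNUC__) && \
	SKETCH_VERSION(__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__) >= 70000
# define sketch_fallthrough __attribute__ ((fallthrough))
#else
# define sketch_fallthrough	/* expands to nothing on other compilers */
#endif

/* Counts how many labels are passed through when entering at 'start'. */
static int count_steps(int start)
{
	int steps = 0;

	assert(start >= 0);

	switch (start) {
	case 0:
		steps++;
		sketch_fallthrough;	/* intentional: continue into case 1 */
	case 1:
		steps++;
		sketch_fallthrough;	/* intentional: continue into default */
	default:
		steps++;
	}

	return steps;
}
```

On compilers where the macro is empty, each marker collapses to a lone
semicolon, so the same source builds everywhere.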


[tip:perf/core] perf diff: Add diff.order config option

2017-02-13 Thread tip-bot for Namhyung Kim
Commit-ID:  d49dd15d69731589de4436a6dcfca59567320fdf
Gitweb: http://git.kernel.org/tip/d49dd15d69731589de4436a6dcfca59567320fdf
Author: Namhyung Kim 
AuthorDate: Fri, 10 Feb 2017 16:36:12 +0900
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 14:29:37 -0300

perf diff: Add diff.order config option

In many cases, I need to look at the differences between two data files, so
I often use the -o option to sort the result based on the difference first.
It'd be nice to have a config option to set it by default.

The diff.order config option is to set the default value of -o/--order
option.

Signed-off-by: Namhyung Kim 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Cc: Taeung Song 
Link: http://lkml.kernel.org/r/20170210073614.24584-3-namhy...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-config.txt |  7 +++
 tools/perf/Documentation/perf-diff.txt   |  6 +-
 tools/perf/builtin-diff.c| 14 ++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-config.txt 
b/tools/perf/Documentation/perf-config.txt
index 9365b75..49ab79d 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -498,6 +498,13 @@ record.*::
But if this option is 'no-cache', it will not update the 
build-id cache.
'skip' skips post-processing and does not update the cache.
 
+diff.*::
+   diff.order::
+   This option sets the number of columns to sort the result.
+   The default is 0, which means sorting by baseline.
+   Setting it to 1 will sort the result by delta (or other
+   compute method selected).
+
 SEE ALSO
 
 linkperf:perf[1]
diff --git a/tools/perf/Documentation/perf-diff.txt 
b/tools/perf/Documentation/perf-diff.txt
index af80284..7c014c9 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -99,7 +99,11 @@ OPTIONS
 
 -o::
 --order::
-   Specify compute sorting column number.
+   Specify compute sorting column number.  0 means sorting by baseline
+   overhead (default) and 1 means sorting by computed value of column 1
+   (data from the first file other base baseline).  Values more than 1
+   can be used only if enough data files are provided.
+   The default value can be set using the diff.order config option.
 
 --percentage::
Determine how to display the overhead percentage of filtered entries.
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 781c9e6..181ff99 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -17,6 +17,7 @@
 #include "util/symbol.h"
 #include "util/util.h"
 #include "util/data.h"
+#include "util/config.h"
 
 #include 
 #include 
@@ -1291,6 +1292,17 @@ static int data_init(int argc, const char **argv)
return 0;
 }
 
+static int diff__config(const char *var, const char *value,
+   void *cb __maybe_unused)
+{
+   if (!strcmp(var, "diff.order")) {
+   sort_compute = perf_config_int(var, value);
+   return 0;
+   }
+
+   return 0;
+}
+
 int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused)
 {
int ret = hists__init();
@@ -1298,6 +1310,8 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix __maybe_unused)
if (ret < 0)
return ret;
 
+   perf_config(diff__config, NULL);
+
argc = parse_options(argc, argv, options, diff_usage, 0);
 
if (symbol__init(NULL) < 0)


[tip:perf/core] perf diff: Add diff.order config option

2017-02-13 Thread tip-bot for Namhyung Kim
Commit-ID:  d49dd15d69731589de4436a6dcfca59567320fdf
Gitweb: http://git.kernel.org/tip/d49dd15d69731589de4436a6dcfca59567320fdf
Author: Namhyung Kim 
AuthorDate: Fri, 10 Feb 2017 16:36:12 +0900
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 14:29:37 -0300

perf diff: Add diff.order config option

In many cases, I need to look at the differences between two data files, so
I often use the -o option to sort the result based on the difference first.
It'd be nice to have a config option to set it by default.

The diff.order config option is to set the default value of -o/--order
option.

Signed-off-by: Namhyung Kim 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Cc: Taeung Song 
Link: http://lkml.kernel.org/r/20170210073614.24584-3-namhy...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-config.txt |  7 +++
 tools/perf/Documentation/perf-diff.txt   |  6 +-
 tools/perf/builtin-diff.c| 14 ++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-config.txt 
b/tools/perf/Documentation/perf-config.txt
index 9365b75..49ab79d 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -498,6 +498,13 @@ record.*::
But if this option is 'no-cache', it will not update the 
build-id cache.
'skip' skips post-processing and does not update the cache.
 
+diff.*::
+   diff.order::
+   This option sets the number of columns to sort the result.
+   The default is 0, which means sorting by baseline.
+   Setting it to 1 will sort the result by delta (or other
+   compute method selected).
+
 SEE ALSO
 
 linkperf:perf[1]
diff --git a/tools/perf/Documentation/perf-diff.txt 
b/tools/perf/Documentation/perf-diff.txt
index af80284..7c014c9 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -99,7 +99,11 @@ OPTIONS
 
 -o::
 --order::
-   Specify compute sorting column number.
+   Specify compute sorting column number.  0 means sorting by baseline
+   overhead (default) and 1 means sorting by computed value of column 1
+   (data from the first file other base baseline).  Values more than 1
+   can be used only if enough data files are provided.
+   The default value can be set using the diff.order config option.
 
 --percentage::
Determine how to display the overhead percentage of filtered entries.
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 781c9e6..181ff99 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -17,6 +17,7 @@
 #include "util/symbol.h"
 #include "util/util.h"
 #include "util/data.h"
+#include "util/config.h"
 
 #include 
 #include 
@@ -1291,6 +1292,17 @@ static int data_init(int argc, const char **argv)
return 0;
 }
 
+static int diff__config(const char *var, const char *value,
+   void *cb __maybe_unused)
+{
+   if (!strcmp(var, "diff.order")) {
+   sort_compute = perf_config_int(var, value);
+   return 0;
+   }
+
+   return 0;
+}
+
 int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused)
 {
int ret = hists__init();
@@ -1298,6 +1310,8 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix __maybe_unused)
if (ret < 0)
return ret;
 
+   perf_config(diff__config, NULL);
+
argc = parse_options(argc, argv, options, diff_usage, 0);
 
if (symbol__init(NULL) < 0)


[tip:perf/core] perf diff: Add 'delta-abs' compute method

2017-02-13 Thread tip-bot for Namhyung Kim
Commit-ID:  a1668c25a8e1b53d00b2997ef5bc5e25c7a77235
Gitweb: http://git.kernel.org/tip/a1668c25a8e1b53d00b2997ef5bc5e25c7a77235
Author: Namhyung Kim 
AuthorDate: Fri, 10 Feb 2017 16:36:11 +0900
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 13 Feb 2017 14:29:36 -0300

perf diff: Add 'delta-abs' compute method

The 'delta-abs' compute method is the same as 'delta' but shows entries
with a bigger absolute delta first instead of sorting numerically.  This
is only useful together with the -o option.

Below is the default output (-c delta):

  $ perf diff -o 1 -c delta | grep -v ^# | head
42.22%   +4.97%  [kernel.kallsyms]  [k] cfb_imageblit
 0.62%   +1.23%  [kernel.kallsyms]  [k] mutex_lock
 +1.15%  [kernel.kallsyms]  [k] copy_user_generic_string
 2.40%   +0.95%  [kernel.kallsyms]  [k] bit_putcs
 0.31%   +0.79%  [kernel.kallsyms]  [k] link_path_walk
 +0.64%  [kernel.kallsyms]  [k] kmem_cache_alloc
 0.00%   +0.57%  [kernel.kallsyms]  [k] __rcu_read_unlock
 +0.45%  [kernel.kallsyms]  [k] alloc_set_pte
 0.16%   +0.45%  [kernel.kallsyms]  [k] menu_select
 +0.41%  ld-2.24.so [.] do_lookup_x

Now with 'delta-abs' it shows the entries that have a bigger delta value,
either positive or negative, first.

  $ perf diff -o 1 -c delta-abs | grep -v ^# | head
42.22%   +4.97%  [kernel.kallsyms]  [k] cfb_imageblit
12.72%   -3.01%  [kernel.kallsyms]  [k] intel_idle
 9.72%   -1.31%  [unknown]  [.] 0x00411343
 0.62%   +1.23%  [kernel.kallsyms]  [k] mutex_lock
 2.40%   +0.95%  [kernel.kallsyms]  [k] bit_putcs
 0.31%   +0.79%  [kernel.kallsyms]  [k] link_path_walk
 1.35%   -0.71%  [kernel.kallsyms]  [k] smp_call_function_single
 0.00%   +0.57%  [kernel.kallsyms]  [k] __rcu_read_unlock
 0.16%   +0.45%  [kernel.kallsyms]  [k] menu_select
 0.72%   -0.44%  [kernel.kallsyms]  [k] lookup_fast
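
The difference between the two listings comes down to the comparison used for sorting. A rough stand-alone sketch follows. It assumes a hypothetical struct entry holding only a symbol name and its delta, and it is not perf's actual comparator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hist entry: a symbol and its delta. */
struct entry {
	const char *sym;
	double delta;	/* percentage points relative to the baseline */
};

static double magnitude(double d)
{
	return d < 0 ? -d : d;
}

/* 'delta': larger signed value first. */
static int cmp_delta(const void *a, const void *b)
{
	double l = ((const struct entry *)a)->delta;
	double r = ((const struct entry *)b)->delta;

	return (l < r) - (l > r);
}

/* 'delta-abs': larger magnitude first, sign ignored. */
static int cmp_delta_abs(const void *a, const void *b)
{
	double l = magnitude(((const struct entry *)a)->delta);
	double r = magnitude(((const struct entry *)b)->delta);

	return (l < r) - (l > r);
}

/* Sort in place; use_abs selects the 'delta-abs' ordering. */
static void sort_entries(struct entry *e, size_t n, int use_abs)
{
	assert(e != NULL);
	qsort(e, n, sizeof(*e), use_abs ? cmp_delta_abs : cmp_delta);
}
```

With the deltas +1.23, -3.01, +4.97 and -0.44 taken from the listings, the signed comparator orders them +4.97, +1.23, -0.44, -3.01, while the magnitude comparator orders them +4.97, -3.01, +1.23, -0.44.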

Signed-off-by: Namhyung Kim 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20170210073614.24584-2-namhy...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-diff.txt |  6 -
 tools/perf/builtin-diff.c  | 46 --
 2 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-diff.txt 
b/tools/perf/Documentation/perf-diff.txt
index 3e9490b..af80284 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -86,7 +86,7 @@ OPTIONS
 
 -c::
 --compute::
-Differential computation selection - delta,ratio,wdiff (default is 
delta).
+Differential computation selection - delta,ratio,wdiff,delta-abs 
(default is delta).
 See COMPARISON METHODS section for more info.
 
 -p::
@@ -181,6 +181,10 @@ with:
 relative to how entries are filtered.  Use --percentage=absolute to
 prevent such fluctuation.
 
+delta-abs
+~
+Same as 'delta` method, but sort the result with the absolute values.
+
 ratio
 ~
 If specified the 'Ratio' column is displayed with value 'r' computed as:
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 933aeec..781c9e6 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -30,6 +30,7 @@ enum {
PERF_HPP_DIFF__RATIO,
PERF_HPP_DIFF__WEIGHTED_DIFF,
PERF_HPP_DIFF__FORMULA,
+   PERF_HPP_DIFF__DELTA_ABS,
 
PERF_HPP_DIFF__MAX_INDEX
 };
@@ -73,11 +74,13 @@ enum {
COMPUTE_DELTA,
COMPUTE_RATIO,
COMPUTE_WEIGHTED_DIFF,
+   COMPUTE_DELTA_ABS,
COMPUTE_MAX,
 };
 
 const char *compute_names[COMPUTE_MAX] = {
[COMPUTE_DELTA] = "delta",
+   [COMPUTE_DELTA_ABS] = "delta-abs",
[COMPUTE_RATIO] = "ratio",
[COMPUTE_WEIGHTED_DIFF] = "wdiff",
 };
@@ -86,6 +89,7 @@ static int compute;
 
 static int compute_2_hpp[COMPUTE_MAX] = {
[COMPUTE_DELTA] = PERF_HPP_DIFF__DELTA,
+   [COMPUTE_DELTA_ABS] = PERF_HPP_DIFF__DELTA_ABS,
[COMPUTE_RATIO] = PERF_HPP_DIFF__RATIO,
[COMPUTE_WEIGHTED_DIFF] = PERF_HPP_DIFF__WEIGHTED_DIFF,
 };
@@ -111,6 +115,10 @@ static struct header_column {
.name  = "Delta",
.width = 7,
},
+   [PERF_HPP_DIFF__DELTA_ABS] = {
+   .name  = "Delta Abs",
+   .width = 7,
+   },
[PERF_HPP_DIFF__RATIO] = {
.name  = "Ratio",
.width = 14,
@@ -298,6 +306,7 @@ static int formula_fprintf(struct hist_entry *he, struct 
hist_entry *pair,
 {
switch (compute) {
case COMPUTE_DELTA:
+   case COMPUTE_DELTA_ABS:
return formula_delta(he, pair, buf, size);
case COMPUTE_RATIO:
return formula_ratio(he, pair, buf, size);
@@ -461,6 +470,7 @@ static void hists__precompute(struct hists *hists)
 
switch (compute) {
case COMPUTE_DELTA:
+   

Re: [PATCH] seccomp: Only dump core when single-threaded

2017-02-13 Thread Andrei Vagin
On Tue, Feb 07, 2017 at 03:18:51PM -0800, Kees Cook wrote:
> The SECCOMP_RET_KILL filter return code has always killed the current
> thread, not the entire process. Changing this as a side-effect of dumping
> core isn't a safe thing to do (a few test suites have already flagged this
> behavioral change). Instead, restore the RET_KILL semantics, but still
> dump core when a RET_KILL delivers SIGSYS to a single-threaded process.
> 
> Fixes: b25e67161c29 ("seccomp: dump core when using SECCOMP_RET_KILL")
> Signed-off-by: Kees Cook 

All CRIU tests passed with this patch. Thanks!

Acked-by: Andrei Vagin 

> ---
>  kernel/seccomp.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index f8f88ebcb3ba..e15185c28de5 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -643,11 +643,14 @@ static int __seccomp_filter(int this_syscall, const 
> struct seccomp_data *sd,
>   default: {
>   siginfo_t info;
>   audit_seccomp(this_syscall, SIGSYS, action);
> - /* Show the original registers in the dump. */
> - syscall_rollback(current, task_pt_regs(current));
> - /* Trigger a manual coredump since do_exit skips it. */
> - seccomp_init_siginfo(&info, this_syscall, data);
> - do_coredump(&info);
> + /* Dump core only if this is the last remaining thread. */
> + if (get_nr_threads(current) == 1) {
> + /* Show the original registers in the dump. */
> + syscall_rollback(current, task_pt_regs(current));
> + /* Trigger a manual coredump since do_exit skips it. */
> + seccomp_init_siginfo(&info, this_syscall, data);
> + do_coredump(&info);
> + }
>   do_exit(SIGSYS);
>   }
>   }
> -- 
> 2.7.4
> 
> 
> -- 
> Kees Cook
> Pixel Security


Re: linux-next: build failure after merge of the net tree

2017-02-13 Thread Ingo Molnar

* Stephen Rothwell  wrote:

> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -116,6 +116,12 @@ enum bpf_attach_type {
>  
>  #define MAX_BPF_ATTACH_TYPE __MAX_BPF_ATTACH_TYPE
>  
> +/* If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command
> + * to the given target_fd cgroup the descendent cgroup will be able to
> + * override effective bpf program that was inherited from this cgroup
> + */
> +#define BPF_F_ALLOW_OVERRIDE (1U << 0)
> +

BTW., guys, for heaven's sake, please use the standard (multi-line) comment 
style:

  /*
   * Comment .
   * .. goes here.
   */

specified in Documentation/CodingStyle...

It's not that hard to create visually balanced patterns.
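
As an illustration only (a sketch, not a tested patch), the quoted comment would read roughly like this in that style, with the wording and the flag value copied unchanged from the hunk above:

```c
/*
 * If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command
 * to the given target_fd cgroup the descendent cgroup will be able to
 * override effective bpf program that was inherited from this cgroup
 */
#define BPF_F_ALLOW_OVERRIDE	(1U << 0)
```

The only change is that the opening /* sits on a line of its own, so the opening and closing lines of the comment mirror each other.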

Thanks,

Ingo



Re: [PATCH 02/10] ARM: dts: da850-evm: fix whitespace errors

2017-02-13 Thread Sekhar Nori
On Tuesday 14 February 2017 02:31 AM, Kevin Hilman wrote:
> Kevin Hilman  writes:
> 
>> Bartosz Golaszewski  writes:
>>
>>> Signed-off-by: Bartosz Golaszewski 
>>
>> I'll fold this one into the original since it's not yet merged.
> 
> Oops, Sekhar has already merged this one to his v4.11/dt branch, so he
> can apply it (or fold it in.)

Alright, I will apply this.

Bartosz, in future, please add a commit description for trivial patches
too, even if it's more or less just a rewording of the subject line.

Thanks,
Sekhar


[PATCH v5] drivers/misc: Add Aspeed LPC control driver

2017-02-13 Thread Cyril Bur
In order to manage server systems, there is typically another processor
known as a BMC (Baseboard Management Controller) which is responsible
for powering the server and other various elements, sometimes fans,
often the system flash.

The Aspeed BMC family, which is what is used on OpenPOWER machines and a
number of x86 machines as well, is typically connected to the host via an
LPC (Low Pin Count) bus (among others).

The LPC bus is an ISA bus on steroids. It's generally used by the
BMC chip to provide the host with access to the system flash (via MEM/FW
cycles) that contains the BIOS or other host firmware, along with a
number of SuperIO-style IOs (via IO space) such as UARTs and IPMI
controllers.

On the BMC chip side, this is all configured via a bunch of registers
whose content is related to a given policy of what devices are exposed
at a per system level, which is system/vendor specific, so we don't want
to bolt that into the BMC kernel. This started with a need to provide
something nicer than /dev/mem for user space to configure these things.

One important aspect of the configuration is how the MEM/FW space is
exposed to the host (ie, the x86 or POWER). Some registers in that
bridge can define a window remapping all or portion of the LPC MEM/FW
space to a portion of the BMC internal bus, with no specific limits
imposed in HW.

I think it makes sense to ensure that this window is configured by a
kernel driver that can apply some serious sanity checks on what it is
configured to map.

In practice, user space wants to control this by flipping the mapping
between essentially two types of portions of the BMC address space:

   - The flash space. This is a region of the BMC MMIO space that
more/less directly maps the system flash (at least for reads, writes
are somewhat more complicated).

   - One (or more) reserved area(s) of the BMC physical memory.

The latter is needed for a number of things. To avoid letting the host
manipulate the innards of the BMC flash controller via some evil
backdoor, we want to do flash updates by routing the window to a
portion of memory (under control of a mailbox protocol via some
separate set of registers) which the host can use to write new data in
bulk and then request the BMC to flash it. There are other uses, such
as allowing the host to boot from an in-memory flash image rather than
the one in flash (very handy for continuous integration and test, since
the BMC can just download new images).

It is important to note that, due to the way the Aspeed chip lets the
kernel configure the mapping between host LPC addresses and BMC ram
addresses, the offset within the window must be a multiple of its size.
Not doing so will fragment the accessible space rather than simply
moving 'zero' upwards. This is caused by the nature of HICR8 being a
mask and the way host LPC addresses are translated.

Signed-off-by: Cyril Bur 
---
v2:
   Removed unused functions
   Removed use of access_ok()
   All input is evil
   Reworked the interface as per Benjamin Herrenschmidt's vision
v3:
   Removed 'default y' from Kconfig
   Reordered ioctl() struct fields
   Reworded some comments
v4:
   Reorder ioctl() struct fields (again)
v5:
   Style cleanups
   Use of_address_to_resource()
   Document ioctl structure fields


 drivers/misc/Kconfig |   8 ++
 drivers/misc/Makefile|   1 +
 drivers/misc/aspeed-lpc-ctrl.c   | 264 +++
 include/uapi/linux/aspeed-lpc-ctrl.h |  60 
 4 files changed, 333 insertions(+)
 create mode 100644 drivers/misc/aspeed-lpc-ctrl.c
 create mode 100644 include/uapi/linux/aspeed-lpc-ctrl.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 64971baf11fa..7dc4c369012f 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -766,6 +766,14 @@ config PANEL_BOOT_MESSAGE
  An empty message will only clear the display at driver init time. Any 
other
  printf()-formatted message is valid with newline and escape codes.
 
+config ASPEED_LPC_CTRL
+   depends on (ARCH_ASPEED || COMPILE_TEST) && REGMAP && MFD_SYSCON
+   tristate "Aspeed ast2400/2500 HOST LPC to BMC bridge control"
+   ---help---
+ Control Aspeed ast2400/2500 HOST LPC to BMC mappings through
+ ioctl()s, the driver also provides a read/write interface to a BMC ram
+ region where the host LPC read/write region can be buffered.
+
 source "drivers/misc/c2port/Kconfig"
 source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 31983366090a..de1925a9c80b 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -53,6 +53,7 @@ obj-$(CONFIG_ECHO)+= echo/
 obj-$(CONFIG_VEXPRESS_SYSCFG)  += vexpress-syscfg.o
 obj-$(CONFIG_CXL_BASE) += cxl/
 obj-$(CONFIG_PANEL) += panel.o
+obj-$(CONFIG_ASPEED_LPC_CTRL)  += aspeed-lpc-ctrl.o
 
 lkdtm-$(CONFIG_LKDTM)  += lkdtm_core.o
 

Re: [GIT PULL 00/15] perf/core improvements and fixes

2017-02-13 Thread Ingo Molnar

* Arnaldo Carvalho de Melo <a...@kernel.org> wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit f2029b1e47b607619d1dd2cb0bbb77f64ec6b7c2:
> 
>   perf/x86/intel: Add Kaby Lake support (2017-02-11 21:28:23 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170213
> 
> for you to fetch changes up to a734fb5d60067a73dd7099a58756847c07f9cd68:
> 
>   samples/bpf: Reset global variables (2017-02-13 17:22:53 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> New feature:
> 
> - Introduce the 'delta-abs' 'perf diff' compute method, that orders the
>   histogram entries by the absolute value of the percentage delta for a
>   function in two perf.data files, i.e. the functions that changed the
>   most (increase or decrease in samples) comes first (Namhyung Kim)
> 
> User visible:
> 
> - Improve message about tweaking the kernel.perf_event_paranoid setting,
>   telling how to make the change permanent by editing /etc/sysctl.conf
>   (Ingo Molnar)
> 
> Infrastructure:
> 
> - Introduce linux/compiler-gcc.h as a counterpart to the kernel's,
>   initially containing the definition of __fallthrough, more to
>   come (__maybe_unused, etc) (Arnaldo Carvalho de Melo)
> 
> - Fixes for problems uncovered by building tools/perf with clang, such
>   as always true tests of arrays against NULL and variables that sometimes
>   were used without being initialized (Arnaldo Carvalho de Melo,
>   Steven Rostedt)
> 
> - Before loading a new ELF, clear global variables set by the
>   samples/bpf loader (Mickaël Salaün)
> 
> - Ignore already processed ELF sections in the samples/bpf
>   loader (Mickaël Salaün)
> 
> - Fix compile error in the scripting code with some perl5
>   versions (Wang YanQing)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
> 
> 
> Arnaldo Carvalho de Melo (6):
>   tools include: Introduce linux/compiler-gcc.h
>   tools lib traceevent plugin function: Initialize 'index' variable
>   perf evsel: Inform how to make a sysctl setting permanent
>   perf symbols: No need to check if sym->name is NULL
>   perf tests record: No need to test an array against NULL
>   perf symbols: dso->name is an array, no need to check it against NULL
> 
> Mickaël Salaün (3):
>   samples/bpf: Add missing header
>   samples/bpf: Ignore already processed ELF sections
>   samples/bpf: Reset global variables
> 
> Namhyung Kim (4):
>   perf diff: Add 'delta-abs' compute method
>   perf diff: Add diff.order config option
>   perf diff: Add diff.compute config option
>   perf diff: Change default setting to "delta-abs"
> 
> Steven Rostedt (VMware) (1):
>   tools lib traceevent: Initialize lenght on OLD_RING_BUFFER_TYPE_TIME_STAMP
> 
> Wang YanQing (1):
>   perf scripting perl: Fix compile error with some perl5 versions
> 
>  samples/bpf/bpf_load.c |  7 ++
>  samples/bpf/tracex5_kern.c |  1 +
>  tools/include/linux/compiler-gcc.h | 14 
>  tools/include/linux/compiler.h | 10 +--
>  tools/lib/traceevent/kbuffer-parse.c   |  1 +
>  tools/lib/traceevent/plugin_function.c |  2 +-
>  tools/perf/Documentation/perf-config.txt   | 12 
>  tools/perf/Documentation/perf-diff.txt | 15 -
>  tools/perf/MANIFEST|  1 +
>  tools/perf/builtin-diff.c  | 78 
> --
>  tools/perf/builtin-kmem.c  |  4 +-
>  tools/perf/builtin-record.c|  2 +-
>  tools/perf/builtin-sched.c |  2 +-
>  tools/perf/builtin-stat.c  |  2 +-
>  tools/perf/builtin-top.c   |  2 +-
>  tools/perf/tests/perf-record.c |  2 +-
>  tools/perf/util/evsel.c|  4 +-
>  tools/perf/util/evsel_fprintf.c|  1 -
>  tools/perf/util/machine.c  |  2 +-
>  tools/perf/util/map.c  |  4 +-
>  tools/perf/util/scripting-engines/Build|  2 +-
>  .../perf/util/scripting-engines/trace-event-perl.c |  4 +-
>  tools/perf/util/symbol_fprintf.c   |  2 +-
>  23 files changed, 145 insertions(+), 29 deletions(-)
>  create mode 100644 tools/include/linux/compiler-gcc.h

Pulled, thanks a lot Arnaldo!

Ingo


[PATCH v2] drm/amd/dc: resource: fix semicolon.cocci warnings (fwd)

2017-02-13 Thread Julia Lawall

Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Harry Wentland 
Signed-off-by: Julia Lawall 
Signed-off-by: Fengguang Wu 
---

v2: make subject line unique

tree:   git://people.freedesktop.org/~agd5f/linux.git amd-staging-4.9
head:   79d2de1bcb650296adff1cb08bfbf1501a6e6e14
commit: bad4c165a6986a131cdd1455507ba3857baaa561 [201/657] drm/amd/dc: Add
dc display driver

 dc_resource.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c
@@ -135,7 +135,7 @@ static void update_num_audio(
 		break;
 	default:
 		DC_ERR("DC: unexpected audio fuse!\n");
-	};
+	}
 }

 bool resource_construct(


[PATCH v2] drm/amd/dc: dm_types: fix semicolon.cocci warnings

2017-02-13 Thread Julia Lawall

Remove unneeded semicolons.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Harry Wentland 
Signed-off-by: Julia Lawall 
Signed-off-by: Fengguang Wu 
---

v2: make subject line unique

tree:   git://people.freedesktop.org/~agd5f/linux.git amd-staging-4.9
head:   79d2de1bcb650296adff1cb08bfbf1501a6e6e14
commit: bad4c165a6986a131cdd1455507ba3857baaa561 [201/657] drm/amd/dc: Add
dc display driver

 amdgpu_dm_types.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_types.c
+++ b/drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_types.c
@@ -890,11 +890,11 @@ static void copy_crtc_timing_for_drm_dis
 	dst_mode->crtc_hsync_end = src_mode->crtc_hsync_end;
 	dst_mode->crtc_htotal = src_mode->crtc_htotal;
 	dst_mode->crtc_hskew = src_mode->crtc_hskew;
-	dst_mode->crtc_vblank_start = src_mode->crtc_vblank_start;;
-	dst_mode->crtc_vblank_end = src_mode->crtc_vblank_end;;
-	dst_mode->crtc_vsync_start = src_mode->crtc_vsync_start;;
-	dst_mode->crtc_vsync_end = src_mode->crtc_vsync_end;;
-	dst_mode->crtc_vtotal = src_mode->crtc_vtotal;;
+	dst_mode->crtc_vblank_start = src_mode->crtc_vblank_start;
+	dst_mode->crtc_vblank_end = src_mode->crtc_vblank_end;
+	dst_mode->crtc_vsync_start = src_mode->crtc_vsync_start;
+	dst_mode->crtc_vsync_end = src_mode->crtc_vsync_end;
+	dst_mode->crtc_vtotal = src_mode->crtc_vtotal;
 }

 static void decide_crtc_timing_for_drm_display_mode(

