Re: [RFC][PATCHv3 2/5] printk: introduce printing kernel thread

2017-05-09 Thread Sergey Senozhatsky
fix !PRINTK config build.
(Reported-by: kbuild test robot).



From: Sergey Senozhatsky 
Subject: [PATCH 2/5] printk: introduce printing kernel thread

printk() is quite complex internally and, basically, it does two
slightly independent things:
 a) adds a new message to a kernel log buffer (log_store())
 b) prints kernel log messages to serial consoles (console_unlock())

while (a) is guaranteed to be executed by printk(), (b) is not, for a
variety of reasons, and, unlike log_store(), it comes at a price:

 1) console_unlock() attempts to flush all pending kernel log messages
to the console and it can loop indefinitely.

 2) while console_unlock() is executed on one particular CPU, printing
pending kernel log messages, other CPUs can simultaneously append new
messages to the kernel log buffer.

 3) the time it takes console_unlock() to print kernel messages also
depends on the speed of the console -- which may not be fast at all.

 4) console_unlock() is executed in the same context as printk(), so
it may be non-preemptible/atomic, which makes 1)-3) dangerous.

As a result, nobody knows how long a printk() call will take, so
it's not really safe to call printk() in a number of situations,
including atomic context, RCU critical sections, interrupt context,
etc.

This patch introduces a '/sys/module/printk/parameters/atomic_print_limit'
sysfs param, which sets the limit on the number of lines a process can
print from console_unlock(). Value 0 corresponds to the current behavior
(no limitation). The printing offloading happens in console_unlock() and,
briefly, looks as follows: as soon as a process prints more than
`atomic_print_limit' lines it attempts to offload printing to another
process. Since nothing guarantees that there will be another process
sleeping on the console_sem or calling printk() on another CPU
simultaneously, the patch also introduces an auxiliary kernel thread -
printk_kthread, the main purpose of which is to take over printing duty.
The workflow thus turns into: as soon as a process prints more than
`atomic_print_limit' lines it wakes up printk_kthread and unlocks the
console_sem. So in the best case at this point there will be at least one
process trying to lock the console_sem: printk_kthread. (There can also be
a process that was sleeping on the console_sem and was woken up by the
console semaphore up(), and concurrent printk() invocations from other
CPUs.) But in the worst case there won't be any processes ready to take
over the printing duty: it may take printk_kthread some time to become
running; or printk_kthread may never become running at all (a misbehaving
scheduler, or other critical condition). That's why after we wake_up()
printk_kthread we can't immediately leave the printing loop: we must ensure
that the console_sem has a new owner before we do so. Therefore,
`atomic_print_limit' is a soft limit, not a hard one: we let a task
overrun `atomic_print_limit'.

At the same time, the console_unlock() printing loop behaves differently
for tasks that have exceeded `atomic_print_limit': after every printed
logbuf entry (call_console_drivers()) such a process wakes up
printk_kthread, unlocks the console_sem and attempts to console_trylock()
a bit later (if there are any pending messages in the logbuf, of course).
In the best case either printk_kthread or some other task will lock the
console_sem, so the current printing task will see a failed
console_trylock(), which indicates a successful printing offloading. In
the worst case, however, current will successfully console_trylock(),
which indicates that offloading did not take place and we can't return
from console_unlock(), so the printing task will print one more line from
the logbuf and attempt to offload printing once again; and it will
continue doing so until another process locks the console_sem or until
there are no pending messages left in the logbuf. So if everything goes
wrong - we can't wake up printk_kthread and there are no other processes
sleeping on the console_sem or trying to down() it - then we will have the
existing console_unlock() behavior: print all pending messages in one shot.

We track the number of unsuccessful offloading attempts and after some
point we declare a `printk_emergency' condition and give up trying to
offload. `printk_emergency' is a new name for the printk mode in which
printk does not attempt to offload printing, but instead flushes all the
pending logbuf messages (basically, the existing behaviour).
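
The soft-limit loop described above can be modelled in user space. The
following is only a rough sketch under stated assumptions, not the code from
this patch: model_console_unlock(), takeover_after and emergency_after are
hypothetical names invented for illustration, and "another task takes the
console_sem" is reduced to a counter instead of real locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of one modelled console_unlock() call (hypothetical type). */
struct flush_result {
	unsigned int printed;	/* lines this task printed itself */
	bool offloaded;		/* another task took over the console_sem */
	bool emergency;		/* gave up offloading and flushed everything */
};

/*
 * pending            - messages waiting in the log buffer
 * atomic_print_limit - soft limit; 0 means "never offload"
 * takeover_after     - offload attempt on which another task grabs the
 *                      lock; 0 means nobody ever does (assumption)
 * emergency_after    - failed attempts before declaring emergency mode
 */
static struct flush_result model_console_unlock(unsigned int pending,
						unsigned int atomic_print_limit,
						unsigned int takeover_after,
						unsigned int emergency_after)
{
	struct flush_result r = { 0, false, false };
	unsigned int attempts = 0;

	assert(emergency_after > 0);

	while (pending > 0) {
		/* Stands in for printing one entry via call_console_drivers(). */
		pending--;
		r.printed++;

		/* No limit configured, or emergency mode: just keep flushing. */
		if (atomic_print_limit == 0 || r.emergency)
			continue;
		/* Still within the soft limit, or nothing left to hand over. */
		if (r.printed <= atomic_print_limit || pending == 0)
			continue;

		/* Over the limit: wake the helper, unlock, then trylock again. */
		attempts++;
		if (takeover_after != 0 && attempts >= takeover_after) {
			r.offloaded = true;	/* trylock failed: new owner exists */
			break;
		}
		/* trylock succeeded, so nobody took over; count the failure. */
		if (attempts >= emergency_after)
			r.emergency = true;
	}
	return r;
}
```

With 10 pending messages, a limit of 3 and a helper that takes over on the
first attempt, the model prints 4 lines and stops; if nobody ever takes over,
it prints all 10 and ends up in emergency mode.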

There are also other cases when we can't (or should avoid) offloading. For
example, we can't call into the scheduler from panic(), because this may
cause deadlock. Therefore printk() has some additional places where it
switches to printk_emergency mode: for instance, once an EMERG log level
message appears in the log buffer. We also provide two new functions
that can be used when a path needs to declare a temporary
`printk_emergency' mode:

 -- 

Re: [RFC 09/10] x86/mm: Rework lazy TLB to track the actual loaded mm

2017-05-09 Thread Ingo Molnar

* Thomas Gleixner  wrote:

> On Sun, 7 May 2017, Andy Lutomirski wrote:
> >  /* context.lock is held for us, so we don't need any locking. */
> >  static void flush_ldt(void *current_mm)
> >  {
> > +   struct mm_struct *mm = current_mm;
> >     mm_context_t *pc;
> >  
> > -   if (current->active_mm != current_mm)
> > +   if (this_cpu_read(cpu_tlbstate.loaded_mm) != current_mm)
> 
> While functionally correct, this really should compare against 'mm'.
> 
> >         return;
> >  
> > -   pc = &current->active_mm->context;
> > +   pc = &mm->context;

So this appears to be the function:

 static void flush_ldt(void *current_mm)
 {
	struct mm_struct *mm = current_mm;
	mm_context_t *pc;

	if (this_cpu_read(cpu_tlbstate.loaded_mm) != current_mm)
		return;

	pc = &mm->context;
	set_ldt(pc->ldt->entries, pc->ldt->size);
 }

why not rename 'current_mm' to 'mm' and remove the 'mm' local variable?
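
For illustration only, here is a user-space sketch of one shape the cleanup
could take. It assumes the callback keeps its void * argument, in which case a
typed pointer is still needed before ->context can be used, so the sketch
keeps a local named 'mm' and uses it for both the comparison and the
dereference. The struct layouts, loaded_mm, set_ldt_model() and
flush_ldt_demo() are stand-ins invented for this example, not the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types (assumptions for the sketch). */
struct ldt_struct { int size; };
typedef struct { struct ldt_struct *ldt; } mm_context_t;
struct mm_struct { mm_context_t context; };

static struct mm_struct *loaded_mm;	/* models cpu_tlbstate.loaded_mm */
static int last_ldt_size = -1;		/* what the set_ldt() stand-in saw */

static void set_ldt_model(int size)
{
	last_ldt_size = size;
}

static void flush_ldt(void *arg)
{
	struct mm_struct *mm = arg;	/* void * cannot be dereferenced */
	mm_context_t *pc;

	assert(mm != NULL);

	/* Compare against the typed pointer, then use the same pointer. */
	if (loaded_mm != mm)
		return;

	pc = &mm->context;
	set_ldt_model(pc->ldt->size);
}

/* Returns the LDT size that was loaded, or -1 if the flush was skipped. */
static int flush_ldt_demo(int mm_is_loaded)
{
	static struct ldt_struct ldt = { 16 };
	static struct mm_struct mm = { { &ldt } };

	loaded_mm = mm_is_loaded ? &mm : NULL;
	last_ldt_size = -1;
	flush_ldt(&mm);
	return last_ldt_size;
}
```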

Thanks,

Ingo


Re: [PATCH v2] perf report: distinguish between inliners in the same function

2017-05-09 Thread Namhyung Kim
Hi,

On Wed, May 03, 2017 at 11:35:36PM +0200, Milian Wolff wrote:
> When different functions get inlined into the same function, we
> want to show them individually in the reports. But when we group by
> function, we would aggregate all IPs and would only keep the first
> one in that function. E.g. for C++ code like the following:
> 
> ~
> #include <cmath>
> #include <random>
> #include <iostream>
> 
> using namespace std;
> 
> int main()
> {
>     uniform_real_distribution<double> uniform(-1E5, 1E5);
>     default_random_engine engine;
>     double s = 0;
>     for (int i = 0; i < 1000; ++i) {
>         s += uniform(engine);
>     }
>     cout << "random sum: " << s << '\n';
>     return 0;
> }
> ~
> 
> Building it with `g++ -O2 -g` and recording some samples with
> `perf record --call-graph dwarf` yields for me:
> 
> ~
> $ perf report --stdio --inline
> # Overhead  CommandShared Object  Symbol
> #   .  .  
> #
> 99.40%99.11%  a.outa.out[.] main
> |
>  --99.11%--_start
>__libc_start_main
>main
> ...
> ~
> 
> Note how no inlined frames are actually shown, because the first
> sample in main points to an IP that does not correspond to any
> inlined frames.
> 
> With this patch applied, we instead get the following, much more
> meaningful, reports.
> 
> ~
> $ perf report --stdio --inline --no-children
> # Overhead  CommandShared Object  Symbol
> #   .  .  
> #
> 99.11%  a.outa.out [.] main
> |
> |--48.15%--main
> |  
> std::__detail::_Adaptor 16807ul, 0ul, 2147483647ul>, double>::operator() (inline)
> |  
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
> |  
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
> |  main (inline)
> |  __libc_start_main
> |  _start
> |
> |--47.61%--main
> |  std::__detail::__mod 16807ul, 0ul> (inline)
> |  std::linear_congruential_engine 16807ul, 0ul, 2147483647ul>::operator() (inline)
> |  std::generate_canonical std::linear_congruential_engine > 
> (inline)
> |  
> std::__detail::_Adaptor 16807ul, 0ul, 2147483647ul>, double>::operator() (inline)
> |  
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
> |  
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
> |  main (inline)
> |  __libc_start_main
> |  _start
> |
>  --3.35%--main
>
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
>main (inline)
>__libc_start_main
>_start
> ...
> 
> $ perf report --stdio --inline
> # Children  Self  Command  Shared ObjectSymbol
> #     ...  ...  
> 
> #
> 99.40%99.11%  a.outa.out[.] main
> |
>  --99.11%--_start
>__libc_start_main
>|
>|--70.51%--main
>|  main (inline)
>|  
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
>|  
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
>|  
> std::__detail::_Adaptor 16807ul, 0ul, 2147483647ul>, double>::operator() (inline)
>|
>|--25.25%--main
>|  main (inline)
>|  
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
>|  
> std::uniform_real_distribution::operator()  long, 16807ul, 0ul, 2147483647ul> > (inline)
>|  
> std::__detail::_Adaptor 16807ul, 0ul, 2147483647ul>, double>::operator() (inline)
>|  std::generate_canonical std::linear_congruential_engine > 
> (inline)
>|  std::linear_congruential_engine long, 16807ul, 0ul, 2147483647ul>::operator() (inline)
>|  std::__detail::__mod 2147483647ul, 16807ul, 0ul> (inline)
>|
> --3.35%--main
>   

Re: [PATCH -v3 0/13] mm: make movable onlining suck less

2017-05-09 Thread Michal Hocko
On Tue 09-05-17 21:43:16, Dan Williams wrote:
> On Fri, Apr 21, 2017 at 5:05 AM, Michal Hocko  wrote:
> > Hi,
> > > The last version of this series has been posted here [1]. It has seen
> > > some more testing (thanks to Reza Arbab and Igor Mammedov[2]), Jerome's
> > > and Vlastimil's review resulted in a few fixes, mostly folded into their
> > > respective patches.
> > > There are 4 more patches (patch 6+ in this series).  I have checked the
> > > most prominent pfn walkers to skip over offline holes and now I feel
> > > more comfortable to have this merged. All the reported issues should be
> > > fixed.
> >
> > There is still a lot of work on top - namely this implementation doesn't
> > support reonlining to a different zone on the zones boundaries but I
> > will do that in a separate series because this one is getting quite
> > large already and it should work reasonably well now.
> >
> > > Joonsoo had some worries about pfn_valid and suggested to change its
> > > semantic to return false on offline holes but I would be really worried
> > > to change an established semantic used by a lot of code and so I have
> > > introduced the pfn_to_online_page helper instead. If this is seen as a
> > > controversial point I would rather drop pfn_to_online_page and related
> > > patches as they are not strictly necessary because the code would be
> > > similarly broken as now wrt. offline holes.
> >
> > This is a rebase on top of linux-next (next-20170418) and the full
> > series is in git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> > try attempts/rewrite-mem_hotplug branch.
> >
> [..]
> > > Any thoughts, complaints, suggestions?
> >
> > As a bonus we will get a nice cleanup in the memory hotplug codebase.
> > >  arch/ia64/mm/init.c            |  11 +-
> > >  arch/powerpc/mm/mem.c          |  12 +-
> > >  arch/s390/mm/init.c            |  32 +--
> > >  arch/sh/mm/init.c              |  10 +-
> > >  arch/x86/mm/init_32.c          |   7 +-
> > >  arch/x86/mm/init_64.c          |  11 +-
> > >  drivers/base/memory.c          |  79 +++
> > >  drivers/base/node.c            |  58 ++
> > >  include/linux/memory_hotplug.h |  40 +++-
> > >  include/linux/mmzone.h         |  44 +++-
> > >  include/linux/node.h           |  35 +++-
> > >  kernel/memremap.c              |   6 +-
> > >  mm/compaction.c                |   5 +-
> > >  mm/memory_hotplug.c            | 455 ++---
> > >  mm/page_alloc.c                |  13 +-
> > >  mm/page_isolation.c            |  26 ++-
> > >  mm/sparse.c                    |  48 -
> > >  17 files changed, 407 insertions(+), 485 deletions(-)
> >
> > Shortlog says:
> > Michal Hocko (13):
> > >   mm: remove return value from init_currently_empty_zone
> > >   mm, memory_hotplug: use node instead of zone in can_online_high_movable
> > >   mm: drop page_initialized check from get_nid_for_pfn
> > >   mm, memory_hotplug: get rid of is_zone_device_section
> > >   mm, memory_hotplug: split up register_one_node
> > >   mm, memory_hotplug: consider offline memblocks removable
> > >   mm: consider zone which is not fully populated to have holes
> > >   mm, compaction: skip over holes in __reset_isolation_suitable
> > >   mm: __first_valid_page skip over offline pages
> > >   mm, memory_hotplug: do not associate hotadded memory to zones until online
> > >   mm, memory_hotplug: replace for_device by want_memblock in arch_add_memory
> > >   mm, memory_hotplug: fix the section mismatch warning
> > >   mm, memory_hotplug: remove unused cruft after memory hotplug rework
> >
> > [1] http://lkml.kernel.org/r/20170410110351.12215-1-mho...@kernel.org
> > [2] http://lkml.kernel.org/r/20170410162749.7d7f3...@nial.brq.redhat.com
> >
> >
> 
> The latest "attempts/rewrite-mem_hotplug" branch passes my regression
> testing if I cherry-pick the following x86/mm fixes from mainline:
> 
> e6ab9c4d4377 x86/mm/64: Fix crash in remove_pagetable()
> 71389703839e mm, zone_device: Replace {get, put}_zone_device_page()
> with a single reference to fix pmem crash

I will make sure those appear in the mmotm git tree (I will probably
pull the whole tip/x86-mm-for-linus).
 
> You can add:
> 
> Tested-by: Dan Williams 

Thanks a lot for your testing! I will add your Tested-by to the patches
where you were on the CC explicitly (and which might affect zone device):
- mm, memory_hotplug: get rid of is_zone_device_section
- mm, memory_hotplug: do not associate hotadded memory to zones until
  online
- mm, memory_hotplug: replace for_device by want_memblock in
  arch_add_memory

Let me know if you want other patches as well.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH] x86/PCI: fix duplicate 'const' declaration specifier

2017-05-09 Thread Nick Desaulniers
Please disregard this patch.  I think I may have found a bug in Clang,
or at least an incompatibility with GCC 7.1.

Clang bug: https://bugs.llvm.org/show_bug.cgi?id=32985


Re: [PATCH] drm/mm: fix duplicate 'const' declaration specifier

2017-05-09 Thread Nick Desaulniers
Please disregard this patch.  I think I may have found a bug in Clang,
or at least an incompatibility with GCC 7.1.

Clang bug: https://bugs.llvm.org/show_bug.cgi?id=32985


Re: [PATCH] x86/power/64: Use char arrays for asm function names

2017-05-09 Thread Ingo Molnar

* Rafael J. Wysocki  wrote:

> On Tuesday, May 09, 2017 02:00:51 PM Kees Cook wrote:
> > This switches the hibernate_64.S function names into character arrays
> > to match other areas of the kernel where this is done (e.g., linker
> > scripts). Specifically this fixes a compile-time error noticed by the
> > future CONFIG_FORTIFY_SOURCE routines that complained about PAGE_SIZE
> > being copied out of the "single byte" core_restore_code variable.
> > 
> > Additionally drops the "acpi_save_state_mem" extern which does not
> > appear to be used anywhere else in the kernel.
> > 
> > Cc: Daniel Micay 
> > Signed-off-by: Kees Cook 
> 
> Acked-by: Rafael J. Wysocki 
> 
> Or I can queue this up if that's preferred.

LGTM too!

Acked-by: Ingo Molnar 

Thanks,

Ingo


Re: [PATCH v8] Input: psxpad-spi - Add PlayStation 1/2 joypads via SPI interface Driver

2017-05-09 Thread Tomohiro Yoshidomi
Mr. Torokhov

Hi.

> Could you please test the version of the patch I sent to you? It had
> more changes than the 2 you mentioned above.

Sorry, in which mail did you send me the patch?
I searched my old mails, but I couldn't find it.

Could you please tell me when you mailed the patch?

Thanks.

---
Tomohiro 


Re: [v4 3/4] iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74

2017-05-09 Thread Linu Cherian
> On Tue May 09, 2017 at 02:02:58PM +0100, Robin Murphy wrote:
> > On 09/05/17 12:45, Geetha sowjanya wrote:
> > > From: Linu Cherian 
> > > 
> > > Cavium ThunderX2 SMMU implementation doesn't support page 1 register space
> > > and PAGE0_REGS_ONLY option is enabled as an errata workaround.
> > > This option when turned on, replaces all page 1 offsets used for
> > > EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets.
> > > 
> > > SMMU resource size checks are now based on SMMU option PAGE0_REGS_ONLY,
> > > since resource size can be either 64k/128k.
> > > For this, arm_smmu_device_dt_probe/acpi_probe has been moved before
> > > platform_get_resource call, so that SMMU options are set beforehand.
> > > 
> > > Signed-off-by: Linu Cherian 
> > > Signed-off-by: Geetha Sowjanya 
> > > ---
> > >  Documentation/arm64/silicon-errata.txt |  1 +
> > >  .../devicetree/bindings/iommu/arm,smmu-v3.txt  |  6 ++
> > >  drivers/iommu/arm-smmu-v3.c| 80 
> > > --
> > >  3 files changed, 66 insertions(+), 21 deletions(-)
> > > 
> > > diff --git a/Documentation/arm64/silicon-errata.txt 
> > > b/Documentation/arm64/silicon-errata.txt
> > > index 10f2ddd..4693a32 100644
> > > --- a/Documentation/arm64/silicon-errata.txt
> > > +++ b/Documentation/arm64/silicon-errata.txt
> > > @@ -62,6 +62,7 @@ stable kernels.
> > >  | Cavium | ThunderX GICv3  | #23154  | 
> > > CAVIUM_ERRATUM_23154|
> > >  | Cavium | ThunderX Core   | #27456  | 
> > > CAVIUM_ERRATUM_27456|
> > >  | Cavium | ThunderX SMMUv2 | #27704  | N/A   
> > >   |
> > > +| Cavium | ThunderX2 SMMUv3| #74 | N/A   
> > >   |
> > >  || | |   
> > >   |
> > >  | Freescale/NXP  | LS2080A/LS1043A | A-008585| 
> > > FSL_ERRATUM_A008585 |
> > >  || | |   
> > >   |
> > > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt 
> > > b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> > > index be57550..e6da62b 100644
> > > --- a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> > > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> > > @@ -49,6 +49,12 @@ the PCIe specification.
> > >  - hisilicon,broken-prefetch-cmd
> > >  : Avoid sending CMD_PREFETCH_* commands to the SMMU.
> > >  
> > > +- cavium-cn99xx,broken-page1-regspace
> > > +: Replaces all page 1 offsets used for 
> > > EVTQ_PROD/CONS,
> > > + PRIQ_PROD/CONS register access 
> > > with page 0 offsets.
> > > + Set for Caviun ThunderX2 
> > > silicon that doesn't support
> > > + SMMU page1 register space.
> > > +
> > >  ** Example
> > >  
> > >  smmu@2b40 {
> > > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> > > index 380969a..1e986a0 100644
> > > --- a/drivers/iommu/arm-smmu-v3.c
> > > +++ b/drivers/iommu/arm-smmu-v3.c
> > > @@ -176,15 +176,15 @@
> > >  #define ARM_SMMU_CMDQ_CONS   0x9c
> > >  
> > >  #define ARM_SMMU_EVTQ_BASE   0xa0
> > > -#define ARM_SMMU_EVTQ_PROD   0x100a8
> > > -#define ARM_SMMU_EVTQ_CONS   0x100ac
> > > +#define ARM_SMMU_EVTQ_PROD(smmu) (page1_offset_adjust(0x100a8, smmu))
> > > +#define ARM_SMMU_EVTQ_CONS(smmu) (page1_offset_adjust(0x100ac, smmu))
> > 
> > Sorry, perhaps I should have communicated the rest of the idea more
> > explicitly - you now don't need to change these definitions...
> > 
> 
> Fine. 
> 
> 
> > >  #define ARM_SMMU_EVTQ_IRQ_CFG0   0xb0
> > >  #define ARM_SMMU_EVTQ_IRQ_CFG1   0xb8
> > >  #define ARM_SMMU_EVTQ_IRQ_CFG2   0xbc
> > >  
> > >  #define ARM_SMMU_PRIQ_BASE   0xc0
> > > -#define ARM_SMMU_PRIQ_PROD   0x100c8
> > > -#define ARM_SMMU_PRIQ_CONS   0x100cc
> > > +#define ARM_SMMU_PRIQ_PROD(smmu) (page1_offset_adjust(0x100c8, smmu))
> > > +#define ARM_SMMU_PRIQ_CONS(smmu) (page1_offset_adjust(0x100cc, smmu))
> > >  #define ARM_SMMU_PRIQ_IRQ_CFG0   0xd0
> > >  #define ARM_SMMU_PRIQ_IRQ_CFG1   0xd8
> > >  #define ARM_SMMU_PRIQ_IRQ_CFG2   0xdc
> > > @@ -412,6 +412,9 @@
> > >  #define MSI_IOVA_BASE0x800
> > >  #define MSI_IOVA_LENGTH  0x10
> > >  
> > > +#define ARM_SMMU_PAGE0_REGS_ONLY(smmu)   \
> > > + ((smmu)->options & ARM_SMMU_OPT_PAGE0_REGS_ONLY)
> > > +
> > >  static bool disable_bypass;
> > >  module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
> > >  MODULE_PARM_DESC(disable_bypass,
> > > @@ -597,6 +600,7 @@ 

Re: [v4 3/4] iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74

2017-05-09 Thread Linu Cherian
> On Tue May 09, 2017 at 02:02:58PM +0100, Robin Murphy wrote:
> > On 09/05/17 12:45, Geetha sowjanya wrote:
> > > From: Linu Cherian 
> > > 
> > > Cavium ThunderX2 SMMU implementation doesn't support page 1 register space
> > > and PAGE0_REGS_ONLY option is enabled as an errata workaround.
> > > This option when turned on, replaces all page 1 offsets used for
> > > EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets.
> > > 
> > > SMMU resource size checks are now based on SMMU option PAGE0_REGS_ONLY,
> > > since resource size can be either 64k/128k.
> > > For this, arm_smmu_device_dt_probe/acpi_probe has been moved before
> > > platform_get_resource call, so that SMMU options are set beforehand.
> > > 
> > > Signed-off-by: Linu Cherian 
> > > Signed-off-by: Geetha Sowjanya 
> > > ---
> > >  Documentation/arm64/silicon-errata.txt |  1 +
> > >  .../devicetree/bindings/iommu/arm,smmu-v3.txt  |  6 ++
> > >  drivers/iommu/arm-smmu-v3.c| 80 
> > > --
> > >  3 files changed, 66 insertions(+), 21 deletions(-)
> > > 
> > > diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
> > > index 10f2ddd..4693a32 100644
> > > --- a/Documentation/arm64/silicon-errata.txt
> > > +++ b/Documentation/arm64/silicon-errata.txt
> > > @@ -62,6 +62,7 @@ stable kernels.
> > >  | Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154        |
> > >  | Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456        |
> > >  | Cavium         | ThunderX SMMUv2 | #27704          | N/A                         |
> > > +| Cavium         | ThunderX2 SMMUv3| #74             | N/A                         |
> > >  |                |                 |                 |                             |
> > >  | Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585         |
> > >  |                |                 |                 |                             |
> > > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> > > index be57550..e6da62b 100644
> > > --- a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> > > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> > > @@ -49,6 +49,12 @@ the PCIe specification.
> > >  - hisilicon,broken-prefetch-cmd
> > >  : Avoid sending CMD_PREFETCH_* commands to the SMMU.
> > >  
> > > +- cavium-cn99xx,broken-page1-regspace
> > > +: Replaces all page 1 offsets used for EVTQ_PROD/CONS,
> > > +  PRIQ_PROD/CONS register access with page 0 offsets.
> > > +  Set for Caviun ThunderX2 silicon that doesn't support
> > > +  SMMU page1 register space.
> > > +
> > >  ** Example
> > >  
> > >  smmu@2b40 {
> > > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> > > index 380969a..1e986a0 100644
> > > --- a/drivers/iommu/arm-smmu-v3.c
> > > +++ b/drivers/iommu/arm-smmu-v3.c
> > > @@ -176,15 +176,15 @@
> > >  #define ARM_SMMU_CMDQ_CONS   0x9c
> > >  
> > >  #define ARM_SMMU_EVTQ_BASE   0xa0
> > > -#define ARM_SMMU_EVTQ_PROD   0x100a8
> > > -#define ARM_SMMU_EVTQ_CONS   0x100ac
> > > +#define ARM_SMMU_EVTQ_PROD(smmu) (page1_offset_adjust(0x100a8, smmu))
> > > +#define ARM_SMMU_EVTQ_CONS(smmu) (page1_offset_adjust(0x100ac, smmu))
> > 
> > Sorry, perhaps I should have communicated the rest of the idea more
> > explicitly - you now don't need to change these definitions...
> > 
> 
> Fine. 
> 
> 
> > >  #define ARM_SMMU_EVTQ_IRQ_CFG0   0xb0
> > >  #define ARM_SMMU_EVTQ_IRQ_CFG1   0xb8
> > >  #define ARM_SMMU_EVTQ_IRQ_CFG2   0xbc
> > >  
> > >  #define ARM_SMMU_PRIQ_BASE   0xc0
> > > -#define ARM_SMMU_PRIQ_PROD   0x100c8
> > > -#define ARM_SMMU_PRIQ_CONS   0x100cc
> > > +#define ARM_SMMU_PRIQ_PROD(smmu) (page1_offset_adjust(0x100c8, smmu))
> > > +#define ARM_SMMU_PRIQ_CONS(smmu) (page1_offset_adjust(0x100cc, smmu))
> > >  #define ARM_SMMU_PRIQ_IRQ_CFG0   0xd0
> > >  #define ARM_SMMU_PRIQ_IRQ_CFG1   0xd8
> > >  #define ARM_SMMU_PRIQ_IRQ_CFG2   0xdc
> > > @@ -412,6 +412,9 @@
> > >  #define MSI_IOVA_BASE0x800
> > >  #define MSI_IOVA_LENGTH  0x10
> > >  
> > > +#define ARM_SMMU_PAGE0_REGS_ONLY(smmu)   \
> > > + ((smmu)->options & ARM_SMMU_OPT_PAGE0_REGS_ONLY)
> > > +
> > >  static bool disable_bypass;
> > >  module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
> > >  MODULE_PARM_DESC(disable_bypass,
> > > @@ -597,6 +600,7 @@ struct arm_smmu_device {
> > >   u32 features;
> > >  
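
As an illustration of the offset adjustment discussed in the quoted patch, here is a minimal user-space sketch. It assumes page 1 starts 64K above page 0 (consistent with 0x100a8 versus 0xa8 above); the names `sketch_smmu`, `SKETCH_OPT_PAGE0_REGS_ONLY` and `sketch_page1_offset_adjust()` are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical option bit; the real driver's bit position may differ. */
#define SKETCH_OPT_PAGE0_REGS_ONLY	(1u << 0)
/* Assumption: page 1 registers start 64K above page 0. */
#define SKETCH_PAGE1_BASE		0x10000u

struct sketch_smmu {
	uint32_t options;
};

/*
 * Map a page 1 register offset onto page 0 when the workaround option is
 * set. Offsets that already live in page 0 are returned unchanged.
 */
static uint32_t sketch_page1_offset_adjust(uint32_t off,
					   const struct sketch_smmu *smmu)
{
	if ((smmu->options & SKETCH_OPT_PAGE0_REGS_ONLY) &&
	    off >= SKETCH_PAGE1_BASE)
		return off - SKETCH_PAGE1_BASE;
	return off;
}
```

With the option set, an EVTQ_PROD offset of 0x100a8 would resolve to 0xa8; without it the offset is left alone, so the register definitions themselves do not need to change.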

Re: [PATCH] nfsd: avoid out of bounds read on array nfsd4_layout_ops

2017-05-09 Thread Ari Kauppi

> On 10.5.2017, at 0.14, Colin Ian King  wrote:
> 
> On 09/05/17 22:03, J . Bruce Fields wrote:
>> On Tue, May 09, 2017 at 05:04:14PM +0300, Dan Carpenter wrote:
>>> On Tue, May 09, 2017 at 02:31:21PM +0100, Colin King wrote:
>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>> index 1dbf62190bee..c453a1998e00 100644
>>>> --- a/fs/nfsd/nfs4proc.c
>>>> +++ b/fs/nfsd/nfs4proc.c
>>>> @@ -1259,7 +1259,8 @@ nfsd4_layout_verify(struct svc_export *exp, unsigned int layout_type)
>>>>  		return NULL;
>>>>  	}
>>>>  
>>>> -	if (layout_type >= 32 || !(exp->ex_layout_types & (1 << layout_type))) {
>>>> +	if (layout_type >= LAYOUT_TYPE_MAX ||
>>>> +	    !(exp->ex_layout_types & (1 << layout_type))) {
>>> 
>>> The 32 is there to prevent a shift wrapping bug.  The bit test prevents
>>> a buffer overflow so this can't actually overflow.
>> 
>> Yes, looks like a false positive for coverity.
>> 
>>> But this change doesn't hurt and is probably cleaner.
>> 
>> Sure.  Hope it's OK if I just merge this into the previous commit:
> 
> Fine by me.  Colin

Looks good to me.

Thanks,

--
Ari
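
The ordering of the two checks discussed above can be sketched outside the kernel. This is a minimal example in plain C, assuming a 32-bit mask; `layout_allowed()` and `LAYOUT_TYPE_MAX_SKETCH` are hypothetical stand-ins, and the value 5 is illustrative rather than the kernel's definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the number of entries in the ops array. */
#define LAYOUT_TYPE_MAX_SKETCH 5u

/*
 * Return true only if layout_type is a valid array index and its bit is
 * set in the mask. The range check runs first, so the shift count is
 * always smaller than the width of the 32-bit mask.
 */
static bool layout_allowed(uint32_t mask, unsigned int layout_type)
{
	if (layout_type >= LAYOUT_TYPE_MAX_SKETCH)
		return false;
	return (mask & (UINT32_C(1) << layout_type)) != 0;
}
```

Bounding by the array size rather than by 32 covers both concerns at once: the shift cannot wrap, and a set bit can never select an index past the end of the array.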


[PATCH] socfpga_a10: reset CPU1 in socfpga_cpu_kill()

2017-05-09 Thread yanjiang.jin
From: Yanjiang Jin 

Kexec's second kernel would hang if CPU1 isn't reset.

Signed-off-by: Yanjiang Jin 
---
 arch/arm/mach-socfpga/platsmp.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach-socfpga/platsmp.c
index 0ee7677..db3940e 100644
--- a/arch/arm/mach-socfpga/platsmp.c
+++ b/arch/arm/mach-socfpga/platsmp.c
@@ -117,6 +117,16 @@ static int socfpga_cpu_kill(unsigned int cpu)
 {
return 1;
 }
+
+static int socfpga_a10_cpu_kill(unsigned int cpu)
+{
+   /* This will put CPU #1 into reset. */
+   if (socfpga_cpu1start_addr)
+   writel(RSTMGR_MPUMODRST_CPU1, rst_manager_base_addr +
+   SOCFPGA_A10_RSTMGR_MODMPURST);
+
+   return 1;
+}
 #endif
 
 static const struct smp_operations socfpga_smp_ops __initconst = {
@@ -133,7 +143,7 @@ static int socfpga_cpu_kill(unsigned int cpu)
.smp_boot_secondary = socfpga_a10_boot_secondary,
 #ifdef CONFIG_HOTPLUG_CPU
.cpu_die= socfpga_cpu_die,
-   .cpu_kill   = socfpga_cpu_kill,
+   .cpu_kill   = socfpga_a10_cpu_kill,
 #endif
 };
 
-- 
1.9.1



Re: WMI and Kernel:User interface

2017-05-09 Thread Greg Kroah-Hartman
On Tue, May 09, 2017 at 04:16:39PM -0700, Darren Hart wrote:
> Linus and Greg,
> 
> We are in the process of redesigning the Windows Management Instrumentation
> (WMI) [1] system in the kernel. WMI is the Microsoft implementation of Web-Based
> Enterprise Management (WBEM). We are looking to provide WMI access to userspace,
> while allowing the kernel to filter requests that conflict with its own usage.
> We'd like your take on how this approach relates to our commitment to not break
> userspace.
>
> For this discussion, we are specifically referring to ACPI PNP0C14 WMI
> devices, consisting of a GUID and a set of methods and events, as well as a
> precompiled intermediate description of the methods and arguments (MOF). Exposed
> to userspace, these methods provide for BIOS interaction and are used for system
> management as well as LEDs, hot keys, radio switches, etc. There is vendor
> interest in achieving feature parity with Windows by exposing WMI methods to
> userspace for system management.
>
> While it appears WMI intended to be accessed from userspace, we have
> made use of it in the kernel to support various laptop features by connecting
> the WMI methods to other subsystems, notably input, leds, and rfkill [2]. The
> challenge is continuing to use WMI for these platform features, while allowing
> userspace to use it for system management tasks. Unfortunately, the WMI methods
> are not guaranteed to be split up along granular functional lines, and we will
> certainly face situations where the same GUID::METHOD_ID will be needed for a
> kernel feature (say LED support) as well as a system management task.
>
> To address this, I have proposed [3] that exporting WMI be opt-in, only done at
> the request of and in collaboration with a vendor, with the kernel platform
> driver given the opportunity to filter requests. This filtering would need to be
> at the method and argument inspection level, such as checking for specific bits
> in the input buffer, and rejecting the request if they conflict with an in
> kernel usage (that's worst case, in some cases just GUID or method ID could be
> sufficient).
>
> Because the kernel and the platform drivers are under continual development, and
> new systems appear regularly, we will encounter necessary changes to the
> platform driver WMI request filters. These changes could be considered a change
> to the kernel provided WMI interface to userspace. For example, we could
> regularly accept a call to $GUID::$METHOD_ID with bit 4 of the buffer set, and
> later deny the call when we determine it interferes with kernel usage.
>
> In your view, is it acceptable to provide a chardev interface, for example,
> exposing WMI methods to userspace, with the understanding that the kernel may
> choose to filter certain requests which conflict with its own use? And that this
> filtering may change as new features are added to the platform drivers?

So, for example, if a new driver for a "brightness key" were added to
the kernel, all of a sudden the "raw" access to the wmi data through the
chardev would be filtered away by the kernel and not seen by userspace?

Why would you want to do that?  What's wrong with providing "raw" access
through a chardev, and the current in-kernel access as well at the same
time?

I don't really understand what would "break" over time here.

thanks,

greg k-h
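
For readers following the thread, the argument-level filtering described in the quoted proposal might look roughly like the sketch below. It is an assumption-laden illustration in plain C, not an existing kernel interface: `sketch_wmi_rule`, `sketch_wmi_request_allowed()` and the idea of masking only the first buffer byte are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical rule: deny a call when any masked bit of byte 0 is set. */
struct sketch_wmi_rule {
	const char *guid;	/* GUID kept as a string for readability */
	uint32_t method_id;
	uint8_t deny_mask;	/* bits of buf[0] reserved for kernel use */
};

/*
 * Return false if any rule for this GUID and method rejects the input
 * buffer; requests with no matching rule pass through untouched.
 */
static bool sketch_wmi_request_allowed(const struct sketch_wmi_rule *rules,
				       size_t nrules, const char *guid,
				       uint32_t method_id,
				       const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < nrules; i++) {
		if (strcmp(rules[i].guid, guid) != 0 ||
		    rules[i].method_id != method_id)
			continue;
		if (len > 0 && (buf[0] & rules[i].deny_mask))
			return false;
	}
	return true;
}
```

The concern raised in the thread maps onto the rule table: adding a rule in a later kernel release turns a request that used to succeed into one that is denied, which is the behavior change userspace would observe.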


[PATCH] socfpga_a10: fix a kexec boot issue

2017-05-09 Thread yanjiang.jin
From: Yanjiang Jin 

Other socfpga boards may need this patch as well, but I only have an
Arria10 board.
So add a new function socfpga_a10_cpu_kill() and keep the old function
socfpga_cpu_kill() for other boards.
I also verified CPU_HOTPLUG after applying this patch; everything seems to work.

Test steps:

1. Enable kexec and build a SOCFPGA kernel;
2. Use zImage as 1st and 2nd kernel;
3. kexec -l /root/zImage --append="`cat /proc/cmdline`" 
4. kexec -e

Test env:

U-Boot 2014.10 (Jan 13 2016 - 11:07:09)

CPU   : Altera SOCFPGA Arria 10 Platform
BOARD : Altera SOCFPGA Arria 10 Dev Kit

-- 
1.9.1



Re: [patch V2 24/24] cpu/hotplug: Convert hotplug locking to percpu rwsem

2017-05-09 Thread Michael Ellerman
Thomas Gleixner <t...@linutronix.de> writes:

> @@ -130,6 +130,7 @@ void __static_key_slow_inc(struct static
>* the all CPUs, for that to be serialized against CPU hot-plug
>* we need to avoid CPUs coming online.
>*/
> + lockdep_assert_hotplug_held();
>   jump_label_lock();
>   if (atomic_read(>enabled) == 0) {
>   atomic_set(>enabled, -1);

I seem to be hitting this assert from the ftrace event selftests,
enabled at boot with CONFIG_FTRACE_STARTUP_TEST=y, using next-20170509
(on powerpc).

[  842.691191] Testing event rpc_call_status: 
[  842.691209] [ cut here ]
[  842.691399] WARNING: CPU: 6 PID: 1 at ../kernel/cpu.c:234 lockdep_assert_hotplug_held+0x5c/0x70
[  842.691575] Modules linked in:
[  842.691675] CPU: 6 PID: 1 Comm: swapper/0 Tainted: G    W   4.11.0-gcc-5.4.1-next-20170509 #218
[  842.691865] task: c001fe78 task.stack: c001fe80
[  842.692003] NIP: c00ff3dc LR: c00ff3d0 CTR: c0218650
[  842.692166] REGS: c001fe8036e0 TRAP: 0700   Tainted: G    W    (4.11.0-gcc-5.4.1-next-20170509)
[  842.692343] MSR: 80029033 <SF,EE,ME,IR,DR,RI,LE>
[  842.692491]   CR: 2800  XER: 2000
[  842.692689] CFAR: c0171530 SOFTE: 1 
   GPR00: c00ff3d0 c001fe803960 c12b7600 
 
   GPR04:   c000fc10c0e8 
 
   GPR08:    
c000f8180008 
   GPR12: 2200 cfd42100 c000e218 
 
   GPR16:    
 
   GPR20:    
c000f9341610 
   GPR24: c127ee48 c0aa49d0 000a 
c000fc3c 
   GPR28: c117b148 c1264230  
c127ee48 
[  842.694287] NIP [c00ff3dc] lockdep_assert_hotplug_held+0x5c/0x70
[  842.694434] LR [c00ff3d0] lockdep_assert_hotplug_held+0x50/0x70
[  842.694577] Call Trace:
[  842.694658] [c001fe803960] [c00ff3d0] lockdep_assert_hotplug_held+0x50/0x70 (unreliable)
[  842.694876] [c001fe803980] [c02a3754] __static_key_slow_inc+0x104/0x170
[  842.695054] [c001fe8039f0] [c02176ac] tracepoint_probe_register_prio+0x2dc/0x390
[  842.695258] [c001fe803a60] [c024cf50] trace_event_reg+0xe0/0x130
[  842.695434] [c001fe803a80] [c024d5f0] __ftrace_event_enable_disable+0x270/0x3e0
[  842.695601] [c001fe803b10] [c0e20328] event_trace_self_tests+0x14c/0x350
[  842.695778] [c001fe803bc0] [c0e20774] event_trace_self_tests_init+0xc8/0xf4
[  842.695944] [c001fe803c30] [c000d87c] do_one_initcall+0x6c/0x1d0
[  842.696113] [c001fe803cf0] [c0df462c] kernel_init_freeable+0x304/0x3e4
[  842.696282] [c001fe803dc0] [c000e23c] kernel_init+0x2c/0x170
[  842.696460] [c001fe803e30] [c000bdec] ret_from_kernel_thread+0x5c/0x70
[  842.696662] Instruction dump:
[  842.696763] 409e0014 38210020 e8010010 7c0803a6 4e800020 3c62ffe6 3880 38634808
[  842.697009] 480720ed 6000 2fa3 409effd8 <0fe0> 38210020 e8010010 7c0803a6
[  842.697271] ---[ end trace f68728a0d30544a1 ]---


The stupidly obvious (or perhaps obviously stupid) patch below fixes it:

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index daefdee9411a..5531f7ce8fa6 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -3241,9 +3241,19 @@ static __init void event_trace_self_tests(void)
continue;
}
 
+   get_online_cpus();
+   mutex_lock(_mutex);
ftrace_event_enable_disable(file, 1);
+   mutex_unlock(_mutex);
+   put_online_cpus();
+
event_test_stuff();
+
+   get_online_cpus();
+   mutex_lock(_mutex);
ftrace_event_enable_disable(file, 0);
+   mutex_unlock(_mutex);
+   put_online_cpus();
 
pr_cont("OK\n");
}

cheers
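
The pattern behind the warning and the proposed fix can be shown with a toy, single-threaded model. Everything below is a hypothetical stand-in with a `sketch_` prefix; it only mimics the intent of the kernel primitives named in the trace (an assertion that complains when the caller does not hold the hotplug lock) and is not how the kernel implements them.

```c
/* Toy stand-ins for the kernel primitives; single-threaded on purpose. */
static int sketch_hotplug_depth;	/* read-side nesting depth of the lock */
static int sketch_warn_count;		/* how many times the assertion fired */

static void sketch_get_online_cpus(void) { sketch_hotplug_depth++; }
static void sketch_put_online_cpus(void) { sketch_hotplug_depth--; }

/* Complain, but do not block, when the lock is not held by the caller. */
static void sketch_assert_hotplug_held(void)
{
	if (sketch_hotplug_depth == 0)
		sketch_warn_count++;
}

/* Callee that documents its locking requirement with the assertion. */
static void sketch_static_key_slow_inc(int *enabled)
{
	sketch_assert_hotplug_held();
	(*enabled)++;
}

/* Caller following the shape of the proposed fix: lock, call, unlock. */
static void sketch_enable_event(int *enabled)
{
	sketch_get_online_cpus();
	sketch_static_key_slow_inc(enabled);
	sketch_put_online_cpus();
}
```

Calling `sketch_static_key_slow_inc()` directly bumps the warning counter, while going through `sketch_enable_event()` does not, which is the difference the selftest change is meant to make.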


Re: [PATCH] net: dsa: loop: Check for memory allocation failure

2017-05-09 Thread Marion & Christophe JAILLET



On 10/05/2017 at 06:46, Julia Lawall wrote:


On Wed, 10 May 2017, Christophe JAILLET wrote:


On 09/05/2017 at 17:18, Joe Perches wrote:

On Mon, 2017-05-08 at 17:35 -0700, Florian Fainelli wrote:

On 05/08/2017 04:46 PM, Julia Lawall wrote:

On Mon, 8 May 2017, Joe Perches wrote:

Each time -EPROBE_DEFER occurs, another set of calls to
dsa_switch_alloc and dev_kzalloc also occurs.

Perhaps it'd be better to do:

if (ps->netdev) {
devm_kfree(>dev, ps);
devm_kfree(>dev, ds);
return -EPROBE_DEFER;
}

Is EPROBE_DEFER handled differently than other kinds of errors?

In the core device driver model, yes, EPROBE_DEFER is treated
differently than other errors because it puts the driver on a retry queue.

EPROBE_DEFER is already a slow and exceptional path, and this is a
mock-up driver, so I am not sure what value there is in trying to
balance devm_kzalloc() with corresponding devm_kfree()...

Example code should be as correct as possible.


(* number of people/mailing list in copy has been reduced *)


The coccinelle script below gives the following list of candidates for such
improvement.

char/hw_random/omap-rng.c
clk/clk-si5351.c
clk/clk-versaclock5.c
crypto/mediatek/mtk-platform.c
devfreq/rk3399_dmc.c
dma/mv_xor_v2.c
dma/omap-dma.c
gpu/drm/arc/arcpgu_hdmi.c
gpu/drm/bridge/dumb-vga-dac.c
gpu/drm/bridge/lvds-encoder.c
gpu/drm/exynos/exynos_dp.c
gpu/drm/exynos/exynos_drm_dsi.c
gpu/drm/imx/dw_hdmi-imx.c
gpu/drm/mediatek/mtk_dpi.c
gpu/drm/mediatek/mtk_drm_ddp_comp.c
gpu/drm/mediatek/mtk_dsi.c
gpu/drm/panel/panel-lvds.c
gpu/drm/panel/panel-simple.c
gpu/drm/panel/panel-sitronix-st7789v.c
gpu/drm/rcar-du/rcar_du_lvdscon.c
gpu/drm/rockchip/cdn-dp-core.c
gpu/drm/rockchip/dw_hdmi-rockchip.c
gpu/drm/sti/sti_hdmi.c
gpu/drm/tegra/sor.c
gpu/drm/tilcdc/tilcdc_panel.c
gpu/drm/vc4/vc4_hdmi.c
gpu/ipu-v3/ipu-common.c
gpu/ipu-v3/ipu-pre.c
gpu/ipu-v3/ipu-prg.c
hwtracing/coresight/coresight-stm.c
i2c/busses/i2c-designware-platdrv.c
i2c/busses/i2c-mv64xxx.c
i2c/muxes/i2c-mux-gpio.c
i2c/muxes/i2c-mux-pinctrl.c
i2c/muxes/i2c-mux-reg.c
iommu/mtk_iommu.c
iommu/mtk_iommu_v1.c
irqchip/qcom-irq-combiner.c
mailbox/mailbox-test.c
media/i2c/mt9m111.c
media/i2c/ov2640.c
media/i2c/ov7670.c
media/i2c/smiapp/smiapp-core.c
media/i2c/soc_camera/imx074.c
media/platform/coda/coda-common.c
media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
media/platform/s5p-cec/s5p_cec.c
media/platform/sti/cec/stih-cec.c
memory/tegra/tegra124-emc.c
mfd/twl6040.c
mtd/nand/lpc32xx_mlc.c
mtd/nand/lpc32xx_slc.c
net/dsa/dsa_loop.c
net/ethernet/mediatek/mtk_eth_soc.c
net/phy/xilinx_gmii2rgmii.c
net/wireless/ti/wlcore/spi.c
pci/host/pcie-iproc-platform.c
phy/phy-exynos5250-sata.c
phy/phy-mt65xx-usb3.c
phy/phy-qcom-qusb2.c
phy/phy-sun4i-usb.c
pinctrl/core.c
pinctrl/pinctrl-at91.c
platform/x86/intel_cht_int33fe.c
power/supply/act8945a_charger.c
power/supply/axp20x_ac_power.c
power/supply/axp20x_battery.c
power/supply/axp288_charger.c
power/supply/bq24190_charger.c
power/supply/cpcap-charger.c
power/supply/gpio-charger.c
soc/bcm/raspberrypi-power.c
thermal/samsung/exynos_tmu.c
tty/serial/8250/8250_dw.c
tty/serial/max310x.c
tty/serial/sccnxp.c
usb/chipidea/ci_hdrc_msm.c
usb/gadget/udc/mv_udc_core.c
usb/host/xhci-mtk.c
usb/mtu3/mtu3_plat.c
usb/musb/sunxi.c
usb/phy/phy-am335x.c
usb/phy/phy-generic.c
usb/phy/phy-twl6030-usb.c
video/backlight/hx8357.c
video/backlight/lp855x_bl.c
video/fbdev/simplefb.c


Coccinelle script :
=
// find calls to kmalloc or equivalent function
@call@
expression ptr;
position p;
@@

(
*   ptr@p = kmalloc(...)
|
*   ptr@p = kzalloc(...)
|
*   ptr@p = kcalloc(...)
|
*   ptr@p = kmalloc_array(...)

Do you get any reports for the above function?  Those would normally just
be memory leaks.

Only one, but the corresponding kfree was in place.


julia


|
*   ptr@p = devm_kmalloc(...)
|
*   ptr@p = devm_kzalloc(...)
|
*   ptr@p = devm_kcalloc(...)
|
*   ptr@p = devm_kmalloc_array(...)
)
  ...
*  return -EPROBE_DEFER;
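
For illustration only: the shape this semantic patch flags (a managed allocation followed by `return -EPROBE_DEFER`), together with the explicit free suggested earlier in the thread, can be mocked in userspace C. Everything prefixed `mock_` is hypothetical and merely stands in for the kernel's devm helpers. The sketch does not model what the driver core itself does with managed resources after a failed probe, and the value 517 for EPROBE_DEFER is assumed to match the kernel's definition.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* assumed to match the kernel-internal value */
#endif

/* Hypothetical stand-in for struct device: counts live managed buffers. */
struct mock_device {
	void *priv;		/* allocation kept after a successful probe */
	size_t live_allocs;	/* allocations not yet released */
};

/* Hypothetical stand-in for devm_kzalloc(): zeroed memory tied to dev. */
static void *mock_devm_kzalloc(struct mock_device *dev, size_t size)
{
	void *p = calloc(1, size);

	if (p)
		dev->live_allocs++;
	return p;
}

/* Hypothetical stand-in for devm_kfree(): release one managed buffer. */
static void mock_devm_kfree(struct mock_device *dev, void *p)
{
	if (!p)
		return;
	free(p);
	dev->live_allocs--;
}

/* Probe that allocates first and may then have to defer. */
static int mock_probe(struct mock_device *dev, int dependency_ready)
{
	void *priv = mock_devm_kzalloc(dev, 64);

	if (!priv)
		return -ENOMEM;

	if (!dependency_ready) {
		/* Balance the allocation before asking for a retry. */
		mock_devm_kfree(dev, priv);
		return -EPROBE_DEFER;
	}

	dev->priv = priv;
	return 0;
}
```

With the explicit free in place, repeated deferrals leave `live_allocs` at zero in this mock; only a successful probe keeps its allocation.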


Re: [PATCH] net: fec: select queue depending on VLAN priority

2017-05-09 Thread Stefan Agner
On 2017-05-09 06:39, David Miller wrote:
> From: Stefan Agner 
> Date: Mon,  8 May 2017 22:37:08 -0700
> 
>> Since the addition of the multi queue code with commit 59d0f7465644
>> ("net: fec: init multi queue date structure") the queue selection
>> has been handled by the default transmit queue selection
>> implementation which tries to evenly distribute the traffic across
>> all available queues. This selection presumes that the queues are
>> using an equal priority, however, the queues 1 and 2 are actually
>> of higher priority (the classification of the queues is enabled in
>> fec_enet_enable_ring).
>>
>> This can lead to net scheduler warnings and continuous TX ring
>> dumps when exercising the system with iperf.
>>
>> Use only queue 0 for all common traffic (no VLAN and P802.1p
>> priority 0 and 1) and route level 2-7 through queue 1 and 2.
>>
>> Signed-off-by: Fugang Duan 
>> Fixes: 59d0f7465644 ("net: fec: init multi queue date structure")
> 
> If the queues are used for prioritization, and it does not have
> multiple normal priority level queues, multiqueue is not what the
> driver should have implemented.

As Andy mentioned, there is also a round-robin mode. I'll try that.

What would be the proper way to use the prioritized queues?

--
Stefan
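
As a rough sketch of the mapping described in the quoted patch text (queue 0 for untagged traffic and 802.1p priorities 0 and 1, queues 1 and 2 for priorities 2-7), a lookup table indexed by the VLAN priority bits could look as follows. The function name `select_tx_queue`, the table, and the split of priorities 2-4 to queue 1 and 5-7 to queue 2 are assumptions made for illustration; the patch description does not say how levels 2-7 are divided between the two queues.

```c
#include <stdint.h>

/* The 3-bit priority (PCP) sits in the top bits of the 16-bit VLAN TCI. */
#define VLAN_PRIO_SHIFT	13
#define VLAN_PRIO_MAX	0x7

/*
 * Hypothetical helper: pick a TX queue from the VLAN priority.
 * Untagged frames and priorities 0-1 use queue 0; the split of the
 * remaining priorities across queues 1 and 2 is an assumption.
 */
static unsigned int select_tx_queue(int has_vlan_tag, uint16_t vlan_tci)
{
	static const unsigned int prio_to_queue[8] = {
		0, 0, 1, 1, 1, 2, 2, 2
	};

	if (!has_vlan_tag)
		return 0;

	return prio_to_queue[(vlan_tci >> VLAN_PRIO_SHIFT) & VLAN_PRIO_MAX];
}
```

A table keeps the policy in one place, so a different split only changes the initializer.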


Re: [PATCH] net: dsa: loop: Check for memory allocation failure

2017-05-09 Thread Julia Lawall


On Wed, 10 May 2017, Christophe JAILLET wrote:

> On 09/05/2017 at 17:18, Joe Perches wrote:
> > On Mon, 2017-05-08 at 17:35 -0700, Florian Fainelli wrote:
> > > On 05/08/2017 04:46 PM, Julia Lawall wrote:
> > > > On Mon, 8 May 2017, Joe Perches wrote:
> > > > > Each time -EPROBE_DEFER occurs, another set of calls to
> > > > > dsa_switch_alloc and dev_kzalloc also occurs.
> > > > >
> > > > > Perhaps it'd be better to do:
> > > > >
> > > > >   if (ps->netdev) {
> > > > >   devm_kfree(>dev, ps);
> > > > >   devm_kfree(>dev, ds);
> > > > >   return -EPROBE_DEFER;
> > > > >   }
> > > > Is EPROBE_DEFER handled differently than other kinds of errors?
> > > In the core device driver model, yes, EPROBE_DEFER is treated
> > > differently than other errors because it puts the driver on a retry queue.
> > >
> > > EPROBE_DEFER is already a slow and exceptional path, and this is a
> > > mock-up driver, so I am not sure what value there is in trying to
> > > balance devm_kzalloc() with corresponding devm_kfree()...
> > Example code should be as correct as possible.
> >
> (* number of people/mailing list in copy has been reduced *)
>
>
> The coccinelle script below gives the following list of candidates for such
> improvement.
>
> char/hw_random/omap-rng.c
> clk/clk-si5351.c
> clk/clk-versaclock5.c
> crypto/mediatek/mtk-platform.c
> devfreq/rk3399_dmc.c
> dma/mv_xor_v2.c
> dma/omap-dma.c
> gpu/drm/arc/arcpgu_hdmi.c
> gpu/drm/bridge/dumb-vga-dac.c
> gpu/drm/bridge/lvds-encoder.c
> gpu/drm/exynos/exynos_dp.c
> gpu/drm/exynos/exynos_drm_dsi.c
> gpu/drm/imx/dw_hdmi-imx.c
> gpu/drm/mediatek/mtk_dpi.c
> gpu/drm/mediatek/mtk_drm_ddp_comp.c
> gpu/drm/mediatek/mtk_dsi.c
> gpu/drm/panel/panel-lvds.c
> gpu/drm/panel/panel-simple.c
> gpu/drm/panel/panel-sitronix-st7789v.c
> gpu/drm/rcar-du/rcar_du_lvdscon.c
> gpu/drm/rockchip/cdn-dp-core.c
> gpu/drm/rockchip/dw_hdmi-rockchip.c
> gpu/drm/sti/sti_hdmi.c
> gpu/drm/tegra/sor.c
> gpu/drm/tilcdc/tilcdc_panel.c
> gpu/drm/vc4/vc4_hdmi.c
> gpu/ipu-v3/ipu-common.c
> gpu/ipu-v3/ipu-pre.c
> gpu/ipu-v3/ipu-prg.c
> hwtracing/coresight/coresight-stm.c
> i2c/busses/i2c-designware-platdrv.c
> i2c/busses/i2c-mv64xxx.c
> i2c/muxes/i2c-mux-gpio.c
> i2c/muxes/i2c-mux-pinctrl.c
> i2c/muxes/i2c-mux-reg.c
> iommu/mtk_iommu.c
> iommu/mtk_iommu_v1.c
> irqchip/qcom-irq-combiner.c
> mailbox/mailbox-test.c
> media/i2c/mt9m111.c
> media/i2c/ov2640.c
> media/i2c/ov7670.c
> media/i2c/smiapp/smiapp-core.c
> media/i2c/soc_camera/imx074.c
> media/platform/coda/coda-common.c
> media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
> media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
> media/platform/s5p-cec/s5p_cec.c
> media/platform/sti/cec/stih-cec.c
> memory/tegra/tegra124-emc.c
> mfd/twl6040.c
> mtd/nand/lpc32xx_mlc.c
> mtd/nand/lpc32xx_slc.c
> net/dsa/dsa_loop.c
> net/ethernet/mediatek/mtk_eth_soc.c
> net/phy/xilinx_gmii2rgmii.c
> net/wireless/ti/wlcore/spi.c
> pci/host/pcie-iproc-platform.c
> phy/phy-exynos5250-sata.c
> phy/phy-mt65xx-usb3.c
> phy/phy-qcom-qusb2.c
> phy/phy-sun4i-usb.c
> pinctrl/core.c
> pinctrl/pinctrl-at91.c
> platform/x86/intel_cht_int33fe.c
> power/supply/act8945a_charger.c
> power/supply/axp20x_ac_power.c
> power/supply/axp20x_battery.c
> power/supply/axp288_charger.c
> power/supply/bq24190_charger.c
> power/supply/cpcap-charger.c
> power/supply/gpio-charger.c
> soc/bcm/raspberrypi-power.c
> thermal/samsung/exynos_tmu.c
> tty/serial/8250/8250_dw.c
> tty/serial/max310x.c
> tty/serial/sccnxp.c
> usb/chipidea/ci_hdrc_msm.c
> usb/gadget/udc/mv_udc_core.c
> usb/host/xhci-mtk.c
> usb/mtu3/mtu3_plat.c
> usb/musb/sunxi.c
> usb/phy/phy-am335x.c
> usb/phy/phy-generic.c
> usb/phy/phy-twl6030-usb.c
> video/backlight/hx8357.c
> video/backlight/lp855x_bl.c
> video/fbdev/simplefb.c
>
>
> Coccinelle script:
> ==================
> // find calls to kmalloc or equivalent function
> @call@
> expression ptr;
> position p;
> 

Re: [PATCH -v3 0/13] mm: make movable onlining suck less

2017-05-09 Thread Dan Williams
On Fri, Apr 21, 2017 at 5:05 AM, Michal Hocko  wrote:
> Hi,
> The last version of this series has been posted here [1]. It has seen
> some more testing (thanks to Reza Arbab and Igor Mammedov[2]), Jerome's
> and Vlastimil's review resulted in few fixes mostly folded in their
> respective patches.
> There are 4 more patches (patch 6+ in this series).  I have checked the
> most prominent pfn walkers to skip over offline holes and now I feel
> more comfortable to have this merged. All the reported issues should be
> fixed.
>
> There is still a lot of work on top - namely this implementation doesn't
> support reonlining to a different zone on the zones boundaries but I
> will do that in a separate series because this one is getting quite
> large already and it should work reasonably well now.
>
> Joonsoo had some worries about pfn_valid and suggested to change its
> semantic to return false on offline holes but I would be really worried
> to change an established semantic used by a lot of code and so I have
> introduced pfn_to_online_page helper instead. If this is seen as a
> controversial point I would rather drop pfn_to_online_page and related
> patches as they are not strictly necessary because the code would be
> similarly broken as now wrt. offline holes.
>
> This is a rebase on top of linux-next (next-20170418) and the full
> series is in git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> try attempts/rewrite-mem_hotplug branch.
>
[..]
> Any thoughts, complaints, suggestions?
>
> As a bonus we will get a nice cleanup in the memory hotplug codebase.
>  arch/ia64/mm/init.c            |  11 +-
>  arch/powerpc/mm/mem.c          |  12 +-
>  arch/s390/mm/init.c            |  32 +--
>  arch/sh/mm/init.c              |  10 +-
>  arch/x86/mm/init_32.c          |   7 +-
>  arch/x86/mm/init_64.c          |  11 +-
>  drivers/base/memory.c          |  79 +++
>  drivers/base/node.c            |  58 ++
>  include/linux/memory_hotplug.h |  40 +++-
>  include/linux/mmzone.h         |  44 +++-
>  include/linux/node.h           |  35 +++-
>  kernel/memremap.c              |   6 +-
>  mm/compaction.c                |   5 +-
>  mm/memory_hotplug.c            | 455 ++---
>  mm/page_alloc.c                |  13 +-
>  mm/page_isolation.c            |  26 ++-
>  mm/sparse.c                    |  48 -
>  17 files changed, 407 insertions(+), 485 deletions(-)
>
> Shortlog says:
> Michal Hocko (13):
>   mm: remove return value from init_currently_empty_zone
>   mm, memory_hotplug: use node instead of zone in can_online_high_movable
>   mm: drop page_initialized check from get_nid_for_pfn
>   mm, memory_hotplug: get rid of is_zone_device_section
>   mm, memory_hotplug: split up register_one_node
>   mm, memory_hotplug: consider offline memblocks removable
>   mm: consider zone which is not fully populated to have holes
>   mm, compaction: skip over holes in __reset_isolation_suitable
>   mm: __first_valid_page skip over offline pages
>   mm, memory_hotplug: do not associate hotadded memory to zones until online
>   mm, memory_hotplug: replace for_device by want_memblock in arch_add_memory
>   mm, memory_hotplug: fix the section mismatch warning
>   mm, memory_hotplug: remove unused cruft after memory hotplug rework
>
> [1] http://lkml.kernel.org/r/20170410110351.12215-1-mho...@kernel.org
> [2] http://lkml.kernel.org/r/20170410162749.7d7f3...@nial.brq.redhat.com
>
>

The latest "attempts/rewrite-mem_hotplug" branch passes my regression
testing if I cherry-pick the following x86/mm fixes from mainline:

e6ab9c4d4377 x86/mm/64: Fix crash in remove_pagetable()
71389703839e mm, zone_device: Replace {get, put}_zone_device_page()
with a single reference to fix pmem crash

You can add:

Tested-by: Dan Williams 


[PATCH] staging: typec: Fix sparse warnings about incorrect types

2017-05-09 Thread Guru Das Srinagesh
Fix the following sparse warnings about incorrect type usage:

tcpci.c:290:38: warning: incorrect type in argument 1 (different base types)
tcpci.c:290:38:expected unsigned short [unsigned] [usertype] header
tcpci.c:290:38:got restricted __le16 const [usertype] header
tcpci.c:295:16: warning: incorrect type in assignment (different base types)
tcpci.c:295:16:expected unsigned int [unsigned] header
tcpci.c:295:16:got restricted __le16
tcpci.c:393:28: warning: incorrect type in assignment (different base types)
tcpci.c:393:28:expected restricted __le16 [usertype] header
tcpci.c:393:28:got unsigned int [unsigned] [addressable] reg

fusb302.c:1028:32: warning: incorrect type in argument 1 (different base types)
fusb302.c:1028:32:expected unsigned short [unsigned] [usertype] header
fusb302.c:1028:32:got restricted __le16 const [usertype] header
fusb302.c:1484:32: warning: incorrect type in argument 1 (different base types)
fusb302.c:1484:32:expected unsigned short [unsigned] [usertype] header
fusb302.c:1484:32:got restricted __le16 [usertype] header

Signed-off-by: Guru Das Srinagesh 
---
 drivers/staging/typec/fusb302/fusb302.c | 4 ++--
 drivers/staging/typec/tcpci.c   | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/typec/fusb302/fusb302.c b/drivers/staging/typec/fusb302/fusb302.c
index 2cee9a9..9612ef1 100644
--- a/drivers/staging/typec/fusb302/fusb302.c
+++ b/drivers/staging/typec/fusb302/fusb302.c
@@ -1025,7 +1025,7 @@ static int fusb302_pd_send_message(struct fusb302_chip *chip,
buf[pos++] = FUSB302_TKN_SYNC1;
buf[pos++] = FUSB302_TKN_SYNC2;
 
-   len = pd_header_cnt(msg->header) * 4;
+   len = pd_header_cnt(le16_to_cpu(msg->header)) * 4;
/* plug 2 for header */
len += 2;
if (len > 0x1F) {
@@ -1481,7 +1481,7 @@ static int fusb302_pd_read_message(struct fusb302_chip *chip,
 (u8 *)&msg->header);
if (ret < 0)
return ret;
-   len = pd_header_cnt(msg->header) * 4;
+   len = pd_header_cnt(le16_to_cpu(msg->header)) * 4;
/* add 4 to length to include the CRC */
if (len > PD_MAX_PAYLOAD * 4) {
fusb302_log(chip, "PD message too long %d", len);
diff --git a/drivers/staging/typec/tcpci.c b/drivers/staging/typec/tcpci.c
index 5e5be74..d0c22a7 100644
--- a/drivers/staging/typec/tcpci.c
+++ b/drivers/staging/typec/tcpci.c
@@ -287,12 +287,12 @@ static int tcpci_pd_transmit(struct tcpc_dev *tcpc,
unsigned int reg, cnt, header;
int ret;
 
-   cnt = msg ? pd_header_cnt(msg->header) * 4 : 0;
+   cnt = msg ? pd_header_cnt(le16_to_cpu(msg->header)) * 4 : 0;
ret = regmap_write(tcpci->regmap, TCPC_TX_BYTE_CNT, cnt + 2);
if (ret < 0)
return ret;
 
-   header = msg ? msg->header : 0;
+   header = msg ? le16_to_cpu(msg->header) : 0;
ret = tcpci_write16(tcpci, TCPC_TX_HDR, header);
if (ret < 0)
return ret;
@@ -390,7 +390,7 @@ static irqreturn_t tcpci_irq(int irq, void *dev_id)
regmap_read(tcpci->regmap, TCPC_RX_BYTE_CNT, &cnt);
 
tcpci_read16(tcpci, TCPC_RX_HDR, &reg);
-   msg.header = reg;
+   msg.header = cpu_to_le16(reg);
 
if (WARN_ON(cnt > sizeof(msg.payload)))
cnt = sizeof(msg.payload);
-- 
2.7.4
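
The conversions in the patch above matter because the PD message header is stored little-endian, while the count field has to be extracted from a host-order value. A minimal userspace sketch of the same idea follows. The byte-wise helpers are hypothetical stand-ins for the kernel's `le16_to_cpu()` and `cpu_to_le16()`, and the shift and mask used for the data-object count (bits 14..12) are assumed to mirror the kernel's `pd_header_cnt()`.

```c
#include <stdint.h>

/* Assumed layout: the number of data objects lives in bits 14..12. */
#define PD_HEADER_CNT_SHIFT	12
#define PD_HEADER_CNT_MASK	0x7

/* Stand-in for le16_to_cpu(): decode two little-endian wire bytes. */
static uint16_t le16_bytes_to_host(const uint8_t b[2])
{
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Stand-in for cpu_to_le16(): encode a host value as wire bytes. */
static void host_to_le16_bytes(uint16_t v, uint8_t b[2])
{
	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)(v >> 8);
}

/* Operates on a host-order header, like the helper used in the patch. */
static unsigned int pd_header_cnt(uint16_t header)
{
	return (header >> PD_HEADER_CNT_SHIFT) & PD_HEADER_CNT_MASK;
}

/* Payload length in bytes: convert first, then extract the count. */
static unsigned int pd_payload_len(const uint8_t wire_header[2])
{
	return pd_header_cnt(le16_bytes_to_host(wire_header)) * 4;
}
```

Decoding byte by byte gives the same result on little-endian and big-endian hosts, which is the property the sparse annotations are meant to protect.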



Re: [PATCH] net: dsa: loop: Check for memory allocation failure

2017-05-09 Thread Christophe JAILLET

On 09/05/2017 at 17:18, Joe Perches wrote:

On Mon, 2017-05-08 at 17:35 -0700, Florian Fainelli wrote:

On 05/08/2017 04:46 PM, Julia Lawall wrote:

On Mon, 8 May 2017, Joe Perches wrote:

Each time -EPROBE_DEFER occurs, another set of calls to
dsa_switch_alloc and devm_kzalloc also occurs.

Perhaps it'd be better to do:

if (ps->netdev) {
devm_kfree(&mdiodev->dev, ps);
devm_kfree(&mdiodev->dev, ds);
return -EPROBE_DEFER;
}

Is EPROBE_DEFER handled differently than other kinds of errors?

In the core device driver model, yes, EPROBE_DEFER is treated
differently than other errors because it puts the driver on a retry queue.

EPROBE_DEFER is already a slow and exceptional path, and this is a
mock-up driver, so I am not sure what value there is in trying to
balance devm_kzalloc() with corresponding devm_kfree()...

Example code should be as correct as possible.
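
The retry behaviour discussed above can be sketched in userspace as follows. All names are hypothetical, and the sketch models only one thing: the allocation path at the top of a probe function runs again on every deferred attempt. It does not model what the driver core does with devm-managed memory between attempts.

```c
#include <stdbool.h>

#define SKETCH_EPROBE_DEFER 517 /* same numeric value as the kernel's EPROBE_DEFER */

struct sketch_dev {
	bool dependency_ready;    /* stands in for the net device lookup */
	unsigned int alloc_calls; /* how many times the allocation path ran */
};

/* Hypothetical probe: allocate first, then look up the dependency. */
static int sketch_probe(struct sketch_dev *dev)
{
	dev->alloc_calls++; /* the allocation happens on every attempt */
	if (!dev->dependency_ready)
		return -SKETCH_EPROBE_DEFER;
	return 0;
}

/* Retry the probe until it stops deferring. The dependency shows up just
 * before attempt number ready_at (1-based). Returns the attempts made. */
static unsigned int sketch_retry_probe(struct sketch_dev *dev,
				       unsigned int ready_at)
{
	unsigned int attempts = 0;
	int ret;

	do {
		attempts++;
		if (attempts >= ready_at)
			dev->dependency_ready = true;
		ret = sketch_probe(dev);
	} while (ret == -SKETCH_EPROBE_DEFER);

	return attempts;
}
```

With ready_at set to 3 the probe runs three times and the allocation counter ends at 3, which is the repeated work the quoted remark points at.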


(* number of people/mailing list in copy has been reduced *)


The coccinelle script below gives the following list of candidates for 
such improvement.


char/hw_random/omap-rng.c
clk/clk-si5351.c
clk/clk-versaclock5.c
crypto/mediatek/mtk-platform.c
devfreq/rk3399_dmc.c
dma/mv_xor_v2.c
dma/omap-dma.c
gpu/drm/arc/arcpgu_hdmi.c
gpu/drm/bridge/dumb-vga-dac.c
gpu/drm/bridge/lvds-encoder.c
gpu/drm/exynos/exynos_dp.c
gpu/drm/exynos/exynos_drm_dsi.c
gpu/drm/imx/dw_hdmi-imx.c
gpu/drm/mediatek/mtk_dpi.c
gpu/drm/mediatek/mtk_drm_ddp_comp.c
gpu/drm/mediatek/mtk_dsi.c
gpu/drm/panel/panel-lvds.c
gpu/drm/panel/panel-simple.c
gpu/drm/panel/panel-sitronix-st7789v.c
gpu/drm/rcar-du/rcar_du_lvdscon.c
gpu/drm/rockchip/cdn-dp-core.c
gpu/drm/rockchip/dw_hdmi-rockchip.c
gpu/drm/sti/sti_hdmi.c
gpu/drm/tegra/sor.c
gpu/drm/tilcdc/tilcdc_panel.c
gpu/drm/vc4/vc4_hdmi.c
gpu/ipu-v3/ipu-common.c
gpu/ipu-v3/ipu-pre.c
gpu/ipu-v3/ipu-prg.c
hwtracing/coresight/coresight-stm.c
i2c/busses/i2c-designware-platdrv.c
i2c/busses/i2c-mv64xxx.c
i2c/muxes/i2c-mux-gpio.c
i2c/muxes/i2c-mux-pinctrl.c
i2c/muxes/i2c-mux-reg.c
iommu/mtk_iommu.c
iommu/mtk_iommu_v1.c
irqchip/qcom-irq-combiner.c
mailbox/mailbox-test.c
media/i2c/mt9m111.c
media/i2c/ov2640.c
media/i2c/ov7670.c
media/i2c/smiapp/smiapp-core.c
media/i2c/soc_camera/imx074.c
media/platform/coda/coda-common.c
media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
media/platform/s5p-cec/s5p_cec.c
media/platform/sti/cec/stih-cec.c
memory/tegra/tegra124-emc.c
mfd/twl6040.c
mtd/nand/lpc32xx_mlc.c
mtd/nand/lpc32xx_slc.c
net/dsa/dsa_loop.c
net/ethernet/mediatek/mtk_eth_soc.c
net/phy/xilinx_gmii2rgmii.c
net/wireless/ti/wlcore/spi.c
pci/host/pcie-iproc-platform.c
phy/phy-exynos5250-sata.c
phy/phy-mt65xx-usb3.c
phy/phy-qcom-qusb2.c
phy/phy-sun4i-usb.c
pinctrl/core.c
pinctrl/pinctrl-at91.c
platform/x86/intel_cht_int33fe.c
power/supply/act8945a_charger.c
power/supply/axp20x_ac_power.c
power/supply/axp20x_battery.c
power/supply/axp288_charger.c
power/supply/bq24190_charger.c
power/supply/cpcap-charger.c
power/supply/gpio-charger.c
soc/bcm/raspberrypi-power.c
thermal/samsung/exynos_tmu.c
tty/serial/8250/8250_dw.c
tty/serial/max310x.c
tty/serial/sccnxp.c
usb/chipidea/ci_hdrc_msm.c
usb/gadget/udc/mv_udc_core.c
usb/host/xhci-mtk.c
usb/mtu3/mtu3_plat.c
usb/musb/sunxi.c
usb/phy/phy-am335x.c
usb/phy/phy-generic.c
usb/phy/phy-twl6030-usb.c
video/backlight/hx8357.c
video/backlight/lp855x_bl.c
video/fbdev/simplefb.c


Coccinelle script :
=
// find calls to kmalloc or equivalent function
@call@
expression ptr;
position p;
@@

(
*   ptr@p = kmalloc(...)
|
*   ptr@p = kzalloc(...)
|
*   ptr@p = kcalloc(...)
|
*   ptr@p = kmalloc_array(...)
|
*   ptr@p = devm_kmalloc(...)
|
*   ptr@p = devm_kzalloc(...)
|
*   ptr@p = devm_kcalloc(...)
|
*   ptr@p = devm_kmalloc_array(...)
)
 ...
*  return -EPROBE_DEFER;



Re: [PATCH] libertas: Avoid reading past end of buffer

2017-05-09 Thread Joe Perches
On Tue, 2017-05-09 at 16:23 -0700, Kees Cook wrote:
> Using memcpy() from a string that is shorter than the length copied means
> the destination buffer is being filled with arbitrary data from the kernel
> rodata segment. Instead, use strncpy() which will fill the trailing bytes
> with zeros. Additionally adjust indentation to keep checkpatch.pl happy.
> 
> This was found with the future CONFIG_FORTIFY_SOURCE feature.
[]
> diff --git a/drivers/net/wireless/marvell/libertas/mesh.c 
> b/drivers/net/wireless/marvell/libertas/mesh.c
[]
> @@ -1177,9 +1177,9 @@ void lbs_mesh_ethtool_get_strings(struct net_device 
> *dev,
>   switch (stringset) {
>   case ETH_SS_STATS:
>   for (i = 0; i < MESH_STATS_NUM; i++) {
> - memcpy(s + i * ETH_GSTRING_LEN,
> - mesh_stat_strings[i],
> - ETH_GSTRING_LEN);
> + strncpy(s + i * ETH_GSTRING_LEN,
> + mesh_stat_strings[i],
> + ETH_GSTRING_LEN);
>   }

The better solution is to declare
mesh_stat_strings in the normal way

---
 drivers/net/wireless/marvell/libertas/mesh.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/marvell/libertas/mesh.c 
b/drivers/net/wireless/marvell/libertas/mesh.c
index d0c881dd5846..a535e7f48d2d 100644
--- a/drivers/net/wireless/marvell/libertas/mesh.c
+++ b/drivers/net/wireless/marvell/libertas/mesh.c
@@ -1108,15 +1108,15 @@ void lbs_mesh_set_txpd(struct lbs_private *priv,
  * Ethtool related
  */
 
-static const char * const mesh_stat_strings[] = {
-   "drop_duplicate_bcast",
-   "drop_ttl_zero",
-   "drop_no_fwd_route",
-   "drop_no_buffers",
-   "fwded_unicast_cnt",
-   "fwded_bcast_cnt",
-   "drop_blind_table",
-   "tx_failed_cnt"
+static const char mesh_stat_strings[][ETH_GSTRING_LEN] = {
+   "drop_duplicate_bcast",
+   "drop_ttl_zero",
+   "drop_no_fwd_route",
+   "drop_no_buffers",
+   "fwded_unicast_cnt",
+   "fwded_bcast_cnt",
+   "drop_blind_table",
+   "tx_failed_cnt",
 };
 
 void lbs_mesh_ethtool_get_stats(struct net_device *dev,
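
A minimal userspace sketch of why the fixed-width declaration makes a full-width copy safe: each row of a char [][N] array is padded with zeros up to N bytes, so copying N bytes per row never reads past the object. The names and the slot size of 32 are assumptions for illustration, not the driver's code.

```c
#include <string.h>

#define SKETCH_GSTRING_LEN 32 /* assumed slot size for this sketch */

/* Fixed-width rows: each initializer is zero-padded to the row size. */
static const char sketch_stat_strings[][SKETCH_GSTRING_LEN] = {
	"drop_duplicate_bcast",
	"tx_failed_cnt",
};

#define SKETCH_STATS_NUM \
	(sizeof(sketch_stat_strings) / sizeof(sketch_stat_strings[0]))

/* Copy every row into a flat buffer of
 * SKETCH_STATS_NUM * SKETCH_GSTRING_LEN bytes. */
static void sketch_get_strings(char *out)
{
	/* The source object is exactly as large as the copy, so a single
	 * memcpy() stays within bounds and carries the zero padding along. */
	memcpy(out, sketch_stat_strings, sizeof(sketch_stat_strings));
}
```

With an array of pointers to string literals, by contrast, each literal is only as long as its text plus the terminator, so a 32-byte copy from a 14-byte literal reads whatever happens to follow it.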


Re: [PATCH] net: dsa: loop: Free resources if initialization is deferred

2017-05-09 Thread Julia Lawall


On Wed, 10 May 2017, Christophe JAILLET wrote:

> Free some devm-allocated memory in case of deferred driver initialization.
> This avoids wasting memory in such a case.

I really think it would be helpful to mention the special behavior of
-EPROBE_DEFER.  It doesn't take much space, and it could be helpful to
someone in the future.

julia

>
> Suggested-by: Joe Perches 
> Signed-off-by: Christophe JAILLET 
> ---
>  drivers/net/dsa/dsa_loop.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c
> index a19e1781e9bb..557afb418320 100644
> --- a/drivers/net/dsa/dsa_loop.c
> +++ b/drivers/net/dsa/dsa_loop.c
> @@ -260,8 +260,11 @@ static int dsa_loop_drv_probe(struct mdio_device 
> *mdiodev)
>   return -ENOMEM;
>
>   ps->netdev = dev_get_by_name(&init_net, pdata->netdev);
> - if (!ps->netdev)
> + if (!ps->netdev) {
> + devm_kfree(&mdiodev->dev, ps);
> + devm_kfree(&mdiodev->dev, ds);
>   return -EPROBE_DEFER;
> + }
>
>   pdata->cd.netdev[DSA_LOOP_CPU_PORT] = &ps->netdev->dev;
>
> --
> 2.11.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


Re: [PATCH] cifs: cifsacl: Use a temporary ops variable to reduce code length

2017-05-09 Thread Shirish Pargaonkar
Looks correct.

Acked-by: Shirish Pargaonkar 

On Sun, May 7, 2017 at 3:31 AM, Joe Perches via samba-technical
 wrote:
> Create an ops variable to store tcon->ses->server->ops and cache
> indirections and reduce code size a trivial bit.
>
> $ size fs/cifs/cifsacl.o*
>    text    data     bss     dec     hex filename
>    5338     136       8    5482    156a fs/cifs/cifsacl.o.new
>    5371     136       8    5515    158b fs/cifs/cifsacl.o.old
>
> Signed-off-by: Joe Perches 
> ---
>  fs/cifs/cifsacl.c | 30 ++
>  1 file changed, 14 insertions(+), 16 deletions(-)
>
> diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
> index 15bac390dff9..b98436f5c7c7 100644
> --- a/fs/cifs/cifsacl.c
> +++ b/fs/cifs/cifsacl.c
> @@ -1135,20 +1135,19 @@ cifs_acl_to_fattr(struct cifs_sb_info *cifs_sb, 
> struct cifs_fattr *fattr,
> u32 acllen = 0;
> int rc = 0;
> struct tcon_link *tlink = cifs_sb_tlink(cifs_sb);
> -   struct cifs_tcon *tcon;
> +   struct smb_version_operations *ops;
>
> cifs_dbg(NOISY, "converting ACL to mode for %s\n", path);
>
> if (IS_ERR(tlink))
> return PTR_ERR(tlink);
> -   tcon = tlink_tcon(tlink);
>
> -   if (pfid && (tcon->ses->server->ops->get_acl_by_fid))
> -   pntsd = tcon->ses->server->ops->get_acl_by_fid(cifs_sb, pfid,
> - &acllen);
> -   else if (tcon->ses->server->ops->get_acl)
> -   pntsd = tcon->ses->server->ops->get_acl(cifs_sb, inode, path,
> -   &acllen);
> +   ops = tlink_tcon(tlink)->ses->server->ops;
> +
> +   if (pfid && (ops->get_acl_by_fid))
> +   pntsd = ops->get_acl_by_fid(cifs_sb, pfid, &acllen);
> +   else if (ops->get_acl)
> +   pntsd = ops->get_acl(cifs_sb, inode, path, &acllen);
> else {
> cifs_put_tlink(tlink);
> return -EOPNOTSUPP;
> @@ -1181,23 +1180,23 @@ id_mode_to_cifs_acl(struct inode *inode, const char 
> *path, __u64 nmode,
> struct cifs_ntsd *pnntsd = NULL; /* modified acl to be sent to server 
> */
> struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
> struct tcon_link *tlink = cifs_sb_tlink(cifs_sb);
> -   struct cifs_tcon *tcon;
> +   struct smb_version_operations *ops;
>
> if (IS_ERR(tlink))
> return PTR_ERR(tlink);
> -   tcon = tlink_tcon(tlink);
> +
> +   ops = tlink_tcon(tlink)->ses->server->ops;
>
> cifs_dbg(NOISY, "set ACL from mode for %s\n", path);
>
> /* Get the security descriptor */
>
> -   if (tcon->ses->server->ops->get_acl == NULL) {
> +   if (ops->get_acl == NULL) {
> cifs_put_tlink(tlink);
> return -EOPNOTSUPP;
> }
>
> -   pntsd = tcon->ses->server->ops->get_acl(cifs_sb, inode, path,
> -   &secdesclen);
> +   pntsd = ops->get_acl(cifs_sb, inode, path, &secdesclen);
> if (IS_ERR(pntsd)) {
> rc = PTR_ERR(pntsd);
> cifs_dbg(VFS, "%s: error %d getting sec desc\n", __func__, 
> rc);
> @@ -1224,13 +1223,12 @@ id_mode_to_cifs_acl(struct inode *inode, const char 
> *path, __u64 nmode,
>
> cifs_dbg(NOISY, "build_sec_desc rc: %d\n", rc);
>
> -   if (tcon->ses->server->ops->set_acl == NULL)
> +   if (ops->set_acl == NULL)
> rc = -EOPNOTSUPP;
>
> if (!rc) {
> /* Set the security descriptor */
> -   rc = tcon->ses->server->ops->set_acl(pnntsd, secdesclen, 
> inode,
> -path, aclflag);
> +   rc = ops->set_acl(pnntsd, secdesclen, inode, path, aclflag);
> cifs_dbg(NOISY, "set_cifs_acl rc: %d\n", rc);
> }
> cifs_put_tlink(tlink);
> --
> 2.10.0.rc2.1.g053435c
>
>
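
The idea in the quoted patch can be sketched outside the kernel as follows: load the end of a pointer chain into a local once, then test and call through that local. The struct and function names here are hypothetical and are not the cifs types.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature of an ops table reached through several hops. */
struct sketch_ops {
	int (*get_acl)(int id);
	int (*set_acl)(int id, int val);
};
struct sketch_server { const struct sketch_ops *ops; };
struct sketch_ses { struct sketch_server *server; };
struct sketch_tcon { struct sketch_ses *ses; };

/* Sample callbacks so the sketch can be exercised. */
static int sketch_get(int id) { return id * 2; }
static int sketch_set(int id, int val) { return val - id; }

static int sketch_update_acl(struct sketch_tcon *tcon, int id)
{
	/* One walk down the chain instead of one per use. */
	const struct sketch_ops *ops = tcon->ses->server->ops;
	int val;

	if (ops->get_acl == NULL || ops->set_acl == NULL)
		return -EOPNOTSUPP;

	val = ops->get_acl(id);
	return ops->set_acl(id, val + 1);
}
```

Whether this saves any object code depends on the compiler, which cannot always prove the chain is unchanged across the indirect calls; the shorter lines are the more dependable gain.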


linux-next: Tree for May 10

2017-05-09 Thread Stephen Rothwell
Hi all,

Please do not add any v4.13 destined material in your linux-next
included branches until after v4.12-rc1 has been released.

Changes since 20170509:

The tpmdd tree gained a build failure for which I applied a fix patch.

Non-merge commits (relative to Linus' tree): 1220
 1257 files changed, 46382 insertions(+), 28742 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 258 trees (counting Linus' and 37 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (56868a460b83 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide)
Merging fixes/master (97da3854c526 Linux 4.11-rc3)
Merging kbuild-current/fixes (9be3213b14d4 gconfig: remove misleading 
parentheses around a condition)
Merging arc-current/for-curr (cf4100d1cddc Revert "ARCv2: Allow enabling PAE40 
w/o HIGHMEM")
Merging arm-current/fixes (6d8059493691 ARM: 8670/1: V7M: Do not corrupt vector 
table around v7m_invalidate_l1 call)
Merging m68k-current/for-linus (f6ab4d59a5fe nubus: Add MVC and VSC video card 
definitions)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (be5c5e843c4a powerpc/64: Fix HMI exception on LE 
with CONFIG_RELOCATABLE=y)
Merging sparc/master (3c7f62212018 sparc64: fix fault handling in NGbzero.S and 
GENbzero.S)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (657831ffc38e dccp/tcp: do not inherit mc_list from parent)
Merging ipsec/master (2c1497bbc8fd xfrm: Fix NETDEV_DOWN with IPSec offload)
Merging netfilter/master (f411af682218 Merge branch 
'ibmvnic-Updated-reset-handler-andcode-fixes')
Merging ipvs/master (3c5ab3f395d6 ipvs: SNAT packet replies only for NATed 
connections)
Merging wireless-drivers/master (d77facb88448 brcmfmac: use local iftype 
avoiding use-after-free of virtual interface)
Merging mac80211/master (29cee56c0be4 Merge tag 'mac80211-for-davem-2017-05-08' 
of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211)
Merging sound-current/for-linus (a5c3b32a1146 Merge tag 'asoc-v4.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (b9c1153f7a9c PCI: hisi: Fix DT binding 
(hisi-pcie-almost-ecam))
Merging driver-core.current/driver-core-linus (af82455f7dbd Merge tag 
'char-misc-4.12-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc)
Merging tty.current/tty-linus (4f7d029b9bf0 Linux 4.11-rc7)
Merging usb.current/usb-linus (a71c9a1c779f Linux 4.11-rc5)
Merging usb-gadget-fixes/fixes (a351e9b9fc24 Linux 4.11)
Merging usb-serial-fixes/usb-linus (c02ed2e75ef4 Linux 4.11-rc4)
Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move 
the lock initialization to core file)
Merging phy/fixes (1a09b6a7c10e phy: qcom-usb-hs: Add depends on EXTCON)
Merging staging.current/staging-linus (4a1e31c68e9f Merge tag 'arc-4.12-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc)
Merging char-misc.current/char-misc-linus (af82455f7dbd Merge tag 
'char-misc-4.12-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc)
Merging input-current/for-linus (4706aa075662 Input: xpad - add USB IDs for Mad 
Catz Brawlstick and Razer Sabertooth)
Merging crypto-current/master (929562b14478 crypto: stm32 - Fix OF module alias 
i

[PATCH 2/3] autofs - make dev ioctl version and ismountpoint user accessible

2017-05-09 Thread Ian Kent
Some of the autofs miscellaneous device ioctls need to be accessible to
user space applications without CAP_SYS_ADMIN to get information about
autofs mounts.

Start by making the autofs miscellaneous device ioctl header available
and allow applications to use version and ismountpoint ioctls.

Signed-off-by: Ian Kent 
Cc: Colin Walters 
Cc: Ondrej Holy 
Cc: sta...@vger.kernel.org
---
 fs/autofs4/dev-ioctl.c  |   12 
 include/uapi/linux/Kbuild   |1 +
 include/uapi/linux/auto_dev-ioctl.h |2 +-
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 9b58d6e..f8cb3f6 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -628,10 +628,6 @@ static int _autofs_dev_ioctl(unsigned int command,
ioctl_fn fn = NULL;
int err = 0;
 
-   /* only root can play with this */
-   if (!capable(CAP_SYS_ADMIN))
-   return -EPERM;
-
cmd_first = _IOC_NR(AUTOFS_DEV_IOCTL_IOC_FIRST);
cmd = _IOC_NR(command);
 
@@ -640,6 +636,14 @@ static int _autofs_dev_ioctl(unsigned int command,
return -ENOTTY;
}
 
+   /* Only root can use ioctls other than AUTOFS_DEV_IOCTL_VERSION_CMD
+* and AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD
+*/
+   if (cmd != AUTOFS_DEV_IOCTL_VERSION_CMD &&
+   cmd != AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD &&
+   !capable(CAP_SYS_ADMIN))
+   return -EPERM;
+
/* Copy the parameters into kernel space. */
param = copy_dev_ioctl(user);
if (IS_ERR(param))
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 662c592..1f22bbb 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -61,6 +61,7 @@ header-y += atm_zatm.h
 header-y += audit.h
 header-y += auto_fs4.h
 header-y += auto_fs.h
+header-y += auto_dev-ioctl.h
 header-y += auxvec.h
 header-y += ax25.h
 header-y += b1lli.h
diff --git a/include/uapi/linux/auto_dev-ioctl.h 
b/include/uapi/linux/auto_dev-ioctl.h
index 744b3d0..5558db8 100644
--- a/include/uapi/linux/auto_dev-ioctl.h
+++ b/include/uapi/linux/auto_dev-ioctl.h
@@ -16,7 +16,7 @@
 #define AUTOFS_DEVICE_NAME "autofs"
 
 #define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1
-#define AUTOFS_DEV_IOCTL_VERSION_MINOR 0
+#define AUTOFS_DEV_IOCTL_VERSION_MINOR 1
 
 #define AUTOFS_DEV_IOCTL_SIZE  sizeof(struct autofs_dev_ioctl)
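
The reordered permission check can be modelled outside the kernel. The sketch below is a minimal user-space illustration, not the autofs code: the `sk_` names and command numbers are hypothetical stand-ins for the `AUTOFS_DEV_IOCTL_*_CMD` values, and the `has_cap_sys_admin` argument stands in for `capable(CAP_SYS_ADMIN)`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical command numbers, for illustration only. */
enum sk_cmd {
	SK_VERSION_CMD,
	SK_OPENMOUNT_CMD,
	SK_ISMOUNTPOINT_CMD,
	SK_CMD_COUNT
};

/* Returns 0 if the caller may issue cmd, a negative errno otherwise. */
int sk_check_ioctl(unsigned int cmd, bool has_cap_sys_admin)
{
	/* Out-of-range commands are rejected before the capability test. */
	if (cmd >= SK_CMD_COUNT)
		return -ENOTTY;

	/* Only the version and ismountpoint queries are open to everyone. */
	if (cmd != SK_VERSION_CMD &&
	    cmd != SK_ISMOUNTPOINT_CMD &&
	    !has_cap_sys_admin)
		return -EPERM;

	return 0;
}

void sk_check_ioctl_selftest(void)
{
	assert(sk_check_ioctl(SK_VERSION_CMD, false) == 0);
	assert(sk_check_ioctl(SK_ISMOUNTPOINT_CMD, false) == 0);
	assert(sk_check_ioctl(SK_OPENMOUNT_CMD, false) == -EPERM);
	assert(sk_check_ioctl(SK_OPENMOUNT_CMD, true) == 0);
}
```

Because the range check runs first, an unknown command yields -ENOTTY even for an unprivileged caller, rather than -EPERM.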
 



[PATCH 3/3] autofs - fix AT_NO_AUTOMOUNT not being honored

2017-05-09 Thread Ian Kent
The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
of an automount by the call. But this flag is unconditionally cleared
for all stat family system calls except statx().

stat family system calls have always triggered mount requests for the
negative dentry case in follow_automount() which is intended but prevents
the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.

In order to handle the AT_NO_AUTOMOUNT for both system calls the
negative dentry case in follow_automount() needs to be changed to
return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
required flags are clear).

AFAICT this change doesn't have any noticeable side effects and may,
in some use cases (although I didn't see it in testing), prevent
unnecessary callbacks to the automount daemon.

It's also possible that a stat family call has been made with a
path that is in the process of being mounted by some other process.
But stat family calls should return the automount state of the path
as it is "now" so it shouldn't wait for mount completion.

This is the same semantic as the positive dentry case already
handled.

Signed-off-by: Ian Kent 
Cc: David Howells 
Cc: Colin Walters 
Cc: Ondrej Holy 
Cc: sta...@vger.kernel.org
---
 fs/namei.c |   15 ---
 include/linux/fs.h |3 +--
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 7286f87..cd74838 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1129,9 +1129,18 @@ static int follow_automount(struct path *path, struct 
nameidata *nd,
 * of the daemon to instantiate them before they can be used.
 */
if (!(nd->flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY |
-  LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) &&
-   path->dentry->d_inode)
-   return -EISDIR;
+  LOOKUP_OPEN | LOOKUP_CREATE |
+  LOOKUP_AUTOMOUNT))) {
+   /* Positive dentry that isn't meant to trigger an
+* automount, EISDIR will allow it to be used,
+* otherwise there's no mount here "now" so return
+* ENOENT.
+*/
+   if (path->dentry->d_inode)
+   return -EISDIR;
+   else
+   return -ENOENT;
+   }
 
if (path->dentry->d_sb->s_user_ns != &init_user_ns)
return -EACCES;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 26488b4..be09684 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2935,8 +2935,7 @@ static inline int vfs_lstat(const char __user *name, 
struct kstat *stat)
 static inline int vfs_fstatat(int dfd, const char __user *filename,
  struct kstat *stat, int flags)
 {
-   return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT,
-stat, STATX_BASIC_STATS);
+   return vfs_statx(dfd, filename, flags, stat, STATX_BASIC_STATS);
 }
 static inline int vfs_fstat(int fd, struct kstat *stat)
 {
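
The decision the patched follow_automount() makes for lookups that must not trigger a mount can be written as a small pure function. This is a sketch under stated assumptions: the `SK_LOOKUP_*` values are hypothetical and do not match the kernel's LOOKUP_* constants, and the return value 0 merely stands for "go on and request the mount".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag values; the kernel's LOOKUP_* constants differ. */
#define SK_LOOKUP_PARENT    0x01u
#define SK_LOOKUP_DIRECTORY 0x02u
#define SK_LOOKUP_OPEN      0x04u
#define SK_LOOKUP_CREATE    0x08u
#define SK_LOOKUP_AUTOMOUNT 0x10u

#define SK_TRIGGER_FLAGS (SK_LOOKUP_PARENT | SK_LOOKUP_DIRECTORY | \
			  SK_LOOKUP_OPEN | SK_LOOKUP_CREATE | \
			  SK_LOOKUP_AUTOMOUNT)

/*
 * Returns -EISDIR or -ENOENT when no automount should be triggered,
 * and 0 when the caller would go on to request the mount.
 */
int sk_automount_decision(unsigned int flags, bool dentry_is_positive)
{
	if (!(flags & SK_TRIGGER_FLAGS))
		return dentry_is_positive ? -EISDIR : -ENOENT;
	return 0;
}

void sk_automount_selftest(void)
{
	/* Positive dentry, no trigger flag: usable as a plain directory. */
	assert(sk_automount_decision(0, true) == -EISDIR);
	/* Negative dentry, no trigger flag: nothing is mounted "now". */
	assert(sk_automount_decision(0, false) == -ENOENT);
	/* Any trigger flag set: the mount request proceeds. */
	assert(sk_automount_decision(SK_LOOKUP_AUTOMOUNT, false) == 0);
}
```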



[PATCH 1/3] autofs - make disc device user accessible

2017-05-09 Thread Ian Kent
The autofs miscellaneous device ioctls that shouldn't require
CAP_SYS_ADMIN need to be accessible to user space applications in
order to be able to get information about autofs mounts.

The module checks capabilities so the miscellaneous device should
be fine with broad permissions.

Signed-off-by: Ian Kent 
Cc: Colin Walters 
Cc: Ondrej Holy 
Cc: sta...@vger.kernel.org
---
 fs/autofs4/dev-ioctl.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 734cbf8..9b58d6e 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -733,7 +733,8 @@ static const struct file_operations _dev_ioctl_fops = {
 static struct miscdevice _autofs_dev_ioctl_misc = {
.minor  = AUTOFS_MINOR,
.name   = AUTOFS_DEVICE_NAME,
-   .fops   = &_dev_ioctl_fops
+   .fops   = &_dev_ioctl_fops,
+   .mode   = 0666
 };
 
 MODULE_ALIAS_MISCDEV(AUTOFS_MINOR);



[PATCH] net: dsa: loop: Free resources if initialization is deferred

2017-05-09 Thread Christophe JAILLET
Free some devm-allocated memory in case of deferred driver initialization.
This avoids wasting memory in such a case.

Suggested-by: Joe Perches 
Signed-off-by: Christophe JAILLET 
---
 drivers/net/dsa/dsa_loop.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c
index a19e1781e9bb..557afb418320 100644
--- a/drivers/net/dsa/dsa_loop.c
+++ b/drivers/net/dsa/dsa_loop.c
@@ -260,8 +260,11 @@ static int dsa_loop_drv_probe(struct mdio_device *mdiodev)
return -ENOMEM;
 
ps->netdev = dev_get_by_name(&init_net, pdata->netdev);
-   if (!ps->netdev)
+   if (!ps->netdev) {
+   devm_kfree(&mdiodev->dev, ps);
+   devm_kfree(&mdiodev->dev, ds);
return -EPROBE_DEFER;
+   }
 
pdata->cd.netdev[DSA_LOOP_CPU_PORT] = &ps->netdev->dev;
 
-- 
2.11.0



Re: [PATCH 4/6] tty: serial: lpuart: add imx7ulp support

2017-05-09 Thread Stefan Agner
On 2017-05-09 00:50, Dong Aisheng wrote:
> The lpuart of imx7ulp is basically the same as ls1021a. It's also
> 32 bit width register, but unlike ls1021a, it's little endian.
> Besides that, imx7ulp lpuart has a minor different register layout
> from ls1021a that it has four extra registers (verid, param, global,
> pincfg) located at the beginning of register map, which are currently
> not used by the driver and less to be used later.
> 
> To ease the register difference handling, we add a reg_off member
> in lpuart_soc_data structure to represent if the normal
> lpuart32_{read|write} requires plus a offset to hide the issue.
> 
> Cc: Greg Kroah-Hartman 
> Cc: Jiri Slaby 
> Cc: Fugang Duan 
> Cc: Stefan Agner 
> Cc: Mingkai Hu 
> Cc: Yangbo Lu 
> Signed-off-by: Dong Aisheng 
> ---
>  drivers/tty/serial/fsl_lpuart.c | 21 ++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
> index bddd041..1cdb3f9 100644
> --- a/drivers/tty/serial/fsl_lpuart.c
> +++ b/drivers/tty/serial/fsl_lpuart.c
> @@ -231,7 +231,11 @@
>  #define DEV_NAME "ttyLP"
>  #define UART_NR  6
>  
> +/* IMX lpuart has four extra unused regs located at the beginning */
> +#define IMX_REG_OFF  0x10
> +
>  static bool lpuart_is_be;
> +static u8 lpuart_reg_off;

Global variables? That hardly works once you have two UARTs...

Instead of adding a fixed offset to any write you could just add the
offset to sport->port.membase...

--
Stefan

>  
>  struct lpuart_port {
>   struct uart_port port;
> @@ -263,6 +267,7 @@ struct lpuart_port {
>  struct lpuart_soc_data {
>   bool is_32;
>   bool is_be;
> + u8  reg_off;
>  };
>  
>  static struct lpuart_soc_data vf_data = {
> @@ -272,11 +277,19 @@ static struct lpuart_soc_data vf_data = {
>  static struct lpuart_soc_data ls_data = {
>   .is_32 = true,
>   .is_be = true,
> + .reg_off = 0x0,
> +};
> +
> +static struct lpuart_soc_data imx_data = {
> + .is_32 = true,
> + .is_be = false,
> + .reg_off = IMX_REG_OFF,
>  };
>  
>  static const struct of_device_id lpuart_dt_ids[] = {
>   { .compatible = "fsl,vf610-lpuart", .data = &vf_data, },
>   { .compatible = "fsl,ls1021a-lpuart", .data = &ls_data, },
> + { .compatible = "fsl,imx7ulp-lpuart", .data = &imx_data, },
>   { /* sentinel */ }
>  };
>  MODULE_DEVICE_TABLE(of, lpuart_dt_ids);
> @@ -286,15 +299,16 @@ static void lpuart_dma_tx_complete(void *arg);
>  
>  static u32 lpuart32_read(void __iomem *addr)
>  {
> - return lpuart_is_be ? ioread32be(addr) : readl(addr);
> + return lpuart_is_be ? ioread32be(addr + lpuart_reg_off) :
> +   readl(addr + lpuart_reg_off);
>  }
>  
>  static void lpuart32_write(u32 val, void __iomem *addr)
>  {
>   if (lpuart_is_be)
> - iowrite32be(val, addr);
> + iowrite32be(val, addr + lpuart_reg_off);
>   else
> - writel(val, addr);
> + writel(val, addr + lpuart_reg_off);
>  }
>  
>  static void lpuart_stop_tx(struct uart_port *port)
> @@ -2008,6 +2022,7 @@ static int lpuart_probe(struct platform_device *pdev)
>   sport->port.line = ret;
>   sport->lpuart32 = sdata->is_32;
>   lpuart_is_be = sdata->is_be;
> + lpuart_reg_off = sdata->reg_off;
>  
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   sport->port.membase = devm_ioremap_resource(&pdev->dev, res);
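
The suggestion to fold the offset into the per-port base address, instead of keeping it in a file-scope variable, can be sketched in user space as follows. The `sk_` names are hypothetical, and plain byte arrays stand in for the memory-mapped register blocks; this is an illustration of the idea, not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-port state: nothing here is shared between UART instances. */
struct sk_port {
	uint8_t *membase;	/* already advanced past any leading registers */
	bool is_be;		/* register byte order of this port */
};

void sk_port_init(struct sk_port *p, uint8_t *mapped, uint8_t reg_off,
		  bool is_be)
{
	/* Apply the SoC-specific offset once, at setup time. */
	p->membase = mapped + reg_off;
	p->is_be = is_be;
}

uint32_t sk_read32(const struct sk_port *p, unsigned int reg)
{
	const uint8_t *b = p->membase + reg;

	if (p->is_be)
		return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
		       ((uint32_t)b[2] << 8) | b[3];
	return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[1] << 8) | b[0];
}

void sk_port_selftest(void)
{
	/* Big-endian block with registers starting at offset 0. */
	uint8_t ls_regs[4] = { 0x11, 0x22, 0x33, 0x44 };
	/* Little-endian block with 0x10 bytes of leading registers. */
	uint8_t imx_regs[0x14] = { 0 };
	struct sk_port ls, imx;

	imx_regs[0x10] = 0x44;
	imx_regs[0x11] = 0x33;
	imx_regs[0x12] = 0x22;
	imx_regs[0x13] = 0x11;

	sk_port_init(&ls, ls_regs, 0x00, true);
	sk_port_init(&imx, imx_regs, 0x10, false);

	/* Both ports coexist and decode the same register value. */
	assert(sk_read32(&ls, 0) == 0x11223344u);
	assert(sk_read32(&imx, 0) == 0x11223344u);
}
```

With the offset and byte order stored per port, probing a second UART with a different layout cannot change how the first one is accessed.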


[PATCH] xen: adjust early dom0 p2m handling to xen hypervisor behavior

2017-05-09 Thread Juergen Gross
When booted as a pv-guest, the p2m list presented by Xen is already
mapped to virtual addresses. In the dom0 case the hypervisor might make use
of 2M- or 1G-pages for this mapping. Unfortunately, while being properly
aligned in virtual and machine address space, those pages might not be
aligned properly in guest physical address space.

So when trying to obtain the guest physical address of such a page,
pud_pfn() and pmd_pfn() must be avoided, as those will mask away guest
physical address bits that are not zero in this special case.

Signed-off-by: Juergen Gross 
---
 arch/x86/xen/mmu_pv.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 9d9ae6650aa1..7397d8b8459d 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -2025,7 +2025,8 @@ static unsigned long __init 
xen_read_phys_ulong(phys_addr_t addr)
 
 /*
  * Translate a virtual address to a physical one without relying on mapped
- * page tables.
+ * page tables. Don't rely on big pages being aligned in (guest) physical
+ * space!
  */
 static phys_addr_t __init xen_early_virt_to_phys(unsigned long vaddr)
 {
@@ -2046,7 +2047,7 @@ static phys_addr_t __init xen_early_virt_to_phys(unsigned 
long vaddr)
   sizeof(pud)));
if (!pud_present(pud))
return 0;
-   pa = pud_pfn(pud) << PAGE_SHIFT;
+   pa = pud_val(pud) & PTE_PFN_MASK;
if (pud_large(pud))
return pa + (vaddr & ~PUD_MASK);
 
@@ -2054,7 +2055,7 @@ static phys_addr_t __init xen_early_virt_to_phys(unsigned 
long vaddr)
   sizeof(pmd)));
if (!pmd_present(pmd))
return 0;
-   pa = pmd_pfn(pmd) << PAGE_SHIFT;
+   pa = pmd_val(pmd) & PTE_PFN_MASK;
if (pmd_large(pmd))
return pa + (vaddr & ~PMD_MASK);
 
-- 
2.12.0
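
The effect described in the commit message can be reproduced with a simplified model of a 2M page-table entry. The masks and `sk_` helper names below are assumptions made for illustration; they approximate, but are not, the kernel's PTE_PFN_MASK and pmd_pfn() definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT    12
#define SK_PMD_SHIFT     21			/* 2 MiB large page */
#define SK_PTE_PFN_MASK  0x000ffffffffff000ULL	/* address bits 12..51 */
#define SK_PMD_PAGE_MASK (~((1ULL << SK_PMD_SHIFT) - 1))
#define SK_FLAG_PRESENT  0x001ULL
#define SK_FLAG_PSE      0x080ULL

/* Models a pfn helper that assumes a large page is aligned to its size. */
uint64_t sk_addr_via_large_pfn(uint64_t entry)
{
	uint64_t pfn = (entry & SK_PTE_PFN_MASK & SK_PMD_PAGE_MASK)
		       >> SK_PAGE_SHIFT;

	return pfn << SK_PAGE_SHIFT;
}

/* Models the replacement: keep every address bit of the entry. */
uint64_t sk_addr_via_pte_mask(uint64_t entry)
{
	return entry & SK_PTE_PFN_MASK;
}

void sk_p2m_selftest(void)
{
	/* Large-page entry whose guest physical address is not 2M aligned. */
	uint64_t entry = 0x12345000ULL | SK_FLAG_PRESENT | SK_FLAG_PSE;

	/* The alignment assumption drops the low address bits ... */
	assert(sk_addr_via_large_pfn(entry) == 0x12200000ULL);
	/* ... while masking only the flag bits keeps the full address. */
	assert(sk_addr_via_pte_mask(entry) == 0x12345000ULL);
}
```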



Re: [PATCH 2/6] tty: serial: lpuart: add little endian 32 bit register support

2017-05-09 Thread Stefan Agner
On 2017-05-09 00:50, Dong Aisheng wrote:
> It's based on the exist lpuart32 read/write implementation.
> 
> Cc: Greg Kroah-Hartman 
> Cc: Jiri Slaby  (supporter:TTY LAYER)
> Cc: Fugang Duan 
> Cc: Stefan Agner 
> Cc: Mingkai Hu 
> Cc: Yangbo Lu 
> Signed-off-by: Dong Aisheng 
> ---
>  drivers/tty/serial/fsl_lpuart.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
> index cd4e905..bddd041 100644
> --- a/drivers/tty/serial/fsl_lpuart.c
> +++ b/drivers/tty/serial/fsl_lpuart.c
> @@ -231,6 +231,8 @@
>  #define DEV_NAME "ttyLP"
>  #define UART_NR  6
>  
> +static bool lpuart_is_be;
> +

Other LS1021a IP's such as SPI use the big-endian device tree property
along with regmap.

See e.g.
drivers/spi/spi-fsl-dspi.c

(Used in vf610 in little endian mode and ls1021a in big endian)

Not sure if we want to switch to regmap, but you can also get the
property using of_get_property.

The ls1021a lpuart node does not specify big-endian at the moment (it would
probably be good to add it), so I would leave big-endian as the driver
default and, for the new device, check whether little-endian is specified:
of_get_property(dn, "little-endian", NULL)

--
Stefan

>  struct lpuart_port {
>   struct uart_port port;
>   struct clk  *clk;
> @@ -260,6 +262,7 @@ struct lpuart_port {
>  
>  struct lpuart_soc_data {
>   bool is_32;
> + bool is_be;
>  };
>  
>  static struct lpuart_soc_data vf_data = {
> @@ -268,6 +271,7 @@ static struct lpuart_soc_data vf_data = {
>  
>  static struct lpuart_soc_data ls_data = {
>   .is_32 = true,
> + .is_be = true,
>  };
>  
>  static const struct of_device_id lpuart_dt_ids[] = {
> @@ -282,12 +286,15 @@ static void lpuart_dma_tx_complete(void *arg);
>  
>  static u32 lpuart32_read(void __iomem *addr)
>  {
> - return ioread32be(addr);
> + return lpuart_is_be ? ioread32be(addr) : readl(addr);
>  }
>  
>  static void lpuart32_write(u32 val, void __iomem *addr)
>  {
> - iowrite32be(val, addr);
> + if (lpuart_is_be)
> + iowrite32be(val, addr);
> + else
> + writel(val, addr);
>  }
>  
>  static void lpuart_stop_tx(struct uart_port *port)
> @@ -2000,6 +2007,7 @@ static int lpuart_probe(struct platform_device *pdev)
>   }
>   sport->port.line = ret;
>   sport->lpuart32 = sdata->is_32;
> + lpuart_is_be = sdata->is_be;
>  
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   sport->port.membase = devm_ioremap_resource(&pdev->dev, res);
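
The "big-endian by default, little-endian on request" selection suggested in the reply can be sketched without any kernel APIs. Here a plain array of property names stands in for a device tree node; the `sk_` helpers and the example nodes are hypothetical and only illustrate the decision, not how of_get_property() is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if the (modelled) node carries a property called name. */
bool sk_has_property(const char *const *props, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(props[i], name) == 0)
			return true;
	return false;
}

/* Big-endian stays the default; "little-endian" opts out of it. */
bool sk_regs_are_big_endian(const char *const *props, size_t n)
{
	return !sk_has_property(props, n, "little-endian");
}

void sk_endian_selftest(void)
{
	/* Hypothetical nodes: an existing one and a new little-endian one. */
	const char *const old_node[] = { "compatible", "reg" };
	const char *const new_node[] = { "compatible", "reg", "little-endian" };

	assert(sk_regs_are_big_endian(old_node, 2));
	assert(!sk_regs_are_big_endian(new_node, 3));
}
```

Keeping big-endian as the default means existing device trees that say nothing about byte order keep working unchanged.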


[PATCH] drm/vc4: Fix resource leak in 'vc4_get_hang_state_ioctl()' in error handling path

2017-05-09 Thread Christophe JAILLET
If one 'drm_gem_handle_create()' fails, we leak some handles and some
memory.

In order to fix it:
  - move the 'kfree(bo_state)' to the end of the function in the error
handling path. This has the side effect of also trying to free it if the
first 'kcalloc' fails. This is harmless.
  - delete already allocated handles
  - remove the now useless 'err' label

The way the code is now written will also delete the handles if the
'copy_to_user()' call fails.

Signed-off-by: Christophe JAILLET 
---
This patch also adds the 'vc4_free_hang_state()' call in the error
handling path. It sounds logical to me, but I'm not sure about it.
---
 drivers/gpu/drm/vc4/vc4_gem.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index e9c381c42139..891c7a22cf81 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -111,8 +111,8 @@ vc4_get_hang_state_ioctl(struct drm_device *dev, void *data,
&handle);
 
if (ret) {
-   state->bo_count = i - 1;
-   goto err;
+   state->bo_count = i;
+   goto err_free;
}
bo_state[i].handle = handle;
bo_state[i].paddr = vc4_bo->base.paddr;
@@ -124,13 +124,16 @@ vc4_get_hang_state_ioctl(struct drm_device *dev, void 
*data,
 state->bo_count * sizeof(*bo_state)))
ret = -EFAULT;
 
-   kfree(bo_state);
-
 err_free:
-
vc4_free_hang_state(dev, kernel_state);
 
-err:
+   if (ret) {
+   for (i = 0; i < state->bo_count; i++)
+   drm_gem_handle_delete(file_priv, bo_state[i].handle);
+   }
+
+   kfree(bo_state);
+
return ret;
 }
 
-- 
2.11.0
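
The cleanup pattern this patch moves to can be shown in a small user-space model: on failure, delete exactly the handles created so far, then free the array on every path. All `sk_` names are hypothetical, and a counter stands in for the handles that drm_gem_handle_create() would hand out.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int sk_live_handles;	/* how many handles currently exist */

/* Hypothetical stand-in for handle creation: fails at index fail_at. */
static int sk_handle_create(int index, int fail_at, unsigned int *handle)
{
	if (index == fail_at)
		return -ENOMEM;
	*handle = (unsigned int)index + 1;
	sk_live_handles++;
	return 0;
}

static void sk_handle_delete(unsigned int handle)
{
	(void)handle;
	sk_live_handles--;
}

/* On success the handles stay alive; on failure none of them do. */
int sk_export_handles(int count, int fail_at)
{
	unsigned int *handles;
	int created = 0;
	int ret = 0;
	int i;

	handles = calloc((size_t)count, sizeof(*handles));
	if (!handles)
		return -ENOMEM;

	for (i = 0; i < count; i++) {
		ret = sk_handle_create(i, fail_at, &handles[i]);
		if (ret)
			break;
		created++;
	}

	/* Undo only what was actually created before the failure. */
	if (ret) {
		for (i = 0; i < created; i++)
			sk_handle_delete(handles[i]);
	}

	free(handles);
	return ret;
}

void sk_export_selftest(void)
{
	sk_live_handles = 0;
	assert(sk_export_handles(4, 2) == -ENOMEM);
	assert(sk_live_handles == 0);
	assert(sk_export_handles(4, -1) == 0);
	assert(sk_live_handles == 4);
}
```

Counting the successfully created handles, rather than using the failing index minus one, is what keeps the first created handle from being skipped during cleanup.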



Re: [PATCH] Allow to use DMA_CTRL_REUSE flag for all channel types

2017-05-09 Thread Vinod Koul
On Tue, May 02, 2017 at 03:16:18PM +, Eugeniy Paltsev wrote:
> Hi Vinod,
> 
> On Mon, 2017-05-01 at 11:21 +0530, Vinod Koul wrote:
> > On Fri, Apr 28, 2017 at 04:37:46PM +0300, Eugeniy Paltsev wrote:
> > > In the current implementation dma_get_slave_caps is used to check
> > > state of descriptor_reuse option. But dma_get_slave_caps includes
> > > check if the channel supports slave transactions.
> > > So DMA_CTRL_REUSE flag can be set (even for MEM-TO-MEM tranfers)
> > > only if channel supports slave transactions.
> > > 
> > > Now we can use DMA_CTRL_REUSE flag for all channel types.
> > > Also it allows to test reusing mechanism with simply mem-to-mem dma
> > > test.
> > 
> > We do not want to allow that actually. Slave is always treated as a
> > special
> > case, so resue was allowed.
> > 
> > With memcpy the assumptions are different and clients can do reuse.
> 
> Could you please clarify why don't we want to allow use DMA_CTRL_REUSE
> for mem-to-mem transfers?
> 
> Reusing of mem-to-mem (MEMCPY and DMA_SG) descriptors will work fine on
> virt-dma based drivers.

Precisely, the client does not know if you have a virt-dma or some other
kind of implementation.

For them they see a channel and use it!

> Anyway the current implementation behaviour is quite strange:
> If channel supports *slave* transfers DMA_CTRL_REUSE can be set to
> slave and *mem-to-mem* transfers.
> 
> And, of course, we can pass DMA_CTRL_REUSE flag to device_prep_dma_sg
> or device_prep_dma_memcpy directly without checks.

Yeah, that's bad; do send a patch to forbid that.

-- 
~Vinod


Re: [PATCH 1/6] tty: serial: lpuart: introduce lpuart_soc_data to represent SoC property

2017-05-09 Thread Stefan Agner
On 2017-05-09 00:50, Dong Aisheng wrote:
> This is used to dynamically check the SoC specific lpuart properties.
> Currently only the checking of 32 bit register width is added which
> functions the same as before. More will be added later for supporting
> new chips.
> 
> Cc: Greg Kroah-Hartman 
> Cc: Jiri Slaby 
> Cc: Fugang Duan 
> Cc: Stefan Agner 
> Cc: Mingkai Hu 
> Cc: Yangbo Lu 
> Signed-off-by: Dong Aisheng 
> ---
>  drivers/tty/serial/fsl_lpuart.c | 25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
> index 15df1ba7..cd4e905 100644
> --- a/drivers/tty/serial/fsl_lpuart.c
> +++ b/drivers/tty/serial/fsl_lpuart.c
> @@ -258,13 +258,21 @@ struct lpuart_port {
>  	wait_queue_head_t	dma_wait;
>  };
>  
> +struct lpuart_soc_data {
> +	bool	is_32;
> +};
> +
> +static struct lpuart_soc_data vf_data = {
> +	.is_32 = false,
> +};
> +
> +static struct lpuart_soc_data ls_data = {
> +	.is_32 = true,
> +};

This could be const I guess?
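
To illustrate the suggestion, here is a small user-space sketch of
const per-SoC data reached through a match table. It is only a sketch
under assumptions: struct match_entry is a simplified stand-in for the
kernel's struct of_device_id, and lookup_soc_data() is a hypothetical
helper, not of_match_device().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct of_device_id. */
struct match_entry {
	const char *compatible;
	const void *data;
};

struct lpuart_soc_data {
	bool is_32;
};

/* const: the per-SoC data is never written after initialisation. */
static const struct lpuart_soc_data vf_data = { .is_32 = false };
static const struct lpuart_soc_data ls_data = { .is_32 = true };

static const struct match_entry lpuart_ids[] = {
	{ .compatible = "fsl,vf610-lpuart",   .data = &vf_data },
	{ .compatible = "fsl,ls1021a-lpuart", .data = &ls_data },
	{ .compatible = NULL }	/* sentinel */
};

/* Hypothetical lookup helper; returns NULL when nothing matches. */
static const struct lpuart_soc_data *lookup_soc_data(const char *compatible)
{
	const struct match_entry *e;

	for (e = lpuart_ids; e->compatible; e++) {
		if (strcmp(e->compatible, compatible) == 0)
			return e->data;
	}
	return NULL;
}
```

Since the table only stores a 'const void *', making the data const
costs nothing at the lookup site. A caller of such a helper would
normally check for NULL before dereferencing the result.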

--
Stefan

> +
>  static const struct of_device_id lpuart_dt_ids[] = {
> -	{
> -		.compatible = "fsl,vf610-lpuart",
> -	},
> -	{
> -		.compatible = "fsl,ls1021a-lpuart",
> -	},
> +	{ .compatible = "fsl,vf610-lpuart",	.data = &vf_data, },
> +	{ .compatible = "fsl,ls1021a-lpuart",	.data = &ls_data, },
>  	{ /* sentinel */ }
>  };
>  MODULE_DEVICE_TABLE(of, lpuart_dt_ids);
> @@ -1971,6 +1979,9 @@ static struct uart_driver lpuart_reg = {
>  
>  static int lpuart_probe(struct platform_device *pdev)
>  {
> +	const struct of_device_id *of_id = of_match_device(lpuart_dt_ids,
> +							   &pdev->dev);
> +	const struct lpuart_soc_data *sdata = of_id->data;
>  	struct device_node *np = pdev->dev.of_node;
>  	struct lpuart_port *sport;
>  	struct resource *res;
> @@ -1988,7 +1999,7 @@ static int lpuart_probe(struct platform_device *pdev)
>  		return ret;
>  	}
>  	sport->port.line = ret;
> -	sport->lpuart32 = of_device_is_compatible(np, "fsl,ls1021a-lpuart");
> +	sport->lpuart32 = sdata->is_32;
>  
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>  	sport->port.membase = devm_ioremap_resource(&pdev->dev, res);


Re: [kernel-hardening] Re: [PATCH v9 1/4] syscalls: Verify address limit before returning to user-mode

2017-05-09 Thread Al Viro
On Wed, May 10, 2017 at 04:21:37AM +0100, Al Viro wrote:
> On Wed, May 10, 2017 at 04:12:54AM +0100, Al Viro wrote:
> 
> > Broken commit: "net: don't play with address limits in kernel_recvmsg".
> > It would be OK if it was only about data.  Unfortunately, that's not
> > true in one case: svc_udp_recvfrom() wants ->msg_control.
> > 
> > Another delicate place: you can't assume that write() always advances
> > file position by its (positive) return value.  btrfs stuff is sensitive
> > to that.
> > 
> > ashmem probably _is_ OK with demanding ->read_iter(), but I'm not sure
> > about blind asma->file->f_pos += ret.  That's begging for races.  Actually,
> > scratch that - it *is* racy.
> 
> kvec_length(): please, don't.  I would rather have the last remaining
> iov_length() gone...   What do you need it for, anyway?  You have only
> two users and both have the count passed to them (as *count and *cnt resp.)

fcntl stuff: I've decided not to put something similar into work.compat
since I couldn't decide what to do with compat stuff - word-by-word copy
from userland converting to struct flock + conversion to posix_lock +
actual work + conversion to flock + word-by-word copy to userland...  Smells
like we might be better off with compat_flock_to_posix_lock() et al.
I'm still not sure; played a bit one way and another and decided to drop
it for now.  Hell knows...


Re: [PATCH] drm/mm: fix duplicate 'const' declaration specifier

2017-05-09 Thread Nick Desaulniers
Ah, seems like there are more of these:

drivers/gpu/drm/drm_mm.c:922

surprised compiling drivers/gpu/drm/drm_mm.o did not catch this the
first time...


[PATCH net-next V4 05/10] skb_array: introduce batch dequeuing

2017-05-09 Thread Jason Wang
Signed-off-by: Jason Wang 
---
 include/linux/skb_array.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index 79850b6..35226cd 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -97,21 +97,46 @@ static inline struct sk_buff *skb_array_consume(struct skb_array *a)
 	return ptr_ring_consume(&a->ring);
 }
 
+static inline int skb_array_consume_batched(struct skb_array *a,
+					    struct sk_buff **array, int n)
+{
+	return ptr_ring_consume_batched(&a->ring, (void **)array, n);
+}
+
 static inline struct sk_buff *skb_array_consume_irq(struct skb_array *a)
 {
 	return ptr_ring_consume_irq(&a->ring);
 }
 
+static inline int skb_array_consume_batched_irq(struct skb_array *a,
+						struct sk_buff **array, int n)
+{
+	return ptr_ring_consume_batched_irq(&a->ring, (void **)array, n);
+}
+
 static inline struct sk_buff *skb_array_consume_any(struct skb_array *a)
 {
 	return ptr_ring_consume_any(&a->ring);
 }
 
+static inline int skb_array_consume_batched_any(struct skb_array *a,
+						struct sk_buff **array, int n)
+{
+	return ptr_ring_consume_batched_any(&a->ring, (void **)array, n);
+}
+
+
 static inline struct sk_buff *skb_array_consume_bh(struct skb_array *a)
 {
 	return ptr_ring_consume_bh(&a->ring);
 }
 
+static inline int skb_array_consume_batched_bh(struct skb_array *a,
+					       struct sk_buff **array, int n)
+{
+	return ptr_ring_consume_batched_bh(&a->ring, (void **)array, n);
+}
+
 static inline int __skb_array_len_with_tag(struct sk_buff *skb)
 {
 	if (likely(skb)) {
-- 
2.7.4
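
Since this patch carries no changelog, a toy model may help show what
"batch dequeuing" means for callers. The sketch below is a
single-threaded user-space ring, not ptr_ring: struct toy_ring and the
toy_ring_*() helpers are hypothetical names, and there is no locking.
In the kernel helpers the point of batching is to take the consumer
lock once per batch instead of once per entry; that part is not
modelled here.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8

/* Toy single-threaded pointer ring; a NULL slot means "empty". */
struct toy_ring {
	void *slots[RING_SIZE];
	int producer;
	int consumer;
};

static int toy_ring_produce(struct toy_ring *r, void *ptr)
{
	if (r->slots[r->producer])
		return -1;	/* ring full */
	r->slots[r->producer] = ptr;
	r->producer = (r->producer + 1) % RING_SIZE;
	return 0;
}

static void *toy_ring_consume(struct toy_ring *r)
{
	void *ptr = r->slots[r->consumer];

	if (ptr) {
		r->slots[r->consumer] = NULL;
		r->consumer = (r->consumer + 1) % RING_SIZE;
	}
	return ptr;
}

/* Dequeue up to n entries in one call; returns how many were taken. */
static int toy_ring_consume_batched(struct toy_ring *r, void **array, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		void *ptr = toy_ring_consume(r);

		if (!ptr)
			break;
		array[i] = ptr;
	}
	return i;
}
```

The return value is the number of entries actually written to 'array',
which may be smaller than 'n' when the ring runs dry.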



[PATCH net-next V4 06/10] tun: export skb_array

2017-05-09 Thread Jason Wang
This patch exports skb_array through tun_get_skb_array(). Callers can
then manipulate the skb array directly.

Signed-off-by: Jason Wang 
---
 drivers/net/tun.c      | 13 +++++++++++++
 include/linux/if_tun.h |  5 +++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index bbd707b..3cbfc5c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2626,6 +2626,19 @@ struct socket *tun_get_socket(struct file *file)
 }
 EXPORT_SYMBOL_GPL(tun_get_socket);
 
+struct skb_array *tun_get_skb_array(struct file *file)
+{
+	struct tun_file *tfile;
+
+	if (file->f_op != &tun_fops)
+		return ERR_PTR(-EINVAL);
+	tfile = file->private_data;
+	if (!tfile)
+		return ERR_PTR(-EBADFD);
+	return &tfile->tx_array;
+}
+EXPORT_SYMBOL_GPL(tun_get_skb_array);
+
 module_init(tun_init);
 module_exit(tun_cleanup);
 MODULE_DESCRIPTION(DRV_DESCRIPTION);
diff --git a/include/linux/if_tun.h b/include/linux/if_tun.h
index ed6da2e..bf9bdf4 100644
--- a/include/linux/if_tun.h
+++ b/include/linux/if_tun.h
@@ -19,6 +19,7 @@
 
 #if defined(CONFIG_TUN) || defined(CONFIG_TUN_MODULE)
 struct socket *tun_get_socket(struct file *);
+struct skb_array *tun_get_skb_array(struct file *file);
 #else
 #include 
 #include 
@@ -28,5 +29,9 @@ static inline struct socket *tun_get_socket(struct file *f)
 {
return ERR_PTR(-EINVAL);
 }
+static inline struct skb_array *tun_get_skb_array(struct file *f)
+{
+   return ERR_PTR(-EINVAL);
+}
 #endif /* CONFIG_TUN */
 #endif /* __IF_TUN_H */
-- 
2.7.4
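
For readers unfamiliar with the error-pointer convention used by
tun_get_skb_array(), here is a user-space sketch of how a caller would
tell a valid array from an error. Everything in it is an assumption for
illustration: err_ptr(), is_err() and ptr_err() imitate the kernel's
ERR_PTR(), IS_ERR() and PTR_ERR(), the toy_* types are hypothetical,
and EBADF is used where the patch returns -EBADFD.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space imitation of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline int is_err(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for error codes. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

struct toy_array { int len; };
struct toy_queue { struct toy_array tx_array; };
struct toy_file {
	const void *ops;		/* identifies the file type */
	struct toy_queue *private_data;
};

static const int toy_fops = 0;	/* only its address is used, as a type tag */

/* Same shape as the exported helper: check type, check state, return array. */
static struct toy_array *toy_get_array(struct toy_file *file)
{
	struct toy_queue *q;

	if (file->ops != &toy_fops)
		return err_ptr(-EINVAL);
	q = file->private_data;
	if (!q)
		return err_ptr(-EBADF);
	return &q->tx_array;
}
```

A caller checks is_err() on the result before using it; comparing the
result against NULL would not catch either failure case.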



[PATCH net-next V4 10/10] vhost_net: try batch dequing from skb array

2017-05-09 Thread Jason Wang
We used to dequeue one skb at a time from skb_array during recvmsg().
This can be inefficient because of poor cache utilization and because
the spinlock is taken for each packet. This patch batches the dequeue by
calling the batch dequeuing helpers explicitly on the exported skb array
and passing the skb back through msg_control so that the underlying
socket can finish the copy to userspace.

Batch dequeuing is also a prerequisite for further batching improvements
on rx.

Tests were done by pktgen on tap with XDP1 in guest on top of batch
zeroing:

rx batch | pps

256        2.41Mpps (+6.16%)
128        2.48Mpps (+8.80%)
64         2.38Mpps (+3.96%) <- Default
16         2.31Mpps (+1.76%)
4          2.31Mpps (+1.76%)
1          2.30Mpps (+1.32%)
0          2.27Mpps (+7.48%)

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c | 117 +---
 1 file changed, 111 insertions(+), 6 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 9b51989..fbaecf3 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -28,6 +28,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 
@@ -85,6 +87,13 @@ struct vhost_net_ubuf_ref {
struct vhost_virtqueue *vq;
 };
 
+#define VHOST_RX_BATCH 64
+struct vhost_net_buf {
+   struct sk_buff *queue[VHOST_RX_BATCH];
+   int tail;
+   int head;
+};
+
 struct vhost_net_virtqueue {
struct vhost_virtqueue vq;
size_t vhost_hlen;
@@ -99,6 +108,8 @@ struct vhost_net_virtqueue {
/* Reference counting for outstanding ubufs.
 * Protected by vq mutex. Writers must also take device mutex. */
struct vhost_net_ubuf_ref *ubufs;
+   struct skb_array *rx_array;
+   struct vhost_net_buf rxq;
 };
 
 struct vhost_net {
@@ -117,6 +128,71 @@ struct vhost_net {
 
 static unsigned vhost_net_zcopy_mask __read_mostly;
 
+static void *vhost_net_buf_get_ptr(struct vhost_net_buf *rxq)
+{
+	if (rxq->tail != rxq->head)
+		return rxq->queue[rxq->head];
+	else
+		return NULL;
+}
+
+static int vhost_net_buf_get_size(struct vhost_net_buf *rxq)
+{
+	return rxq->tail - rxq->head;
+}
+
+static int vhost_net_buf_is_empty(struct vhost_net_buf *rxq)
+{
+	return rxq->tail == rxq->head;
+}
+
+static void *vhost_net_buf_consume(struct vhost_net_buf *rxq)
+{
+	void *ret = vhost_net_buf_get_ptr(rxq);
+	++rxq->head;
+	return ret;
+}
+
+static int vhost_net_buf_produce(struct vhost_net_virtqueue *nvq)
+{
+	struct vhost_net_buf *rxq = &nvq->rxq;
+
+	rxq->head = 0;
+	rxq->tail = skb_array_consume_batched(nvq->rx_array, rxq->queue,
+					      VHOST_RX_BATCH);
+	return rxq->tail;
+}
+
+static void vhost_net_buf_unproduce(struct vhost_net_virtqueue *nvq)
+{
+	struct vhost_net_buf *rxq = &nvq->rxq;
+
+	if (nvq->rx_array && !vhost_net_buf_is_empty(rxq)) {
+		skb_array_unconsume(nvq->rx_array, rxq->queue + rxq->head,
+				    vhost_net_buf_get_size(rxq));
+		rxq->head = rxq->tail = 0;
+	}
+}
+
+static int vhost_net_buf_peek(struct vhost_net_virtqueue *nvq)
+{
+	struct vhost_net_buf *rxq = &nvq->rxq;
+
+	if (!vhost_net_buf_is_empty(rxq))
+		goto out;
+
+	if (!vhost_net_buf_produce(nvq))
+		return 0;
+
+out:
+	return __skb_array_len_with_tag(vhost_net_buf_get_ptr(rxq));
+}
+
+static void vhost_net_buf_init(struct vhost_net_buf *rxq)
+{
+	rxq->head = rxq->tail = 0;
+}
+
 static void vhost_net_enable_zcopy(int vq)
 {
vhost_net_zcopy_mask |= 0x1 << vq;
@@ -201,6 +277,7 @@ static void vhost_net_vq_reset(struct vhost_net *n)
 		n->vqs[i].ubufs = NULL;
 		n->vqs[i].vhost_hlen = 0;
 		n->vqs[i].sock_hlen = 0;
+		vhost_net_buf_init(&n->vqs[i].rxq);
 	}
 
 }
@@ -503,15 +580,14 @@ static void handle_tx(struct vhost_net *net)
 	mutex_unlock(&vq->mutex);
 }
 
-static int peek_head_len(struct sock *sk)
+static int peek_head_len(struct vhost_net_virtqueue *rvq, struct sock *sk)
 {
-	struct socket *sock = sk->sk_socket;
 	struct sk_buff *head;
 	int len = 0;
 	unsigned long flags;
 
-	if (sock->ops->peek_len)
-		return sock->ops->peek_len(sock);
+	if (rvq->rx_array)
+		return vhost_net_buf_peek(rvq);
 
 	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
 	head = skb_peek(&sk->sk_receive_queue);
@@ -537,10 +613,11 @@ static int sk_has_rx_data(struct sock *sk)
 
 static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
 {
+	struct vhost_net_virtqueue *rvq = &net->vqs[VHOST_NET_VQ_RX];
 	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
 	struct vhost_virtqueue *vq = &nvq->vq;
 	unsigned long uninitialized_var(endtime);
-	int len = peek_head_len(sk);
+	int len = peek_head_len(rvq, sk);

[PATCH net-next V4 07/10] tap: export skb_array

2017-05-09 Thread Jason Wang
This patch exports skb_array through tap_get_skb_array(). Callers can
then manipulate the skb array directly.

Signed-off-by: Jason Wang 
---
 drivers/net/tap.c      | 13 +++++++++++++
 include/linux/if_tap.h |  5 +++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 4d4173d..abdaf86 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1193,6 +1193,19 @@ struct socket *tap_get_socket(struct file *file)
 }
 EXPORT_SYMBOL_GPL(tap_get_socket);
 
+struct skb_array *tap_get_skb_array(struct file *file)
+{
+	struct tap_queue *q;
+
+	if (file->f_op != &tap_fops)
+		return ERR_PTR(-EINVAL);
+	q = file->private_data;
+	if (!q)
+		return ERR_PTR(-EBADFD);
+	return &q->skb_array;
+}
+EXPORT_SYMBOL_GPL(tap_get_skb_array);
+
 int tap_queue_resize(struct tap_dev *tap)
 {
struct net_device *dev = tap->dev;
diff --git a/include/linux/if_tap.h b/include/linux/if_tap.h
index 3482c3c..4837157 100644
--- a/include/linux/if_tap.h
+++ b/include/linux/if_tap.h
@@ -3,6 +3,7 @@
 
 #if IS_ENABLED(CONFIG_TAP)
 struct socket *tap_get_socket(struct file *);
+struct skb_array *tap_get_skb_array(struct file *file);
 #else
 #include 
 #include 
@@ -12,6 +13,10 @@ static inline struct socket *tap_get_socket(struct file *f)
 {
return ERR_PTR(-EINVAL);
 }
+static inline struct skb_array *tap_get_skb_array(struct file *f)
+{
+   return ERR_PTR(-EINVAL);
+}
 #endif /* CONFIG_TAP */
 
 #include 
-- 
2.7.4



[PATCH net-next V4 08/10] tun: support receiving skb through msg_control

2017-05-09 Thread Jason Wang
This patch lets tun_recvmsg() receive an skb from its caller through
msg_control. vhost_net will be the first user.

Signed-off-by: Jason Wang 
---
 drivers/net/tun.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3cbfc5c..f8041f9c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1510,9 +1510,8 @@ static struct sk_buff *tun_ring_recv(struct tun_file *tfile, int noblock,
 
 static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
 			   struct iov_iter *to,
-			   int noblock)
+			   int noblock, struct sk_buff *skb)
 {
-	struct sk_buff *skb;
 	ssize_t ret;
 	int err;
 
@@ -1521,10 +1520,12 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
 	if (!iov_iter_count(to))
 		return 0;
 
-	/* Read frames from ring */
-	skb = tun_ring_recv(tfile, noblock, &err);
-	if (!skb)
-		return err;
+	if (!skb) {
+		/* Read frames from ring */
+		skb = tun_ring_recv(tfile, noblock, &err);
+		if (!skb)
+			return err;
+	}
 
 	ret = tun_put_user(tun, tfile, skb, to);
 	if (unlikely(ret < 0))
@@ -1544,7 +1545,7 @@ static ssize_t tun_chr_read_iter(struct kiocb *iocb, struct iov_iter *to)
 
 	if (!tun)
 		return -EBADFD;
-	ret = tun_do_read(tun, tfile, to, file->f_flags & O_NONBLOCK);
+	ret = tun_do_read(tun, tfile, to, file->f_flags & O_NONBLOCK, NULL);
 	ret = min_t(ssize_t, ret, len);
 	if (ret > 0)
 		iocb->ki_pos = ret;
@@ -1646,7 +1647,8 @@ static int tun_recvmsg(struct socket *sock, struct msghdr *m, size_t total_len,
 					 SOL_PACKET, TUN_TX_TIMESTAMP);
 		goto out;
 	}
-	ret = tun_do_read(tun, tfile, &m->msg_iter, flags & MSG_DONTWAIT);
+	ret = tun_do_read(tun, tfile, &m->msg_iter, flags & MSG_DONTWAIT,
+			  m->msg_control);
 	if (ret > (ssize_t)total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
-- 
2.7.4
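
The calling convention introduced here, where the caller may hand in an
already dequeued skb and the read path only touches the ring when none
is supplied, can be sketched in user space as follows. All names
(struct packet, struct toy_queue, toy_queue_recv(), toy_do_read()) are
hypothetical; the 'pkt' argument plays the role that msg_control plays
in the patch.

```c
#include <assert.h>
#include <stddef.h>

struct packet { int len; };

/* Hypothetical queue with a single pending packet, for illustration. */
struct toy_queue {
	struct packet *pending;
	int dequeues;	/* how many times the queue itself was touched */
};

static struct packet *toy_queue_recv(struct toy_queue *q)
{
	struct packet *p = q->pending;

	q->pending = NULL;
	q->dequeues++;
	return p;
}

/*
 * Deliver one packet. If the caller already holds a packet, use it and
 * skip the queue entirely; otherwise fall back to dequeuing one.
 */
static int toy_do_read(struct toy_queue *q, struct packet *pkt)
{
	if (!pkt) {
		pkt = toy_queue_recv(q);
		if (!pkt)
			return -1;	/* nothing to read */
	}
	return pkt->len;
}
```

Existing callers keep their behaviour by passing NULL, which is what
tun_chr_read_iter() does in the diff above.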



[PATCH net-next V4 09/10] tap: support receiving skb from msg_control

2017-05-09 Thread Jason Wang
This patch lets tap_recvmsg() receive an skb from its caller through
msg_control. vhost_net will be the first user.

Signed-off-by: Jason Wang 
---
 drivers/net/tap.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index abdaf86..9af3239 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -824,15 +824,17 @@ static ssize_t tap_put_user(struct tap_queue *q,
 
 static ssize_t tap_do_read(struct tap_queue *q,
   struct iov_iter *to,
-  int noblock)
+  int noblock, struct sk_buff *skb)
 {
DEFINE_WAIT(wait);
-   struct sk_buff *skb;
ssize_t ret = 0;
 
if (!iov_iter_count(to))
return 0;
 
+   if (skb)
+   goto put;
+
while (1) {
if (!noblock)
prepare_to_wait(sk_sleep(&q->sk), &wait,
@@ -856,6 +858,7 @@ static ssize_t tap_do_read(struct tap_queue *q,
if (!noblock)
finish_wait(sk_sleep(&q->sk), &wait);
 
+put:
if (skb) {
ret = tap_put_user(q, skb, to);
if (unlikely(ret < 0))
@@ -872,7 +875,7 @@ static ssize_t tap_read_iter(struct kiocb *iocb, struct 
iov_iter *to)
struct tap_queue *q = file->private_data;
ssize_t len = iov_iter_count(to), ret;
 
-   ret = tap_do_read(q, to, file->f_flags & O_NONBLOCK);
+   ret = tap_do_read(q, to, file->f_flags & O_NONBLOCK, NULL);
ret = min_t(ssize_t, ret, len);
if (ret > 0)
iocb->ki_pos = ret;
@@ -1155,7 +1158,8 @@ static int tap_recvmsg(struct socket *sock, struct msghdr 
*m,
int ret;
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
return -EINVAL;
-   ret = tap_do_read(q, &m->msg_iter, flags & MSG_DONTWAIT);
+   ret = tap_do_read(q, &m->msg_iter, flags & MSG_DONTWAIT,
+ m->msg_control);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
ret = flags & MSG_TRUNC ? ret : total_len;
-- 
2.7.4



[PATCH net-next V4 07/10] tap: export skb_array

2017-05-09 Thread Jason Wang
This patch exports skb_array through tap_get_skb_array(). Callers can
then manipulate the skb array directly.

Signed-off-by: Jason Wang 
---
 drivers/net/tap.c  | 13 +
 include/linux/if_tap.h |  5 +
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 4d4173d..abdaf86 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1193,6 +1193,19 @@ struct socket *tap_get_socket(struct file *file)
 }
 EXPORT_SYMBOL_GPL(tap_get_socket);
 
+struct skb_array *tap_get_skb_array(struct file *file)
+{
+   struct tap_queue *q;
+
+   if (file->f_op != &tap_fops)
+   return ERR_PTR(-EINVAL);
+   q = file->private_data;
+   if (!q)
+   return ERR_PTR(-EBADFD);
+   return &q->skb_array;
+}
+EXPORT_SYMBOL_GPL(tap_get_skb_array);
+
 int tap_queue_resize(struct tap_dev *tap)
 {
struct net_device *dev = tap->dev;
diff --git a/include/linux/if_tap.h b/include/linux/if_tap.h
index 3482c3c..4837157 100644
--- a/include/linux/if_tap.h
+++ b/include/linux/if_tap.h
@@ -3,6 +3,7 @@
 
 #if IS_ENABLED(CONFIG_TAP)
 struct socket *tap_get_socket(struct file *);
+struct skb_array *tap_get_skb_array(struct file *file);
 #else
 #include 
 #include 
@@ -12,6 +13,10 @@ static inline struct socket *tap_get_socket(struct file *f)
 {
return ERR_PTR(-EINVAL);
 }
+static inline struct skb_array *tap_get_skb_array(struct file *f)
+{
+   return ERR_PTR(-EINVAL);
+}
 #endif /* CONFIG_TAP */
 
 #include 
-- 
2.7.4



[PATCH net-next V4 01/10] ptr_ring: batch ring zeroing

2017-05-09 Thread Jason Wang
From: "Michael S. Tsirkin" 

A known weakness in ptr_ring design is that it does not handle well the
situation when ring is almost full: as entries are consumed they are
immediately used again by the producer, so consumer and producer are
writing to a shared cache line.

To fix this, add batching to consume calls: as entries are
consumed do not write NULL into the ring until we get
a multiple (in current implementation 2x) of cache lines
away from the producer. At that point, write them all out.

We do the write out in the reverse order to keep
producer from sharing cache with consumer for as long
as possible.

Writeout also triggers when ring wraps around - there's
no special reason to do this but it helps keep the code
a bit simpler.

What should we do if getting away from producer by 2 cache lines
would mean we are keeping the ring more than half empty?
Maybe we should reduce the batching in this case,
current patch simply reduces the batching.

Notes:
- it is no longer true that a call to consume guarantees
  that the following call to produce will succeed.
  No users seem to assume that.
- batching can also in theory reduce the signalling rate:
  users that would previously send interrupts to the producer
  to wake it up after consuming each entry would now only
  need to do this once in a batch.
  Doing this would be easy by returning a flag to the caller.
  No users seem to do signalling on consume yet so this was not
  implemented yet.
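
For readers who want to step through the deferred zeroing, here is a
minimal userspace model. It is a sketch under assumptions: the ring has
a fixed RING_SIZE, there is no locking and no producer, and the helper
names (ring_peek, ring_discard_one, ring_consume) are made up for this
example. Only the field names and the control flow follow the patch.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8

/* Hypothetical userspace model; field names follow the patch. */
struct ring {
	void *queue[RING_SIZE];
	int consumer_head;	/* next valid entry */
	int consumer_tail;	/* next entry to invalidate */
	int batch;		/* entries consumed before zeroing */
};

static void *ring_peek(struct ring *r)
{
	return r->queue[r->consumer_head];
}

/* Same control flow as the patched __ptr_ring_discard_one(). */
static void ring_discard_one(struct ring *r)
{
	int head = r->consumer_head++;

	if (r->consumer_head - r->consumer_tail >= r->batch ||
	    r->consumer_head >= RING_SIZE) {
		/*
		 * Zero in reverse order, so the slot at consumer_tail
		 * (the one a waiting producer would reuse first) is
		 * written last.
		 */
		while (head >= r->consumer_tail)
			r->queue[head--] = NULL;
		r->consumer_tail = r->consumer_head;
	}
	if (r->consumer_head >= RING_SIZE) {
		r->consumer_head = 0;
		r->consumer_tail = 0;
	}
}

static void *ring_consume(struct ring *r)
{
	void *ptr = ring_peek(r);

	if (ptr)
		ring_discard_one(r);
	return ptr;
}
```

With batch set to 4, the first three consumes leave their slots
untouched; the fourth clears slots 3, 2, 1, 0 in that order.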

Tested with pktgen on tap with xdp1 in guest:

Before 2.10 Mpps
After  2.27 Mpps (+7.48%)

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Jesper Dangaard Brouer 
Signed-off-by: Jason Wang 
---
 include/linux/ptr_ring.h | 63 +---
 1 file changed, 54 insertions(+), 9 deletions(-)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 6c70444..6b2e0dd 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -34,11 +34,13 @@
 struct ptr_ring {
int producer cacheline_aligned_in_smp;
spinlock_t producer_lock;
-   int consumer cacheline_aligned_in_smp;
+   int consumer_head cacheline_aligned_in_smp; /* next valid entry */
+   int consumer_tail; /* next entry to invalidate */
spinlock_t consumer_lock;
/* Shared consumer/producer data */
/* Read-only by both the producer and the consumer */
int size cacheline_aligned_in_smp; /* max entries in queue */
+   int batch; /* number of entries to consume in a batch */
void **queue;
 };
 
@@ -170,7 +172,7 @@ static inline int ptr_ring_produce_bh(struct ptr_ring *r, 
void *ptr)
 static inline void *__ptr_ring_peek(struct ptr_ring *r)
 {
if (likely(r->size))
-   return r->queue[r->consumer];
+   return r->queue[r->consumer_head];
return NULL;
 }
 
@@ -231,9 +233,38 @@ static inline bool ptr_ring_empty_bh(struct ptr_ring *r)
 /* Must only be called after __ptr_ring_peek returned !NULL */
 static inline void __ptr_ring_discard_one(struct ptr_ring *r)
 {
-   r->queue[r->consumer++] = NULL;
-   if (unlikely(r->consumer >= r->size))
-   r->consumer = 0;
+   /* Fundamentally, what we want to do is update consumer
+* index and zero out the entry so producer can reuse it.
+* Doing it naively at each consume would be as simple as:
+*   r->queue[r->consumer++] = NULL;
+*   if (unlikely(r->consumer >= r->size))
+*   r->consumer = 0;
+* but that is suboptimal when the ring is full as producer is writing
+* out new entries in the same cache line.  Defer these updates until a
+* batch of entries has been consumed.
+*/
+   int head = r->consumer_head++;
+
+   /* Once we have processed enough entries invalidate them in
+* the ring all at once so producer can reuse their space in the ring.
+* We also do this when we reach end of the ring - not mandatory
+* but helps keep the implementation simple.
+*/
+   if (unlikely(r->consumer_head - r->consumer_tail >= r->batch ||
+r->consumer_head >= r->size)) {
+   /* Zero out entries in the reverse order: this way we touch the
+* cache line that producer might currently be reading the last;
+* producer won't make progress and touch other cache lines
+* besides the first one until we write out all entries.
+*/
+   while (likely(head >= r->consumer_tail))
+   r->queue[head--] = NULL;
+   r->consumer_tail = r->consumer_head;
+   }
+   if (unlikely(r->consumer_head >= r->size)) {
+   r->consumer_head = 0;
+   r->consumer_tail = 0;
+   }
 }
 
 static inline void *__ptr_ring_consume(struct ptr_ring *r)
@@ -345,14 

[PATCH net-next V4 02/10] ptr_ring: add ptr_ring_unconsume

2017-05-09 Thread Jason Wang
From: "Michael S. Tsirkin" 

Applications that consume a batch of entries in one go
can benefit from the ability to return some of them
to the ring.

Add an API for that, assuming there's space. If there's no space we
naturally can't do this and have to drop entries, but this implies the ring
is full so we'd likely drop some anyway.
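
A lock-free userspace model may help when reading the function below.
It is only a sketch: RING_SIZE, ring_unconsume and count_destroy are
names invented for this example, the locking is omitted, and the
producer index is not modelled. The three phases (flush deferred
zeroing, walk the batch backwards, destroy what does not fit) mirror
the patch.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 4

/* Hypothetical userspace model; field names follow the patch. */
struct ring {
	void *queue[RING_SIZE];
	int consumer_head;
	int consumer_tail;
};

static int destroyed;	/* counts entries that did not fit */

static void count_destroy(void *ptr)
{
	(void)ptr;
	destroyed++;
}

/* Model of ptr_ring_unconsume() with the locking left out. */
static void ring_unconsume(struct ring *r, void **batch, int n,
			   void (*destroy)(void *))
{
	int head;

	/* Flush deferred zeroing so that NULL reliably means "free slot". */
	head = r->consumer_head - 1;
	while (head >= r->consumer_tail)
		r->queue[head--] = NULL;
	r->consumer_tail = r->consumer_head;

	/* Walk the batch backwards, moving head back one slot at a time. */
	while (n) {
		head = r->consumer_head - 1;
		if (head < 0)
			head = RING_SIZE - 1;
		if (r->queue[head])
			break;	/* ran into an unconsumed entry */
		r->queue[head] = batch[--n];
		r->consumer_tail = r->consumer_head = head;
	}

	/* Whatever is left in the batch has nowhere to go. */
	while (n)
		destroy(batch[--n]);
}
```

Entries are put back starting from the end of the batch, so the ring
ends up in the original consume order.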

Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 include/linux/ptr_ring.h | 55 
 1 file changed, 55 insertions(+)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 6b2e0dd..796b90f 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -403,6 +403,61 @@ static inline int ptr_ring_init(struct ptr_ring *r, int 
size, gfp_t gfp)
return 0;
 }
 
+/*
+ * Return entries into ring. Destroy entries that don't fit.
+ *
+ * Note: this is expected to be a rare slow path operation.
+ *
+ * Note: producer lock is nested within consumer lock, so if you
+ * resize you must make sure all uses nest correctly.
+ * In particular if you consume ring in interrupt or BH context, you must
+ * disable interrupts/BH when doing so.
+ */
+static inline void ptr_ring_unconsume(struct ptr_ring *r, void **batch, int n,
+ void (*destroy)(void *))
+{
+   unsigned long flags;
+   int head;
+
+   spin_lock_irqsave(&r->consumer_lock, flags);
+   spin_lock(&r->producer_lock);
+
+   if (!r->size)
+   goto done;
+
+   /*
+* Clean out buffered entries (for simplicity). This way following code
+* can test entries for NULL and if not assume they are valid.
+*/
+   head = r->consumer_head - 1;
+   while (likely(head >= r->consumer_tail))
+   r->queue[head--] = NULL;
+   r->consumer_tail = r->consumer_head;
+
+   /*
+* Go over entries in batch, start moving head back and copy entries.
+* Stop when we run into previously unconsumed entries.
+*/
+   while (n) {
+   head = r->consumer_head - 1;
+   if (head < 0)
+   head = r->size - 1;
+   if (r->queue[head]) {
+   /* This batch entry will have to be destroyed. */
+   goto done;
+   }
+   r->queue[head] = batch[--n];
+   r->consumer_tail = r->consumer_head = head;
+   }
+
+done:
+   /* Destroy all entries left in the batch. */
+   while (n)
+   destroy(batch[--n]);
+   spin_unlock(&r->producer_lock);
+   spin_unlock_irqrestore(&r->consumer_lock, flags);
+}
+
 static inline void **__ptr_ring_swap_queue(struct ptr_ring *r, void **queue,
   int size, gfp_t gfp,
   void (*destroy)(void *))
-- 
2.7.4



[PATCH net-next V4 03/10] skb_array: introduce skb_array_unconsume

2017-05-09 Thread Jason Wang
Signed-off-by: Jason Wang 
---
 include/linux/skb_array.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index f4dfade..79850b6 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -156,6 +156,12 @@ static void __skb_array_destroy_skb(void *ptr)
kfree_skb(ptr);
 }
 
+static inline void skb_array_unconsume(struct skb_array *a,
+  struct sk_buff **skbs, int n)
+{
+   ptr_ring_unconsume(&a->ring, (void **)skbs, n, __skb_array_destroy_skb);
+}
+
 static inline int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
 {
return ptr_ring_resize(&a->ring, size, gfp, __skb_array_destroy_skb);
-- 
2.7.4



[PATCH net-next V4 00/10] vhost_net batch dequeuing

2017-05-09 Thread Jason Wang
This series tries to implement rx batching for vhost-net. This is done
by batching the dequeuing from the skb_array, which is exported by the
underlying socket, and passing the skb back through msg_control to finish
the userspace copy. This is also a prerequisite for more batching
on the rx path.

Tests show at most an 8.8% improvement in rx pps on top of batch zeroing.

Please review.

Thanks

Changes from V3:
- add batch zeroing patch to fix the build warnings

Changes from V2:
- rebase to net-next HEAD
- use unconsume helpers to put skb back on releasing
- introduce and use vhost_net internal buffer helpers
- renew performance numbers on top of batch zeroing

Changes from V1:
- switch to use for() in __ptr_ring_consume_batched()
- rename peek_head_len_batched() to fetch_skbs()
- use skb_array_consume_batched() instead of
  skb_array_consume_batched_bh() since no consumer run in bh
- drop the lockless peeking patch since skb_array could be resized, so
  it's not safe to call lockless one

Jason Wang (8):
  skb_array: introduce skb_array_unconsume
  ptr_ring: introduce batch dequeuing
  skb_array: introduce batch dequeuing
  tun: export skb_array
  tap: export skb_array
  tun: support receiving skb through msg_control
  tap: support receiving skb from msg_control
  vhost_net: try batch dequing from skb array

Michael S. Tsirkin (2):
  ptr_ring: batch ring zeroing
  ptr_ring: add ptr_ring_unconsume

 drivers/net/tap.c |  25 ++-
 drivers/net/tun.c |  31 ++--
 drivers/vhost/net.c   | 117 +++--
 include/linux/if_tap.h|   5 ++
 include/linux/if_tun.h|   5 ++
 include/linux/ptr_ring.h  | 183 +++---
 include/linux/skb_array.h |  31 
 7 files changed, 370 insertions(+), 27 deletions(-)

-- 
2.7.4



[PATCH net-next V4 04/10] ptr_ring: introduce batch dequeuing

2017-05-09 Thread Jason Wang
This patch introduces a batched version of consuming: a consumer can
dequeue more than one pointer from the ring at a time. We don't care
about the reorder of reading here so no need for compiler barrier.
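
The calling convention is easiest to see from the caller's side. The
sketch below is a simplified userspace model, assuming a ring without
locking and without the deferred zeroing from patch 01; ring_consume
and ring_consume_batched are hypothetical names for this example. The
point is the return value: the number of entries actually dequeued,
which may be less than n.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8

/* Hypothetical userspace model of the ring. */
struct ring {
	void *queue[RING_SIZE];
	int consumer;
};

/* Plain one-at-a-time consume, without deferred zeroing. */
static void *ring_consume(struct ring *r)
{
	void *ptr = r->queue[r->consumer];

	if (ptr) {
		r->queue[r->consumer++] = NULL;
		if (r->consumer >= RING_SIZE)
			r->consumer = 0;
	}
	return ptr;
}

/* Same loop as __ptr_ring_consume_batched(): stop at the first empty slot. */
static int ring_consume_batched(struct ring *r, void **array, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		void *ptr = ring_consume(r);

		if (!ptr)
			break;
		array[i] = ptr;
	}
	return i;
}
```

A caller should therefore only look at array[0] through array[ret - 1]
and treat a return of 0 as an empty ring.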

Signed-off-by: Jason Wang 
---
 include/linux/ptr_ring.h | 65 
 1 file changed, 65 insertions(+)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 796b90f..d8c97ec 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -278,6 +278,22 @@ static inline void *__ptr_ring_consume(struct ptr_ring *r)
return ptr;
 }
 
+static inline int __ptr_ring_consume_batched(struct ptr_ring *r,
+void **array, int n)
+{
+   void *ptr;
+   int i;
+
+   for (i = 0; i < n; i++) {
+   ptr = __ptr_ring_consume(r);
+   if (!ptr)
+   break;
+   array[i] = ptr;
+   }
+
+   return i;
+}
+
 /*
  * Note: resize (below) nests producer lock within consumer lock, so if you
  * call this in interrupt or BH context, you must disable interrupts/BH when
@@ -328,6 +344,55 @@ static inline void *ptr_ring_consume_bh(struct ptr_ring *r)
return ptr;
 }
 
+static inline int ptr_ring_consume_batched(struct ptr_ring *r,
+  void **array, int n)
+{
+   int ret;
+
+   spin_lock(&r->consumer_lock);
+   ret = __ptr_ring_consume_batched(r, array, n);
+   spin_unlock(&r->consumer_lock);
+
+   return ret;
+}
+
+static inline int ptr_ring_consume_batched_irq(struct ptr_ring *r,
+  void **array, int n)
+{
+   int ret;
+
+   spin_lock_irq(&r->consumer_lock);
+   ret = __ptr_ring_consume_batched(r, array, n);
+   spin_unlock_irq(&r->consumer_lock);
+
+   return ret;
+}
+
+static inline int ptr_ring_consume_batched_any(struct ptr_ring *r,
+  void **array, int n)
+{
+   unsigned long flags;
+   int ret;
+
+   spin_lock_irqsave(&r->consumer_lock, flags);
+   ret = __ptr_ring_consume_batched(r, array, n);
+   spin_unlock_irqrestore(&r->consumer_lock, flags);
+
+   return ret;
+}
+
+static inline int ptr_ring_consume_batched_bh(struct ptr_ring *r,
+ void **array, int n)
+{
+   int ret;
+
+   spin_lock_bh(&r->consumer_lock);
+   ret = __ptr_ring_consume_batched(r, array, n);
+   spin_unlock_bh(&r->consumer_lock);
+
+   return ret;
+}
+
 /* Cast to structure type and call a function without discarding from FIFO.
  * Function must return a value.
  * Callers must take consumer_lock.
-- 
2.7.4



[PATCH] drm: mediatek: change the variable type of rdma threshold

2017-05-09 Thread Bibby Hsieh
At higher resolutions, the rdma threshold
variable will overflow.
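
To make the overflow concrete, the sketch below computes only the
numerator width * height * vrefresh * 4 * 7 in 32-bit and in 64-bit
arithmetic. It is an illustration under assumptions: uint32_t stands
in for the driver's unsigned int, the helper names are invented, and
the mode 3840x2160 at 60 Hz is an example chosen here, not one named in
the patch. The division in the driver happens after the multiplication,
so it cannot undo a wrapped product.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch style: every operand is 32-bit, so the product wraps. */
static uint32_t rdma_product_u32(uint32_t width, uint32_t height,
				 uint32_t vrefresh)
{
	return width * height * vrefresh * 4 * 7;
}

/* Post-patch style: widening the first operand makes the chain 64-bit. */
static uint64_t rdma_product_u64(uint32_t width, uint32_t height,
				 uint32_t vrefresh)
{
	return (uint64_t)width * height * vrefresh * 4 * 7;
}
```

For 1920x1080 at 60 Hz both helpers return 3483648000, which still fits
in 32 bits. For 3840x2160 at 60 Hz the true product is 13934592000,
while the 32-bit version wraps to 1049690112.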

Signed-off-by: Bibby Hsieh 
---
 drivers/gpu/drm/mediatek/mtk_disp_rdma.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
index 0df05f9..2718413 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
@@ -109,7 +109,7 @@ static void mtk_rdma_config(struct mtk_ddp_comp *comp, unsigned int width,
                                unsigned int height, unsigned int vrefresh,
                                unsigned int bpc)
 {
-   unsigned int threshold;
+   unsigned long long threshold;
    unsigned int reg;
 
    rdma_update_bits(comp, DISP_REG_RDMA_SIZE_CON_0, 0xfff, width);
@@ -121,10 +121,11 @@ static void mtk_rdma_config(struct mtk_ddp_comp *comp, unsigned int width,
     * output threshold to 6 microseconds with 7/6 overhead to
     * account for blanking, and with a pixel depth of 4 bytes:
     */
-   threshold = width * height * vrefresh * 4 * 7 / 100;
+   threshold = (unsigned long long)width * height * vrefresh *
+               4 * 7 / 100;
    reg = RDMA_FIFO_UNDERFLOW_EN |
          RDMA_FIFO_PSEUDO_SIZE(SZ_8K) |
-         RDMA_OUTPUT_VALID_FIFO_THRESHOLD(threshold);
+         (unsigned int)RDMA_OUTPUT_VALID_FIFO_THRESHOLD(threshold);
    writel(reg, comp->regs + DISP_REG_RDMA_FIFO_CON);
 }
 
-- 
1.9.1
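
To make the overflow concrete, here is a small userspace sketch comparing the old 32-bit expression with the widened one. The helper names `threshold_u32()` and `threshold_u64()` are made up for illustration, the constants are copied from the diff as it appears above, and the two resolutions are example inputs of my own choosing, assuming a 32-bit `unsigned int`.

```c
#include <stdint.h>

/* 32-bit arithmetic, as in the removed line: the product can wrap. */
static uint32_t threshold_u32(uint32_t width, uint32_t height,
			      uint32_t vrefresh)
{
	return width * height * vrefresh * 4 * 7 / 100;
}

/* 64-bit arithmetic, as in the added line: widen before multiplying. */
static uint64_t threshold_u64(uint32_t width, uint32_t height,
			      uint32_t vrefresh)
{
	return (uint64_t)width * height * vrefresh * 4 * 7 / 100;
}
```

With 1920x1080 at 60 Hz the product is 3483648000, which still fits in 32 bits, so both helpers return 34836480. With 3840x2160 at 60 Hz the product is 13934592000, which exceeds 2^32: the 64-bit helper returns 139345920 while the 32-bit one wraps and returns 10496901. The cast has to sit on the first operand, otherwise the multiplication is still carried out in 32 bits and only the wrapped result gets widened.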



Re: [PATCH 1/3] ptr_ring: batch ring zeroing

2017-05-09 Thread Jason Wang



On 2017-05-09 21:33, Michael S. Tsirkin wrote:

>> I love this idea.  Reviewed and discussed the idea in-person with MST
>> during netdevconf[1] at this laptop.  I promised I will also run it
>> through my micro-benchmarking[2] once I return home (hint ptr_ring gets
>> used in network stack as skb_array).
>
> I'm merging this through my tree. Any objections?



Batch dequeuing series depends on this, maybe it's better to have this 
in that series. Let me post a V4 series with this.


Thanks


Re: linux-next: build warning after merge of the block tree

2017-05-09 Thread Stephen Rothwell
Hi Markus,

On Wed, 10 May 2017 04:20:54 +0200 Markus Trippelsdorf wrote:
>
> Yes, it was missing a (void) like "(void)strlcpy(...)". But Jens
> unfortunately removed both warnings, so the following patch should now
> be enough:
> 
> diff --git a/block/elevator.c b/block/elevator.c
> index fda6be933130..dd0ed19e4fb7 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -1099,8 +1099,7 @@ ssize_t elv_iosched_store(struct request_queue *q, const char *name,
>   return count;
>  
>   strlcpy(elevator_name, skip_spaces(name), sizeof(elevator_name));
> - strstrip(elevator_name);
> - ret = __elevator_change(q, elevator_name);
> + ret = __elevator_change(q, strstrip(elevator_name));
>   if (!ret)
>   return count;

I think you (or someone) need to do a proper patch submission to Jens,
please.
-- 
Cheers,
Stephen Rothwell
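
For background on why the quoted change feeds the result of strstrip() straight into __elevator_change(): in the kernel strstrip() is marked __must_check, because it trims trailing whitespace in place but only reports the start of the trimmed string through its return value. Below is a rough userspace approximation of that behaviour. `strstrip_sketch()` is a made-up name and this is not the kernel implementation, just a sketch of the same contract.

```c
#include <ctype.h>
#include <string.h>

/*
 * Userspace approximation of the kernel's strstrip(): trims trailing
 * whitespace in place and returns a pointer past any leading whitespace.
 * The buffer itself is NOT shifted, so the return value must be used.
 */
static char *strstrip_sketch(char *s)
{
	size_t len = strlen(s);

	/* Cut trailing whitespace by moving the terminator back. */
	while (len > 0 && isspace((unsigned char)s[len - 1]))
		s[--len] = '\0';

	/* Skip leading whitespace without touching the buffer. */
	while (isspace((unsigned char)*s))
		s++;

	return s;
}
```

Given a buffer holding "  deadline \n" (an example input, not taken from the thread), the returned pointer refers to "deadline" two bytes into the buffer, while the buffer itself still starts with two spaces. A caller that ignores the return value and keeps using the original pointer therefore sees only the trailing trim.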

