Re: [PATCH v1] mlx4: remove unused fields

2016-09-29 Thread David Miller
From: David Decotigny 
Date: Wed, 28 Sep 2016 11:00:04 -0700

> From: David Decotigny 
> 
> This can also address the following UBSAN warnings:
> [   36.640343] 
> 
> [   36.648772] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx4/fw.c:857:26
> [   36.656853] shift exponent 64 is too large for 32-bit type 'int'
> [   36.663348] 
> 
> [   36.671783] 
> 
> [   36.680213] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx4/fw.c:861:27
> [   36.688297] shift exponent 35 is too large for 32-bit type 'int'
> [   36.694702] 
> 
> 
> Tested:
>   reboot with UBSAN, no warning.
> 
> Signed-off-by: David Decotigny 

Applied to net-next, thanks.
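
For readers unfamiliar with this class of warning: UBSAN reports "shift exponent N is too large" when a shift count reaches or exceeds the width of the (promoted) left operand, which is undefined behaviour in C. The following is a minimal sketch in plain C, not the mlx4 code; the helper name field_mask is hypothetical and only illustrates how such a shift can be kept in the defined range.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from mlx4: build a mask with the low
 * `bits` bits set. Shifting a 32-bit int by 32 or more, or a 64-bit
 * value by 64 or more, is undefined behaviour in C.
 */
static uint64_t field_mask(unsigned int bits)
{
	assert(bits <= 64);		/* wider masks do not fit in 64 bits */
	if (bits == 64)			/* 1 << 64 would itself be undefined */
		return UINT64_MAX;
	/* 64-bit left operand, so shift counts 0..63 are well defined */
	return (UINT64_C(1) << bits) - 1;
}
```

With a plain int as the left operand, a count of 35 (as in the second warning above) would already be out of range; widening the operand and special-casing the full width avoids both cases.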


Re: [PATCH v8 0/8] thunderbolt: Introducing Thunderbolt(TM) Networking

2016-09-29 Thread David Miller
From: Amir Levy 
Date: Wed, 28 Sep 2016 17:44:22 +0300

> This driver enables Thunderbolt Networking on non-Apple platforms
> running Linux.

Greg, any idea where this should get merged once fully vetted?  I can
take it through the net-next tree, but I'm fine with another more
appropriate tree taking it as well.

Thanks!


Re: linux-next: build failure after merge of the tty tree

2016-09-29 Thread Greg KH
On Fri, Sep 30, 2016 at 01:54:37PM +1000, Stephen Rothwell wrote:
> Hi Greg,
> 
> After merging the tty tree, today's linux-next build (arm
> multi_v7_defconfig) failed like this:
> 
> drivers/tty/serial/amba-pl011.c: In function 'pl011_console_match':
> drivers/tty/serial/amba-pl011.c:2346:44: error: passing argument 3 of 'uart_parse_earlycon' from incompatible pointer type [-Werror=incompatible-pointer-types]
>   if (uart_parse_earlycon(options, &iotype, &addr, &options))
> ^
> In file included from drivers/tty/serial/amba-pl011.c:45:0:
> include/linux/serial_core.h:384:5: note: expected 'resource_size_t * {aka unsigned int *}' but argument is of type 'long unsigned int *'
>  int uart_parse_earlycon(char *p, unsigned char *iotype, resource_size_t *addr,
>  ^
> 
> Caused by commit
> 
>   8b8f347d3a48 ("serial: pl011: add console matching function")
> 
> interacting with commit
> 
>   46e36683f433 ("serial: earlycon: Extend earlycon command line option to support 64-bit addresses")
> 
> I have reverted commit 8b8f347d3a48 for today.

Ick, sorry about that.  I wonder why I'm not seeing that same build
failure here, odd.  I'll go revert the same patch in my tree now as
well.

thanks,

greg k-h
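
The failure above is the usual out-parameter mismatch: the callee's pointer parameter changed type, while the caller still passes the address of an unsigned long. A minimal sketch of the pattern and one way to satisfy it follows. It assumes a 32-bit stand-in for resource_size_t, and the names res_size_t, parse_addr and match_addr are hypothetical; this is not the pl011 code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for resource_size_t, which is 32 or 64 bits wide depending on config. */
typedef uint32_t res_size_t;

/* Hypothetical parser with the same out-parameter shape as the quoted prototype. */
static int parse_addr(const char *s, res_size_t *addr)
{
	char *end;
	unsigned long long v;

	assert(s && addr);
	v = strtoull(s, &end, 0);	/* base 0 accepts decimal, 0x..., 0... */
	if (end == s)
		return -1;		/* nothing was parsed */
	*addr = (res_size_t)v;
	return 0;
}

/* Caller: declare the local with the callee's type, then widen explicitly. */
static int match_addr(const char *s, unsigned long *out)
{
	res_size_t addr;	/* not `unsigned long`: the pointer types must match */

	if (parse_addr(s, &addr))
		return -1;
	*out = addr;
	return 0;
}
```

Passing &addr with addr declared as unsigned long would compile on configurations where the two types happen to coincide and fail on the others, which may be why the error does not show up in every build.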


Re: linux-next: manual merge of the tty tree with the arm64 tree

2016-09-29 Thread Greg KH
On Fri, Sep 30, 2016 at 01:38:22PM +1000, Stephen Rothwell wrote:
> Hi Greg,
> 
> Today's linux-next merge of the tty tree got conflicts in:
> 
>   arch/arm64/Kconfig
> 
> between commit:
> 
>   1d8f51d41fc7 ("arm/arm64: arch_timer: Use archdata to indicate vdso suitability")
> 
> from the arm64 tree and commit:
> 
>   888125a71298 ("ARM64: ACPI: enable ACPI_SPCR_TABLE")
> 
> from the tty tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.

Looks good to me.


Re: [PATCH] ext4: Check for encryption feature before fscrypt_process_policy()

2016-09-29 Thread Theodore Ts'o
On Thu, Sep 22, 2016 at 08:50:54AM +0200, Richard Weinberger wrote:
> ...otherwise a user can enable encryption for certain files even
> when the filesystem is unable to support it.
> Such a case would be a filesystem created by mkfs.ext4's default
> settings, 1KiB block size. Ext4 supports encryption only when block size
> is equal to PAGE_SIZE.
> But this constraint is only checked when the encryption feature flag
> is set.
> 
> Signed-off-by: Richard Weinberger 

Thanks, applied.

- Ted


Re: [PATCH locking/Documentation 1/2] Add note of release-acquire store vulnerability

2016-09-29 Thread Boqun Feng
Hi Paul,

On Thu, Sep 29, 2016 at 10:23:22AM -0700, Paul E. McKenney wrote:
> On Thu, Sep 29, 2016 at 06:10:37PM +0100, Will Deacon wrote:
> > On Thu, Sep 29, 2016 at 09:43:53AM -0700, Paul E. McKenney wrote:
> > > On Thu, Sep 29, 2016 at 05:03:08PM +0100, Will Deacon wrote:
> > > > On Thu, Sep 29, 2016 at 05:58:17PM +0200, Peter Zijlstra wrote:
> > > > > On Thu, Sep 29, 2016 at 08:54:01AM -0700, Paul E. McKenney wrote:
> > > > > > If two processes are related by a RELEASE+ACQUIRE pair, ordering can be
> > > > > > broken if a third process overwrites the value written by the RELEASE
> > > > > > operation before the ACQUIRE operation has a chance of reading it.
> > > > > > This commit therefore updates the documentation to call this vulnerability
> > > > > > out explicitly.
> > > > > > 
> > > > > > Reported-by: Alan Stern 
> > > > > > Signed-off-by: Paul E. McKenney 
> > > > > 
> > > > > > + However, please note that a chain of RELEASE+ACQUIRE pairs may be
> > > > > > + broken by a store by another thread that overwrites the RELEASE
> > > > > > + operation's store before the ACQUIRE operation's read.
> > > > > 
> > > > > This is the powerpc lwsync quirk, right? Where the barrier disappears
> > > > > when it loses the store.
> > > > > 
> > > > > Or is there more to it? It's not entirely clear from the Changelog, which
> > > > > I feel should describe the reason for the behaviour.
> > > > 
> > > > If I've grokked it correctly, it's for cases like:
> > > > 
> > > > 
> > > > P0:
> > > > Wx=1
> > > > WyRel=1
> > > > 
> > > > P1:
> > > > Wy=2
> > > > 
> > > > P2:
> > > > RyAcq=2
> > > > Rx=0
> > > > 
> > > > Final value of y is 2.
> > > > 
> > > > 
> > > > This is permitted on arm64. If you make P1's store a store-release, then
> > > > it's forbidden, but I suspect that's not generally true of the kernel
> > > > memory model.
> > > 
> > > That is the one!  And to Peter's point, powerpc does the same for the
> > > example as shown.  However, on powerpc, upgrading P1's store to release
> > > has no effect because there is no earlier access for the resulting
> > > lwsync to influence.  For whatever it might be worth, C11 won't guarantee
> > > ordering in that case, either.  Nor will the current Linux-kernel memory
> > > model.  (Yes, I did just try it to make sure.  Why do you ask?)
> > > 
> > > So you guys are fishing for an expanded commit log, for example, like
> > > the following?  ;-)
> > > 
> > >   Thanx, Paul
> > > 
> > > 
> > > 
> > > If two processes are related by a RELEASE+ACQUIRE pair, ordering can be
> > > broken if a third process overwrites the value written by the RELEASE
> > > operation before the ACQUIRE operation has a chance of reading it, for
> > > example:
> > > 
> > >   P0(int *x, int *y)
> > >   {
> > >   WRITE_ONCE(*x, 1);
> > >   smp_wmb();
> > >   smp_store_release(y, 1);
> > >   }
> > > 
> > >   P1(int *y)
> > >   {
> > >   smp_store_release(y, 2);
> > >   }
> > > 
> > >   P2(int *x, int *y)
> > >   {
> > >   r1 = smp_load_acquire(y);
> > >   r2 = READ_ONCE(*x);
> > >   }
> > > 
> > > Both ARM and powerpc allow the "after the dust settles" outcome (r1=2 &&
> > > r2=0), as does the current version of the early prototype Linux-kernel
> > > memory model.
> > 
> > FWIW, ARM doesn't allow this and arm64 only allows it if P1 uses WRITE_ONCE
> > instead of store-release.
> 
> Good catch, apologies for the error.  The following, then?
> 
>   Thanx, Paul
> 
> 
> 
> If two processes are related by a RELEASE+ACQUIRE pair, ordering can be
> broken if a third process overwrites the value written by the RELEASE
> operation before the ACQUIRE operation has a chance of reading it, for
> example:
> 
>   P0(int *x, int *y)
>   {
>   WRITE_ONCE(*x, 1);
>   smp_wmb();
   ^^^

What is this smp_wmb() for?

>   smp_store_release(y, 1);
>   }
> 
>   P1(int *y)
>   {
>   WRITE_ONCE(*y, 2);

If we change this WRITE_ONCE to a relaxed atomic operation (e.g.,
xchg_relaxed(y, 2)), both herd and ppcmem say the exists-clause "y = 2
/\ 2:r1 = 2 /\ 2:r2 = 0" is not triggered on PPC.

I guess we will get the same behavior on ARM/ARM64, Will?

If a normal store can break the chain while an RMW atomic cannot, do we
want to call that out in the document and build our memory model around
it?

I ask because the spin_unlock_wait() fix kind of relies on this, so it
would be good for us to clarify it.

Regards,
Boqun

>   }
> 
>   P2(int *x, int *y)
>   {
>   r1 = smp_load_acquire(y);
>   
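
For reference, the litmus test discussed in this thread can be approximated in userspace C11. This is only a sketch under stated assumptions: WRITE_ONCE() and READ_ONCE() are modelled as relaxed atomic accesses, which is an approximation rather than the kernel's definition, and the function names p0, p1 and p2 are ours. C11 likewise gives no ordering guarantee when P2 reads the value stored by P1.

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared locations from the litmus test; both start at zero. */
static atomic_int x, y;

/* P0: write x, then release-store y = 1. */
static void p0(void)
{
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	atomic_store_explicit(&y, 1, memory_order_release);
}

/* P1: a plain (relaxed) store that may overwrite P0's release store. */
static void p1(void)
{
	atomic_store_explicit(&y, 2, memory_order_relaxed);
}

/* P2: acquire-load y into *r1, then read x into *r2. */
static void p2(int *r1, int *r2)
{
	assert(r1 && r2);
	*r1 = atomic_load_explicit(&y, memory_order_acquire);
	*r2 = atomic_load_explicit(&x, memory_order_relaxed);
}
```

Calling these functions one after another from a single thread can only produce sequentially consistent results. Observing r1 == 2 with r2 == 0 would require running them concurrently on weakly ordered hardware, or feeding the litmus test to a tool such as herd. If P2 reads 1 from y, the release-acquire pairing guarantees that it also sees x == 1.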

Re: [PATCH v2 1/1] s390/spinlock: Provide vcpu_is_preempted

2016-09-29 Thread Boqun Feng
On Fri, Sep 30, 2016 at 12:49:52PM +0800, Pan Xinhui wrote:
> 
> 
> On 2016/9/29 23:51, Christian Borntraeger wrote:
> > this implements the s390 backend for commit
> > "kernel/sched: introduce vcpu preempted check interface"
> > by reworking the existing smp_vcpu_scheduled into
> > arch_vcpu_is_preempted. We can then also get rid of the
> > local cpu_is_preempted function by moving the
> > CIF_ENABLED_WAIT test into arch_vcpu_is_preempted.
> > 
> > Signed-off-by: Christian Borntraeger 
> > ---
> 
> hi, Christian
>   thanks for your patch!
> 
> >  arch/s390/include/asm/spinlock.h |  3 +++
> >  arch/s390/kernel/smp.c   |  9 +++--
> >  arch/s390/lib/spinlock.c | 25 -
> >  3 files changed, 18 insertions(+), 19 deletions(-)
> > 
> > diff --git a/arch/s390/include/asm/spinlock.h 
> > b/arch/s390/include/asm/spinlock.h
> > index 63ebf37..e16e02f 100644
> > --- a/arch/s390/include/asm/spinlock.h
> > +++ b/arch/s390/include/asm/spinlock.h
> > @@ -21,6 +21,9 @@ _raw_compare_and_swap(unsigned int *lock, unsigned int 
> > old, unsigned int new)
> > return __sync_bool_compare_and_swap(lock, old, new);
> >  }
> > 
> > +bool arch_vcpu_is_preempted(int cpu);
> > +#define vcpu_is_preempted arch_vcpu_is_preempted
> > +
> >  /*
> >   * Simple spin lock operations.  There are two variants, one clears IRQ's
> >   * on the local processor, one does not.
> > diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
> > index 7b89a75..4aadd16 100644
> > --- a/arch/s390/kernel/smp.c
> > +++ b/arch/s390/kernel/smp.c
> > @@ -376,10 +376,15 @@ int smp_find_processor_id(u16 address)
> > return -1;
> >  }
> > 
> > -int smp_vcpu_scheduled(int cpu)
> root@ltcalpine2-lp13:~/linux# git grep -wn smp_vcpu_scheduled arch/s390/
> arch/s390/include/asm/smp.h:34:extern int smp_vcpu_scheduled(int cpu);
> arch/s390/include/asm/smp.h:56:static inline int smp_vcpu_scheduled(int cpu) 
> { return 1; }
> arch/s390/kernel/smp.c:371:int smp_vcpu_scheduled(int cpu)
> arch/s390/lib/spinlock.c:44:if (smp_vcpu_scheduled(cpu))
> 
> > +bool arch_vcpu_is_preempted(int cpu)
> >  {
> > -   return pcpu_running(pcpu_devices + cpu);
> > +   if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
> > +   return false;
> > +   if (pcpu_running(pcpu_devices + cpu))
> > +   return false;
> I saw that smp_vcpu_scheduled() always returns true on a !SMP system.
> 
> maybe we can do something similar, like below:
> 
> #ifndef CONFIG_SMP
> static inline bool arch_vcpu_is_preempted(int cpu) { return !test_cpu_flag_of(CIF_ENABLED_WAIT, cpu); }
> #else
> ...
> 
> but I can't help thinking that if this is a !SMP system, maybe we only need
> #ifndef CONFIG_SMP
> static inline bool arch_vcpu_is_preempted(int cpu) { return false; }
> #else

Why do we need a vcpu_is_preempted() implementation for UP? Where will
you use it?

Regards,
Boqun

> ...
> 
> 
> thanks
> xinhui
> 
> > +   return true;
> >  }
> > +EXPORT_SYMBOL(arch_vcpu_is_preempted);
> > 
> >  void smp_yield_cpu(int cpu)
> >  {
> > diff --git a/arch/s390/lib/spinlock.c b/arch/s390/lib/spinlock.c
> > index e5f50a7..e48a48e 100644
> > --- a/arch/s390/lib/spinlock.c
> > +++ b/arch/s390/lib/spinlock.c
> > @@ -37,15 +37,6 @@ static inline void _raw_compare_and_delay(unsigned int 
> > *lock, unsigned int old)
> > asm(".insn rsy,0xeb22,%0,0,%1" : : "d" (old), "Q" (*lock));
> >  }
> > 
> > -static inline int cpu_is_preempted(int cpu)
> > -{
> > -   if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
> > -   return 0;
> > -   if (smp_vcpu_scheduled(cpu))
> > -   return 0;
> > -   return 1;
> > -}
> > -
> >  void arch_spin_lock_wait(arch_spinlock_t *lp)
> >  {
> > unsigned int cpu = SPINLOCK_LOCKVAL;
> > @@ -62,7 +53,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
> > continue;
> > }
> > /* First iteration: check if the lock owner is running. */
> > -   if (first_diag && cpu_is_preempted(~owner)) {
> > +   if (first_diag && arch_vcpu_is_preempted(~owner)) {
> > smp_yield_cpu(~owner);
> > first_diag = 0;
> > continue;
> > @@ -81,7 +72,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
> >  * yield the CPU unconditionally. For LPAR rely on the
> >  * sense running status.
> >  */
> > -   if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
> > +   if (!MACHINE_IS_LPAR || arch_vcpu_is_preempted(~owner)) {
> > smp_yield_cpu(~owner);
> > first_diag = 0;
> > }
> > @@ -108,7 +99,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, 
> > unsigned long flags)
> > continue;
> > }
> > /* Check if the lock owner is running. */
> > -   if (first_diag && cpu_is_preempted(~owner)) {
> > +   if (first_diag && arch_vcpu_is_preempted(~owner)) {
> >

Re: linux-next: manual merge of the tty tree with the pm tree

2016-09-29 Thread Greg KH
On Fri, Sep 30, 2016 at 01:42:23PM +1000, Stephen Rothwell wrote:
> Hi Greg,
> 
> Today's linux-next merge of the tty tree got a conflict in:
> 
>   include/linux/acpi.h
> 
> between commit:
> 
>   058dfc767008 ("ACPI / watchdog: Add support for WDAT hardware watchdog")
> 
> from the pm tree and commit:
> 
>   ad1696f6f09d ("ACPI: parse SPCR and enable matching console")
> 
> from the tty tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.

Looks good to me, thanks.

greg k-h


Re: linux-next: manual merge of the tty tree with the v4l-dvb tree

2016-09-29 Thread Greg KH
On Fri, Sep 30, 2016 at 01:33:19PM +1000, Stephen Rothwell wrote:
> Hi Greg,
> 
> Today's linux-next merge of the tty tree got a conflict in:
> 
>   MAINTAINERS
> 
> between commit:
> 
>   71fb2c74287d ("[media] MAINTAINERS: atmel-isc: add entry for Atmel ISC")
> 
> from the v4l-dvb tree and commit:
> 
>   5615c3715749 ("MAINTAINERS: update entry for atmel_serial driver")
> 
> from the tty tree.

Ick, MAINTAINERS is a tough thing to merge at times, sorry, patch looks
fine to me.

greg k-h


Re: [PATCH 2/3] zram: support page-based parallel write

2016-09-29 Thread Minchan Kim
Hi Sergey,

On Thu, Sep 29, 2016 at 12:18:31PM +0900, Sergey Senozhatsky wrote:
> Hello Minchan,
> 
> On (09/22/16 15:42), Minchan Kim wrote:
> > zram supports stream-based parallel compression. IOW, it can support
> > parallel compression on an SMP system only if each CPU has a stream.
> > For example, assuming 4 CPU system, there are 4 sources for compressing
> > in system and each source must be located in each CPUs for full
> > parallel compression.
> > 
> > So, if there is *one* stream in the system, it cannot be compressed
> > in parallel although the system supports multiple CPUs. This patch
> > aims to overcome such weakness.
> > 
> > The idea is to use multiple background threads to compress pages
> > in idle CPU and foreground just queues BIOs without interrupting
> > while other CPUs consumes pages in BIO to compress.
> > It means zram begins to support asynchronous writeback to increase
> > write bandwidth.
> > 
> > 1) test cp A to B as an example of single stream compression and
> > enhanced 36%.
> > 
> > x86_64, 4 CPU
> > Copy kernel source to zram
> > old: 3.4s, new: 2.2s
> > 
> > 2) test per-process reclaim to swap: 524M
> > x86_64, 4 CPU:
> > old: 1.2s new: 0.3s
> > 
> > 3) FIO benchmark
> > random read was worse so it supports only write at the moment.
> > Later, we might revisit asynchronous read.
> 
> 
> sorry for long reply.

Never mind. Better a late response than no reply. ;)
And thanks for the testing!

> 
> frankly speaking, sorry, I'm very skeptical about the patch set.

It seems I was not a good sales guy. Let me try again.

> 
> from your tests it seems that only a tiny corner case can gain some
> extra performance: when we have SMP system with multiple CPUs, but
> *guaranteed* only one process doing *only* one type of requests.
> as soon as this process starts to do things simultaneously (like
> mixed READ-WRITE) _or_ there are several processes: we are done. and
> for that tiny corner case we are about to add a complex logic and a
> big pile of code. I'm quite sure I'll never enable CONFIG_ZRAM_ASYNC_IO.
> why would you enable it? I mean what setups you are looking at that will
> benefit? hosting a CVS repository? :) just kidding.

Could you retest the benchmark without direct IO? Instead of DIO,
I used fsync_on_close to flush buffered IO.

DIO isn't a normal workload compared to buffered IO. The reason I used DIO
for the zram benchmark was that it's a handy way to push IO to the block
layer without page cache noise.
If we use buffered IO without any flushing, the result is fake because the
dirty pages are merely queued in the page cache.
I think you already know that very well, so no need to explain any more. :)

The more important thing is that current zram is poor at parallel IO.
Let's think about two use cases, zram-swap and zram-fs.

1) zram-swap

Parallel IO can be done only when every CPU has a reclaim context.
IOW,

1. kswapd on CPU 0
2. a process in direct reclaim on CPU 1
3. a process in direct reclaim on CPU 2
4. a process in direct reclaim on CPU 3

I don't think that's a usual workload. Most of the time, an embedded
platform workload has one kswapd and one process in direct reclaim. The
point is that we cannot use the full bandwidth.

2) zram-fs

Currently, there is one work item per bdi. So, without fsync (and friends),
every IO submission is done via that work item on a worker thread.
It means the IO can't be parallelized. If we use fsync, it can be
parallelized, but that depends on the sync granularity.
For example, if your test application calls fsync directly, the IO
is done in the context of the CPU your application is running on. So,
if you have 4 test applications, every CPU can be utilized.
However, if your test applications don't call fsync directly and the
parent process calls sync for all the child test applications, the
IO is done in 2 CPU contexts (one is the parent process context and
the other is the bdi work context).
So, in summary, we are poor at parallel IO workloads without
sync or DIO.

> 
> are there any block devices being specifically optimized for a "one
> process doing one OP" cases?

Maybe the above (i.e., we are poor at parallel IO workloads without sync
or DIO) is enough justification. But if we need more :)
we have used per-process reclaim (e.g., memcg or out-of-tree per-process
reclaim, whatever). It runs in direct reclaim context, so writeback
can't use the full bandwidth of every CPU.
NR_CPU times faster would be great. I ran an experiment on my ARM
machine (4 CPUs): 256 reclaims took 6 sec with the old code, but with
async IO they took 1.6 sec.

Please consider it again.

> 
> my tests show a dramatic performance drop with NEW zram.
> even "one" process case (one fio job) is almost x3 slower.
> sometimes the WRITE test case even goes from MB/s to KB/s
> 
>   WRITE:  3181.4MB/s→  948111KB/s
> 
> 
> I've attached the .config
> 
> 
> ENV
> ===
> 
>   x86_64 SMP (4 CPUs), "bare zram" 2g, lzo, static compression buffer.
> 
> 
> TEST COMMAND
> 
> 
>   ZRAM_SIZE=2G ZRAM_COMP_ALG=lzo LOG_SUFFIX={NEW, OLD} FIO_LOOPS=2 
> 

Re: [PATCH v3 2/2] usb: dwc3: Wait for control tranfer completed when stopping gadget

2016-09-29 Thread Baolin Wang
Hi Felipe,

On 19 September 2016 at 19:52, Baolin Wang  wrote:
> When we change the USB function with configfs dynamically, we can hit this
> situation: one core is doing a control transfer while another core is trying to
> unregister the USB gadget from userspace. We must wait for this control
> transfer to complete, or setting the DEVCTRLHLT flag will hang the controller.
>

Any comments on this new version of the patchset? Thanks.

> Signed-off-by: Baolin Wang 
> ---
>  drivers/usb/dwc3/core.h   |2 ++
>  drivers/usb/dwc3/ep0.c|2 ++
>  drivers/usb/dwc3/gadget.c |   23 +++
>  3 files changed, 27 insertions(+)
>
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index b2317e7..01a6fbd 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -745,6 +745,7 @@ struct dwc3_scratchpad_array {
>   * @ep0_usb_req: dummy req used while handling STD USB requests
>   * @ep0_bounce_addr: dma address of ep0_bounce
>   * @scratch_addr: dma address of scratchbuf
> + * @ep0_in_setup: One control tranfer is completed and enter setup phase
>   * @lock: for synchronizing
>   * @dev: pointer to our struct device
>   * @xhci: pointer to our xHCI child
> @@ -843,6 +844,7 @@ struct dwc3 {
> dma_addr_t  ep0_bounce_addr;
> dma_addr_t  scratch_addr;
> struct dwc3_request ep0_usb_req;
> +   struct completion   ep0_in_setup;
>
> /* device lock */
> spinlock_t  lock;
> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
> index fe79d77..06c167a 100644
> --- a/drivers/usb/dwc3/ep0.c
> +++ b/drivers/usb/dwc3/ep0.c
> @@ -311,6 +311,8 @@ void dwc3_ep0_out_start(struct dwc3 *dwc)
> ret = dwc3_ep0_start_trans(dwc, 0, dwc->ctrl_req_addr, 8,
> DWC3_TRBCTL_CONTROL_SETUP, false);
> WARN_ON(ret < 0);
> +
> +   complete(&dwc->ep0_in_setup);
>  }
>
>  static struct dwc3_ep *dwc3_wIndex_to_dep(struct dwc3 *dwc, __le16 wIndex_le)
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index ca2ae5b..3a30d51 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1437,6 +1437,15 @@ static int dwc3_gadget_run_stop(struct dwc3 *dwc, int 
> is_on, int suspend)
> if (pm_runtime_suspended(dwc->dev))
> return 0;
>
> +   /*
> +* Per databook, when we want to stop the gadget, if a control 
> transfer
> +* is still in process, complete it and get the core into setup phase.
> +*/
> +   if (!is_on && dwc->ep0state != EP0_SETUP_PHASE) {
> +   reinit_completion(&dwc->ep0_in_setup);
> +   return -EBUSY;
> +   }
> +
> reg = dwc3_readl(dwc->regs, DWC3_DCTL);
> if (is_on) {
> if (dwc->revision <= DWC3_REVISION_187A) {
> @@ -1487,10 +1496,22 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, 
> int is_on)
>
> is_on = !!is_on;
>
> +try_again:
> spin_lock_irqsave(&dwc->lock, flags);
> ret = dwc3_gadget_run_stop(dwc, is_on, false);
> spin_unlock_irqrestore(&dwc->lock, flags);
>
> +   if (ret == -EBUSY) {
> +   ret = wait_for_completion_timeout(&dwc->ep0_in_setup,
> + msecs_to_jiffies(500));
> +   if (ret == 0) {
> +   dev_err(dwc->dev, "timeout to stop gadget.\n");
> +   return -ETIMEDOUT;
> +   } else {
> +   goto try_again;
> +   }
> +   }
> +
> return ret;
>  }
>
> @@ -2914,6 +2935,8 @@ int dwc3_gadget_init(struct dwc3 *dwc)
> goto err4;
> }
>
> +   init_completion(&dwc->ep0_in_setup);
> +
> dwc->gadget.ops = &dwc3_gadget_ops;
> dwc->gadget.speed   = USB_SPEED_UNKNOWN;
> dwc->gadget.sg_supported= true;
> --
> 1.7.9.5
>



-- 
Baolin.wang
Best Regards
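
The control flow of the pullup path in the patch above (try to stop, wait for the setup phase on -EBUSY, retry, give up on timeout) can be sketched without any kernel APIs. Everything here is a hypothetical model: struct fake_ctrl and its counters replace the real controller state and the completion, and the goto-based retry is written as a loop.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical controller model, not the dwc3 driver state. */
struct fake_ctrl {
	int transfers_pending;	/* control-transfer stages still to finish */
	int wait_budget;	/* how many waits succeed before one times out */
};

/* Stand-in for dwc3_gadget_run_stop(): refuse to stop while ep0 is busy. */
static int try_stop(const struct fake_ctrl *c)
{
	return c->transfers_pending ? -EBUSY : 0;
}

/* Stand-in for wait_for_completion_timeout(): 0 means the wait timed out. */
static int wait_for_setup(struct fake_ctrl *c)
{
	assert(c->wait_budget >= 0);
	if (c->wait_budget == 0)
		return 0;
	c->wait_budget--;
	c->transfers_pending--;	/* one stage completed while we waited */
	return 1;
}

/* Retry the stop until it succeeds or a wait times out. */
static int pullup_off(struct fake_ctrl *c)
{
	int ret;

	while ((ret = try_stop(c)) == -EBUSY) {
		if (wait_for_setup(c) == 0)
			return -ETIMEDOUT;
	}
	return ret;
}
```

In the real patch the stop attempt runs under the device spinlock and the wait happens outside it, which this single-threaded model does not capture.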


Re: [PATCH net-next 0/6] rxrpc: Fixes and adjustments

2016-09-29 Thread David Miller
From: David Howells <dhowe...@redhat.com>
Date: Thu, 29 Sep 2016 23:15:27 +0100

> This set of patches contains some fixes and adjustments:
 ...
> The patches can be found here also:
> 
>   
> http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-rewrite
> 
> Tagged thusly:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
>   rxrpc-rewrite-20160929

Pulled, thanks David.


Re: [PATCH V2 for-next 0/8] Bug Fixes and Code Improvement in HNS driver

2016-09-29 Thread David Miller
From: Salil Mehta 
Date: Thu, 29 Sep 2016 18:09:08 +0100

> This patch-set introduces fix to some Bugs, potential problems
> and code improvements identified during internal review and
> testing of Hisilicon Network Subsystem driver.
> 
> Submit Change
> V1->V2: This addresses the feedbacks provided by David Miller
> and Doug Ledford

So Doug, my understanding is that if this makes it through review,
it is going to be merged into your tree, you prepare a
branch for me, and then I pull from that?

Thanks in advance.


Re: pull-request: wireless-drivers-next 2016-09-29

2016-09-29 Thread David Miller
From: Kalle Valo 
Date: Thu, 29 Sep 2016 19:57:28 +0300

> this should be the last wireless-drivers-next pull request for 4.9, from
> now on only important bugfixes. Nothing really special stands out,
> iwlwifi being most active but other drivers also getting attention. More
> details in the signed tag. Please let me know if there are any problems.

Pulled, thanks Kalle.

> Or actually I had one problem. While doing a test merge I noticed that
> net-next fails to compile for me, but I don't think this is anything
> wireless related:
> 
>   CC  net/netfilter/core.o
> net/netfilter/core.c: In function 'nf_set_hooks_head':
> net/netfilter/core.c:96:149: error: 'struct net_device' has no member named 'nf_hooks_ingress'

Yes, I am aware of this build issue and will tackle it myself if someone
doesn't beat me to it.

Thanks again.


Re: [PATCH 3.12 000/119] 3.12.64-stable review

2016-09-29 Thread Mike Galbraith
This one seems to be missing.

135e8c9250dd sched/core: Fix a race between try_to_wake_up() and a woken up task


Re: [PATCH net-next 00/10] net: dsa: mv88e6xxx: Global (1) cosmetics

2016-09-29 Thread David Miller
From: Vivien Didelot 
Date: Thu, 29 Sep 2016 12:21:52 -0400

> The Global (1) internal SMI device of Marvell switches is a set of
> registers providing support to different units for MAC addresses (ATU),
> VLANs (VTU), PHY polling (PPU), etc.
 ...

Looks like a very nice set of cleanups to me.

Series applied, thanks!


Re: [PATCH v2 3/3] net: make net namespace sysctls belong to container's owner

2016-09-29 Thread David Miller
From: Dmitry Torokhov 
Date: Thu, 29 Sep 2016 08:46:05 -0700

> Hi David,
> 
> On Wed, Aug 10, 2016 at 2:36 PM, Dmitry Torokhov
>  wrote:
>> If net namespace is attached to a user namespace let's make container's
>> root owner of sysctls affecting said network namespace instead of global
>> root.
>>
>> This also allows us to clean up net_ctl_permissions() because we do not
>> need to fudge permissions anymore for the container's owner since it now
>> owns the objects in question.
>>
>> Acked-by: "Eric W. Biederman" 
>> Signed-off-by: Dmitry Torokhov 
> 
> I was looking at linux-next today, and I noticed that, when you merged
> my patch, you basically reverted the following commit:
> 
> commit d6e0d306449bcb5fa3c80e7a3edf11d45abf9ae9
> Author: Tyler Hicks 
> Date:   Thu Jun 2 23:43:22 2016 -0500
> 
> net: Use ns_capable_noaudit() when determining net sysctl permissions

Please send me a fixup patch for this, sorry.


Re: [PATCH v3 1/2] config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc

2016-09-29 Thread David Miller
From: Babu Moger 
Date: Thu, 29 Sep 2016 08:53:24 -0500

> 
> On 9/28/2016 3:39 AM, Peter Zijlstra wrote:
>> On Tue, Sep 27, 2016 at 12:33:27PM -0700, Babu Moger wrote:
>>> This new config parameter limits the space used for "Lock debugging:
>>> prove locking correctness" by about 4MB. The current sparc systems have
>>> the limitation of 32MB size for kernel size including .text, .data and
>>> .bss sections. With PROVE_LOCKING feature, the kernel size could grow
>>> beyond this limit and causing system boot-up issues. With this option,
>>> kernel limits the size of the entries of lock_chains, stack_trace etc.,
>>> so that kernel fits in required size limit. This is not visible to user
>>> and only used for sparc.
>>>
>>> Signed-off-by: Babu Moger 
>> You forgot to Cc Dave, and since you're touching sparc I need an Ack
>> from him before I can queue this.
> Dave, Can you please take a look at the patch. Please ack it if it
> looks good.

I am travelling and will look at it when I get a chance.


Re: [PATCH v1 12/12] mm: ppc64: Add THP migration support for ppc64.

2016-09-29 Thread Aneesh Kumar K.V
zi@sent.com writes:

> From: Zi Yan 
>
> Signed-off-by: Zi Yan 
> ---
>  arch/powerpc/Kconfig |  4 
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 23 +++
>  2 files changed, 27 insertions(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 927d2ab..84ffd4c 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -553,6 +553,10 @@ config ARCH_SPARSEMEM_DEFAULT
>  config SYS_SUPPORTS_HUGETLBFS
>   bool
>  
> +config ARCH_ENABLE_THP_MIGRATION
> + def_bool y
> + depends on PPC64 && TRANSPARENT_HUGEPAGE && MIGRATION
> +
>  source "mm/Kconfig"
>  
>  config ARCH_MEMORY_PROBE
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
> b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 263bf39..9dee0467 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -521,7 +521,9 @@ static inline bool pte_user(pte_t pte)
>   * Clear bits not found in swap entries here.
>   */
>  #define __pte_to_swp_entry(pte)  ((swp_entry_t) { pte_val((pte)) & 
> ~_PAGE_PTE })
> +#define __pmd_to_swp_entry(pte)  ((swp_entry_t) { pmd_val((pte)) & 
> ~_PAGE_PTE })
>  #define __swp_entry_to_pte(x)__pte((x).val | _PAGE_PTE)
> +#define __swp_entry_to_pmd(x)__pmd((x).val | _PAGE_PTE)


We definitely need a comment around that. This will work only for the 64K
Linux page size; on 4K we may consider it a hugepd directory entry. But
this should be OK because we support THP only with the 64K Linux page size.
Hence my suggestion to add proper comments or move it to the right headers.


>  
>  #ifdef CONFIG_MEM_SOFT_DIRTY
>  #define _PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
> @@ -662,6 +664,10 @@ static inline int pmd_bad(pmd_t pmd)
>   return radix__pmd_bad(pmd);
>   return hash__pmd_bad(pmd);
>  }
> +static inline int __pmd_present(pmd_t pte)
> +{
> + return !!(pmd_val(pte) & _PAGE_PRESENT);
> +}
>  
>  static inline void pud_set(pud_t *pudp, unsigned long val)
>  {
> @@ -850,6 +856,23 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>  #define pmd_soft_dirty(pmd)pte_soft_dirty(pmd_pte(pmd))
>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
>  #define pmd_clear_soft_dirty(pmd) pte_pmd(pte_clear_soft_dirty(pmd_pte(pmd)))
> +
> +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> +static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd)
> +{
> + return pte_pmd(pte_swp_mksoft_dirty(pmd_pte(pmd)));
> +}
> +
> +static inline int pmd_swp_soft_dirty(pmd_t pmd)
> +{
> + return pte_swp_soft_dirty(pmd_pte(pmd));
> +}
> +
> +static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
> +{
> + return pte_pmd(pte_swp_clear_soft_dirty(pmd_pte(pmd)));
> +}
> +#endif
>  #endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
>  
>  #ifdef CONFIG_NUMA_BALANCING

Did we test this with the Radix config? If not, I suggest we hold off on
the ppc64 patch and you can merge the rest of the changes.

-aneesh
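
To make the role of the marker bit concrete, here is a sketch of the round trip that the __pmd_to_swp_entry()/__swp_entry_to_pmd() macros perform. The types and the bit position are assumptions chosen for illustration only (FAKE_PAGE_PTE is not the real _PAGE_PTE value), and the sketch_ prefixed names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: illustrative bit position, not the real _PAGE_PTE. */
#define FAKE_PAGE_PTE (UINT64_C(1) << 62)

typedef struct { uint64_t val; } sketch_swp_entry_t;
typedef struct { uint64_t pmd; } sketch_pmd_t;

/* Strip the marker bit to recover the swap entry stored in a PMD. */
static sketch_swp_entry_t sketch_pmd_to_swp_entry(sketch_pmd_t pmd)
{
	return (sketch_swp_entry_t){ pmd.pmd & ~FAKE_PAGE_PTE };
}

/* Store a swap entry in a PMD, setting the marker bit. */
static sketch_pmd_t sketch_swp_entry_to_pmd(sketch_swp_entry_t e)
{
	/* the entry must not already use the bit reserved for the marker */
	assert((e.val & FAKE_PAGE_PTE) == 0);
	return (sketch_pmd_t){ e.val | FAKE_PAGE_PTE };
}
```

The review comment above is about what else such a marked entry could be mistaken for at the PMD level on other page-size configurations, which a model this small cannot show.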



Re: [PATCH] spmi: regmap: enable userspace writes

2016-09-29 Thread kgunda

On 2016-09-29 23:30, Mark Brown wrote:
> On Thu, Sep 29, 2016 at 05:06:26PM +0530, Kiran Gunda wrote:
>
>> -#undef REGMAP_ALLOW_WRITE_DEBUGFS
>> +#define REGMAP_ALLOW_WRITE_DEBUGFS
>
> This is completely inappropriate for upstream, if you need to do
> debugging on your platform you can enable this locally but enabling
> random writes from userspace to any regmap device is really not a good
> idea for system stability or robustness.

Sure. I will remove this change and send the next version only to update
the spmi device name.


Re: [PATCH v3 0/4] implement vcpu preempted check

2016-09-29 Thread Pan Xinhui



在 2016/9/29 18:31, Peter Zijlstra 写道:

On Thu, Sep 29, 2016 at 12:23:19PM +0200, Christian Borntraeger wrote:

On 09/29/2016 12:10 PM, Peter Zijlstra wrote:

On Thu, Jul 21, 2016 at 07:45:10AM -0400, Pan Xinhui wrote:

change from v2:
no code change, fix typos, update some comments

change from v1:
a simpler definition of default vcpu_is_preempted
skip machine type check on ppc, and add config. remove dedicated macro.
add one patch to drop overload of rwsem_spin_on_owner and 
mutex_spin_on_owner.
add more comments
thanks boqun and Peter's suggestion.

This patch set aims to fix lock holder preemption issues.


So I really like the concept, but I would also really like to see
support for more hypervisors included before we can move forward with
this.

Please consider s390 and (x86/arm) KVM. Once we have a few, more can
follow later, but I think it's important to not only have PPC support for
this.


Actually the s390 preempted check via sigp sense running is available for
all hypervisors (z/VM, LPAR and KVM) which implies everywhere as you can no
longer buy s390 systems without LPAR.

As Heiko already pointed out we could simply use a small inline function
that calls cpu_is_preempted from arch/s390/lib/spinlock (or smp_vcpu_scheduled 
from smp.c)


Sure, and I had vague memories of Heiko's email. This patch set however
completely fails to do that trivial hooking up.



sorry for that.
I will try to work it out on x86.

Hi, Will
I would appreciate it if you or some other arm guys could help with it. :)
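
For readers following the thread, the hook-up being discussed can be modelled
in user space roughly as below. This is only a sketch under stated
assumptions, not the kernel code: the array fake_vcpu_running and the helper
should_keep_spinning() are hypothetical names invented for illustration, and
the "default to false unless the architecture overrides it" fallback is an
assumption about how the generic definition is meant to work.

```c
#include <stdbool.h>

#define NR_FAKE_CPUS 4

/* Hypothetical stand-in for hypervisor state: true while the vCPU
 * is actually running on a physical CPU. */
static bool fake_vcpu_running[NR_FAKE_CPUS] = { true, true, true, true };

/* Hypothetical arch hook: a vCPU that is not running counts as preempted. */
static bool arch_vcpu_is_preempted(int cpu)
{
	return !fake_vcpu_running[cpu];
}
#define vcpu_is_preempted arch_vcpu_is_preempted

/* Assumed generic fallback: architectures without a hook never
 * report preemption. Inactive here because the hook is defined above. */
#ifndef vcpu_is_preempted
#define vcpu_is_preempted(cpu) false
#endif

/* Hypothetical spinner policy: stop spinning on the lock owner when we
 * need to reschedule or when the owner's vCPU has been preempted. */
static bool should_keep_spinning(int owner_cpu, bool need_resched)
{
	if (need_resched)
		return false;
	if (vcpu_is_preempted(owner_cpu))
		return false;
	return true;
}
```

The point of the model is that the spinner only needs one cheap predicate per
architecture; everything else stays generic.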



Re: [PATCH v2 1/1] s390/spinlock: Provide vcpu_is_preempted

2016-09-29 Thread Pan Xinhui



On 2016/9/29 23:51, Christian Borntraeger wrote:

this implements the s390 backend for commit
"kernel/sched: introduce vcpu preempted check interface"
by reworking the existing smp_vcpu_scheduled into
arch_vcpu_is_preempted. We can then also get rid of the
local cpu_is_preempted function by moving the
CIF_ENABLED_WAIT test into arch_vcpu_is_preempted.

Signed-off-by: Christian Borntraeger 
---


hi, Christian
thanks for your patch!


 arch/s390/include/asm/spinlock.h |  3 +++
 arch/s390/kernel/smp.c   |  9 +++--
 arch/s390/lib/spinlock.c | 25 -
 3 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/arch/s390/include/asm/spinlock.h b/arch/s390/include/asm/spinlock.h
index 63ebf37..e16e02f 100644
--- a/arch/s390/include/asm/spinlock.h
+++ b/arch/s390/include/asm/spinlock.h
@@ -21,6 +21,9 @@ _raw_compare_and_swap(unsigned int *lock, unsigned int old, 
unsigned int new)
return __sync_bool_compare_and_swap(lock, old, new);
 }

+bool arch_vcpu_is_preempted(int cpu);
+#define vcpu_is_preempted arch_vcpu_is_preempted
+
 /*
  * Simple spin lock operations.  There are two variants, one clears IRQ's
  * on the local processor, one does not.
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 7b89a75..4aadd16 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -376,10 +376,15 @@ int smp_find_processor_id(u16 address)
return -1;
 }

-int smp_vcpu_scheduled(int cpu)
 
root@ltcalpine2-lp13:~/linux# git grep -wn smp_vcpu_scheduled arch/s390/

arch/s390/include/asm/smp.h:34:extern int smp_vcpu_scheduled(int cpu);
arch/s390/include/asm/smp.h:56:static inline int smp_vcpu_scheduled(int cpu) { 
return 1; }
arch/s390/kernel/smp.c:371:int smp_vcpu_scheduled(int cpu)
arch/s390/lib/spinlock.c:44:if (smp_vcpu_scheduled(cpu))


+bool arch_vcpu_is_preempted(int cpu)
 {
-   return pcpu_running(pcpu_devices + cpu);
+   if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
+   return false;
+   if (pcpu_running(pcpu_devices + cpu))
+   return false;

I saw that smp_vcpu_scheduled() always returns true on a !SMP system.

Maybe we can do something similar, like below:

#ifndef CONFIG_SMP
static inline bool arch_vcpu_is_preempted(int cpu) { return 
!test_cpu_flag_of(CIF_ENABLED_WAIT, cpu); }
#else
...

But I can't help thinking that if this is a !SMP system, maybe we could simply do:
#ifndef CONFIG_SMP
static inline bool arch_vcpu_is_preempted(int cpu) { return false; }
#else
...


thanks
xinhui


+   return true;
 }
+EXPORT_SYMBOL(arch_vcpu_is_preempted);

 void smp_yield_cpu(int cpu)
 {
diff --git a/arch/s390/lib/spinlock.c b/arch/s390/lib/spinlock.c
index e5f50a7..e48a48e 100644
--- a/arch/s390/lib/spinlock.c
+++ b/arch/s390/lib/spinlock.c
@@ -37,15 +37,6 @@ static inline void _raw_compare_and_delay(unsigned int 
*lock, unsigned int old)
asm(".insn rsy,0xeb22,%0,0,%1" : : "d" (old), "Q" (*lock));
 }

-static inline int cpu_is_preempted(int cpu)
-{
-   if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
-   return 0;
-   if (smp_vcpu_scheduled(cpu))
-   return 0;
-   return 1;
-}
-
 void arch_spin_lock_wait(arch_spinlock_t *lp)
 {
unsigned int cpu = SPINLOCK_LOCKVAL;
@@ -62,7 +53,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
continue;
}
/* First iteration: check if the lock owner is running. */
-   if (first_diag && cpu_is_preempted(~owner)) {
+   if (first_diag && arch_vcpu_is_preempted(~owner)) {
smp_yield_cpu(~owner);
first_diag = 0;
continue;
@@ -81,7 +72,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
 * yield the CPU unconditionally. For LPAR rely on the
 * sense running status.
 */
-   if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
+   if (!MACHINE_IS_LPAR || arch_vcpu_is_preempted(~owner)) {
smp_yield_cpu(~owner);
first_diag = 0;
}
@@ -108,7 +99,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, unsigned 
long flags)
continue;
}
/* Check if the lock owner is running. */
-   if (first_diag && cpu_is_preempted(~owner)) {
+   if (first_diag && arch_vcpu_is_preempted(~owner)) {
smp_yield_cpu(~owner);
first_diag = 0;
continue;
@@ -127,7 +118,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, 
unsigned long flags)
 * yield the CPU unconditionally. For LPAR rely on the
 * sense running status.
 */
-   if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
+   if (!MACHINE_IS_LPAR || 

Re: [PATCH] mm: exclude isolated non-lru pages from NR_ISOLATED_ANON or NR_ISOLATED_FILE.

2016-09-29 Thread Ming Ling
On Thu, Sep 29, 2016 at 10:01:43AM +0200, Michal Hocko wrote:
> On Wed 28-09-16 17:31:03, ming.ling wrote:
> > Non-lru pages don't belong to any lru, so accounting them to
> > NR_ISOLATED_ANON or NR_ISOLATED_FILE doesn't make any sense.
> > It may misguide functions such as pgdat_reclaimable_pages and
> > too_many_isolated.
> 
> OK, but it would be better to mention that this is related to pfn based
> migration (e.g. compaction). It is also highly appreciated to describe
> the actual problem that you are seeing and trying to fix. Is a reclaim
> artificially throttled (hung) because of too many non LRU pages being
> isolated? Is there a way to trigger that condition? So please tell us
> more.
>
Yes, you are right, it is only related to pfn based migration such as
compaction and alloc_contig_range. A mobile device such as an Android phone
with 512M of RAM may use a big zram swap. In some cases zram (zsmalloc)
uses a lot of non-lru pages, for example:
MemTotal: 468148 kB
Normal free:5620kB
Free swap:4736kB
Total swap:409596kB
ZRAM: 164616kB(zsmalloc non-lru pages)
active_anon:60700kB
inactive_anon:60744kB
active_file:34420kB
inactive_file:37532kB
The more non-lru pages zram uses for swap, the more they influence
pgdat_reclaimable_pages and too_many_isolated. But it is very hard to
construct a specific case in which reclaim is artificially throttled (hung)
because too many non-LRU pages are isolated. A larger swap disk, higher
swappiness, more backup applications, a lower lowmemory killer minfree
water level, and more high order pageblock allocations may trigger that condition.
> > This patch adds NR_ISOLATED_NONLRU to vmstat and moves isolated non-lru
> > pages from NR_ISOLATED_ANON or NR_ISOLATED_FILE to NR_ISOLATED_NONLRU.
> > And with non-lru pages in vmstat, it helps to optimize algorithm of
> > function too_many_isolated one day.
> 
> I am not entirely sure a new vmstat counter is really needed but I have
> to admit that the role of too_many_isolated in isolate_migratepages_block
> is quite unclear to me. Besides that I believe the patch is not correct
> because you are checking PageLRU after those pages have already been
> isolated from the LRU. For example putback_movable_pages operates on
> pages which are put back to the LRU. I haven't checked others but I
> suspect they would be in a similar situation.
> 
Yes, I made a mistake. After those pages have already been isolated from
the LRU, the code has to check __PageMovable instead of PageLRU, for example
in putback_movable_pages, unmap_and_move and __unmap_and_move.
I will correct it in the next patch. Thank you very much.
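
As a rough illustration of the three-bucket accounting being discussed (and of
keying the decision on a "movable non-lru" property rather than on PageLRU),
here is a small user-space model. It is a sketch only: struct fake_page, its
two flags and count_isolated() are hypothetical stand-ins, not the kernel's
struct page, __PageMovable() or page_is_file_cache().

```c
#include <stdbool.h>
#include <stddef.h>

enum {
	ISOLATED_ANON,
	ISOLATED_FILE,
	ISOLATED_NONLRU,
	NR_ISOLATED_BUCKETS
};

/* Hypothetical model of an isolated page. */
struct fake_page {
	bool movable_nonlru;	/* models a __PageMovable()-style test */
	bool file_backed;	/* models a page_is_file_cache()-style test */
};

/* Classify each isolated page into exactly one of the three buckets. */
static void count_isolated(const struct fake_page *pages, size_t n,
			   unsigned int count[NR_ISOLATED_BUCKETS])
{
	size_t i;

	for (i = 0; i < NR_ISOLATED_BUCKETS; i++)
		count[i] = 0;

	for (i = 0; i < n; i++) {
		if (pages[i].movable_nonlru)
			count[ISOLATED_NONLRU]++;
		else if (pages[i].file_backed)
			count[ISOLATED_FILE]++;
		else
			count[ISOLATED_ANON]++;
	}
}
```

Because the non-lru test comes first, a page is never counted both as non-lru
and as anon/file, which is the property the accounting relies on.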
> > Signed-off-by: ming.ling 
> > ---
> >  include/linux/mmzone.h |  1 +
> >  mm/compaction.c| 12 +---
> >  mm/migrate.c   | 14 ++
> >  3 files changed, 20 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 7f2ae99..dc0adba 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -169,6 +169,7 @@ enum node_stat_item {
> > NR_VMSCAN_IMMEDIATE,/* Prioritise for reclaim when writeback ends */
> > NR_DIRTIED, /* page dirtyings since bootup */
> > NR_WRITTEN, /* page writings since bootup */
> > +   NR_ISOLATED_NONLRU, /* Temporary isolated pages from non-lru */
> > NR_VM_NODE_STAT_ITEMS
> >  };
> >  
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index 9affb29..8da1dca 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -638,16 +638,21 @@ isolate_freepages_range(struct compact_control *cc,
> >  static void acct_isolated(struct zone *zone, struct compact_control *cc)
> >  {
> > struct page *page;
> > -   unsigned int count[2] = { 0, };
> > +   unsigned int count[3] = { 0, };
> >  
> > if (list_empty(&cc->migratepages))
> > return;
> >  
> > -   list_for_each_entry(page, &cc->migratepages, lru)
> > -   count[!!page_is_file_cache(page)]++;
> > +   list_for_each_entry(page, &cc->migratepages, lru) {
> > +   if (PageLRU(page))
> > +   count[!!page_is_file_cache(page)]++;
> > +   else
> > +   count[2]++;
> > +   }
> >  
> > mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON, count[0]);
> > mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, count[1]);
> > +   mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_NONLRU, count[2]);
> >  }
> >  
> >  /* Similar to reclaim, but different enough that they don't share logic */
> > @@ -659,6 +664,7 @@ static bool too_many_isolated(struct zone *zone)
> > node_page_state(zone->zone_pgdat, NR_INACTIVE_ANON);
> > active = node_page_state(zone->zone_pgdat, NR_ACTIVE_FILE) +
> > node_page_state(zone->zone_pgdat, NR_ACTIVE_ANON);
> > +   /* Is it necessary to add NR_ISOLATED_NONLRU?? */
> > isolated = node_page_state(zone->zone_pgdat, 

Re: [PATCH v4 00/12] re-enable DAX PMD support

2016-09-29 Thread Darrick J. Wong
On Thu, Sep 29, 2016 at 09:03:43PM -0600, Ross Zwisler wrote:
> On Fri, Sep 30, 2016 at 09:43:45AM +1000, Dave Chinner wrote:
> > Finally: none of the patches in your tree have reviewed-by tags.
> > That says to me that none of this code has been reviewed yet.
> > Reviewed-by tags are non-negotiable requirement for anything going
> > through my trees. I don't have time right now to review this code,
> > so you're going to need to chase up other reviewers before merging.
> > 
> > And, really, this is getting very late in the cycle to be merging
> > new code - we're less than one working day away from the merge
> > window opening and we've missed the last linux-next build. I'd
> > suggest that we might be best served by slipping the PMD
> > support code to the next cycle when there's no time pressure for
> > review and we can get a decent linux-next soak on the code.
> 
> I absolutely support your policy of only sending code to Linux that has passed
> peer review.
> 
> However, I do feel compelled to point out that this is not new code.  I didn't
> just spring it on everyone in the hours before the v4.8 merge window.  I
> posted the first version of this patch set on August 15th, *seven weeks ago*:
> 
> https://lkml.org/lkml/2016/8/15/613
> 
> This was the day after v4.7-rc2 was released.
> 
> Since then I have responded promptly to the little review feedback
> that I've received.  I've also reviewed and tested other DAX changes,
> like the struct iomap changes from Christoph.  Those changes were
> first posted to the mailing list on September 9th, four weeks after
> mine.  Nevertheless, I was happy to rebase my changes on top of his,
> which meant a full rewrite of the DAX PMD fault handler so it would be
> based on struct iomap.  His changes are going to be merged for v4.9,
> and mine are not.

I'm not knocking the iomap migration, but it did cause a fair amount of
churn in the XFS reflink patchset -- and that's for a filesystem that
already /had/ iomap implemented.  It'd be neat to have all(?) the DAX
filesystems (ext[24], XFS) move over to iomap so that you wouldn't have
to support multiple ways of talking to FSes.  AFAICT ext4 hasn't gotten
iomap, which complicates things.  But that's my opinion, maybe you're
fine with supporting iomap and not-iomap.

The thing that (personally) makes it harder to review these
multi-subsystem patches is that I'm not a domain expert in some of those
subsystems -- memory in this case.  I get to the point where I'm
thinking "Uh, this looks ok, and it seems to work on my test VM, but is
that enough to stick my neck out and Reviewed-by?" and then get stuck.
It's hard to get unstuck with a complex piece of machinery.

> Please, help me understand what I can do to get my code reviewed.  Do
> I need to more aggressively ping my patch series, asking people by
> name for reviews?  Do we need to rework our code flow to Linus so that
> the DAX changes go through a filesystem tree like XFS or ext4, and ask
> the developers of that filesystem to help with reviews?  Something
> else?

FWIW, I /think/ it looks fine, though I'm afraid enough of the memory
manager that I haven't said anything yet.  I'll look it over more
tomorrow when my brain is fresher.  If reflink for XFS lands in 4.9 I'll
start looking again at pagecache sharing and/or dax+reflink.

> I'm honestly very frustrated by this because I've done my best to be
> open to constructive criticism and I've tried to respond promptly to
> the feedback that I've received.  In the end, though, a system where
> it's a requirement that all upstreamed code be peer reviewed but in
> which I can't get any feedback is essentially a system where I'm not
> allowed to contribute.

I have the same frustrations with getting non-XFS/non-ext4 patches
reviewed and upstreamed by whomever the maintainer is.  I wish we had a
broader range of people who knew both FS and MM, but wow is that a long
onboarding process. :/

--D

> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


linux-next: build failure after merge of the tty tree

2016-09-29 Thread Stephen Rothwell
Hi Greg,

After merging the tty tree, today's linux-next build (arm
multi_v7_defconfig) failed like this:

drivers/tty/serial/amba-pl011.c: In function 'pl011_console_match':
drivers/tty/serial/amba-pl011.c:2346:44: error: passing argument 3 of 
'uart_parse_earlycon' from incompatible pointer type 
[-Werror=incompatible-pointer-types]
  if (uart_parse_earlycon(options, , , ))
^
In file included from drivers/tty/serial/amba-pl011.c:45:0:
include/linux/serial_core.h:384:5: note: expected 'resource_size_t * {aka 
unsigned int *}' but argument is of type 'long unsigned int *'
 int uart_parse_earlycon(char *p, unsigned char *iotype, resource_size_t *addr,
 ^

Caused by commit

  8b8f347d3a48 ("serial: pl011: add console matching function")

interacting with commit

  46e36683f433 ("serial: earlycon: Extend earlycon command line option to 
support 64-bit addresses")

I have reverted commit 8b8f347d3a48 for today.

-- 
Cheers,
Stephen Rothwell


[PATCH 2/2] perf tools: Support insn and insnlen in perf script

2016-09-29 Thread Andi Kleen
From: Andi Kleen 

When looking at Intel PT traces with perf script it is useful to have
some indication of the instruction. Dump the instruction bytes and
instruction length, which can be used for simple pattern analysis in
scripts.

% perf record -e intel_pt// foo
% perf script --itrace=i0ns -F ip,insn,insnlen
 8101232f ilen: 5 insn: 0f 1f 44 00 00
 81012334 ilen: 1 insn: 5b
 81012335 ilen: 1 insn: 5d
 81012336 ilen: 1 insn: c3
 810123e3 ilen: 1 insn: 5b
 810123e4 ilen: 2 insn: 41 5c
 810123e6 ilen: 1 insn: 5d
 810123e7 ilen: 1 insn: c3
 810124a6 ilen: 2 insn: 31 c0
 810124a8 ilen: 9 insn: 41 83 bc 24 a8 01 00 00 01
 810124b1 ilen: 2 insn: 75 87
...

Signed-off-by: Andi Kleen 
---
 tools/perf/Documentation/perf-script.txt |  6 +-
 tools/perf/builtin-script.c  | 25 +++--
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt 
b/tools/perf/Documentation/perf-script.txt
index 053bbbd84ece..c01904f388ce 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@ OPTIONS
 Comma separated list of fields to print. Options are:
 comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
 srcline, period, iregs, brstack, brstacksym, flags, bpf-output,
-callindent. Field list can be prepended with the type, trace, sw or hw,
+callindent, insn, insnlen. Field list can be prepended with the type, 
trace, sw or hw,
 to indicate to which event type the field list applies.
 e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
 
@@ -181,6 +181,10 @@ OPTIONS
Instruction Trace decoding. For calls and returns, it will display the
name of the symbol indented with spaces to reflect the stack depth.
 
+   When doing instruction trace decoding insn and insnlen give the
+   instruction bytes and the instruction length of the current
+   instruction.
+
Finally, a user may not set fields to none for all event types.
i.e., -F "" is not allowed.
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7228d141a789..11cf75d5dbda 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -66,6 +66,8 @@ enum perf_output_field {
PERF_OUTPUT_WEIGHT  = 1U << 18,
PERF_OUTPUT_BPF_OUTPUT  = 1U << 19,
PERF_OUTPUT_CALLINDENT  = 1U << 20,
+   PERF_OUTPUT_INSN= 1U << 21,
+   PERF_OUTPUT_INSNLEN = 1U << 22,
 };
 
 struct output_option {
@@ -93,6 +95,8 @@ struct output_option {
{.str = "weight",   .field = PERF_OUTPUT_WEIGHT},
{.str = "bpf-output",   .field = PERF_OUTPUT_BPF_OUTPUT},
{.str = "callindent", .field = PERF_OUTPUT_CALLINDENT},
+   {.str = "insn", .field = PERF_OUTPUT_INSN},
+   {.str = "insnlen", .field = PERF_OUTPUT_INSNLEN},
 };
 
 /* default set to maintain compatibility with current format */
@@ -624,6 +628,21 @@ static void print_sample_callindent(struct perf_sample 
*sample,
printf("%*s", spacing - len, "");
 }
 
+
+static void print_insn(struct perf_sample *sample,
+  struct perf_event_attr *attr)
+{
+   if (PRINT_FIELD(INSNLEN))
+   printf(" ilen: %d", sample->insn_len);
+   if (PRINT_FIELD(INSN)) {
+   int i;
+
+   printf(" insn:");
+   for (i = 0; i < sample->insn_len; i++)
+   printf(" %02x", (unsigned char)sample->insn[i]);
+   }
+}
+
 static void print_sample_bts(struct perf_sample *sample,
 struct perf_evsel *evsel,
 struct thread *thread,
@@ -668,6 +687,8 @@ static void print_sample_bts(struct perf_sample *sample,
if (print_srcline_last)
map__fprintf_srcline(al->map, al->addr, "\n  ", stdout);
 
+   print_insn(sample, attr);
+
printf("\n");
 }
 
@@ -911,7 +932,7 @@ static void process_event(struct perf_script *script,
 
if (perf_evsel__is_bpf_output(evsel) && PRINT_FIELD(BPF_OUTPUT))
print_sample_bpf_output(sample);
-
+   print_insn(sample, attr);
printf("\n");
 }
 
@@ -2124,7 +2145,7 @@ int cmd_script(int argc, const char **argv, const char 
*prefix __maybe_unused)
 "Valid types: hw,sw,trace,raw. "
 "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 "addr,symoff,period,iregs,brstack,brstacksym,flags,"
-"bpf-output,callindent", parse_output_fields),
+"bpf-output,callindent,insn,insnlen", parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", _wide,
"system-wide collection from all CPUs"),
OPT_STRING('S', "symbols", 
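
The patch description mentions using the new fields for simple pattern
analysis in scripts. As a hedged sketch of what consuming that output might
look like, the helper below parses one line of the "ip ilen: N insn: xx ..."
form shown in the example output above. The struct and function names are
hypothetical, and the parser assumes both insn and insnlen were requested with
ip as the only other field; other -F combinations would need a different
format string.

```c
#include <stdio.h>

#define MAX_INSN 16

/* Hypothetical container for one parsed output line. */
struct insn_line {
	unsigned long long ip;
	int ilen;
	unsigned char bytes[MAX_INSN];
};

/* Parse "<hex ip> ilen: <n> insn: <n hex bytes>".
 * Returns 0 on success, -1 if the line does not match. */
static int parse_insn_line(const char *line, struct insn_line *out)
{
	int consumed = 0;
	const char *p;
	int i;

	if (sscanf(line, " %llx ilen: %d insn:%n",
		   &out->ip, &out->ilen, &consumed) != 2)
		return -1;
	/* consumed stays 0 if the literal "insn:" was not matched. */
	if (consumed == 0 || out->ilen < 0 || out->ilen > MAX_INSN)
		return -1;

	p = line + consumed;
	for (i = 0; i < out->ilen; i++) {
		unsigned int byte;
		int n = 0;

		if (sscanf(p, " %2x%n", &byte, &n) != 1)
			return -1;
		out->bytes[i] = (unsigned char)byte;
		p += n;
	}
	return 0;
}
```

A caller could then, for example, count how often a given opcode byte (such as
c3) appears as the first byte of a sampled instruction.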

[PATCH 1/2] perf intel-pt-decoder: Report instruction bytes and length in sample

2016-09-29 Thread Andi Kleen
From: Andi Kleen 

Change the Intel PT decoder to pass up the length and the instruction
bytes of the decoded or sampled instruction in the perf sample.

The decoder already knows this information, we just need to pass it
up. Since it is only a couple of movs it is not very expensive.

Used in the next patch.

Signed-off-by: Andi Kleen 
---
 tools/perf/util/event.h  | 3 +++
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c  | 2 ++
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h  | 3 +++
 tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h | 2 +-
 tools/perf/util/intel-pt.c   | 5 +
 5 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 8d363d5e65a2..c735c53a26f8 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -177,6 +177,8 @@ enum {
PERF_IP_FLAG_TRACE_BEGIN|\
PERF_IP_FLAG_TRACE_END)
 
+#define MAX_INSN 16
+
 struct perf_sample {
u64 ip;
u32 pid, tid;
@@ -193,6 +195,7 @@ struct perf_sample {
u32 flags;
u16 insn_len;
u8  cpumode;
+   char insn[MAX_INSN];
void *raw_data;
struct ip_callchain *callchain;
struct branch_stack *branch_stack;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c 
b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 8ff6c6a61291..8a5e21abb790 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -949,6 +949,8 @@ out:
 out_no_progress:
decoder->state.insn_op = intel_pt_insn->op;
decoder->state.insn_len = intel_pt_insn->length;
+   memcpy(decoder->state.insn, intel_pt_insn->buf,
+  sizeof(decoder->state.insn));
 
if (decoder->tx_flags & INTEL_PT_IN_TX)
decoder->state.flags |= INTEL_PT_IN_TX;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h 
b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 02c38fec1c37..fbd7d08d97d5 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -20,6 +20,8 @@
 #include 
 #include 
 
+#define MAX_INSN   16
+
 #include "intel-pt-insn-decoder.h"
 
 #define INTEL_PT_IN_TX (1 << 0)
@@ -66,6 +68,7 @@ struct intel_pt_state {
uint32_t flags;
enum intel_pt_insn_op insn_op;
int insn_len;
+   char insn[MAX_INSN];
 };
 
 struct intel_pt_insn;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h 
b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
index b0adbf37323e..47e196dec224 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
@@ -20,7 +20,7 @@
 #include 
 
 #define INTEL_PT_INSN_DESC_MAX 32
-#define INTEL_PT_INSN_DBG_BUF_SZ   16
+#define INTEL_PT_INSN_DBG_BUF_SZ   16 /* Must be >= MAX_INSN */
 
 enum intel_pt_insn_op {
INTEL_PT_OP_OTHER,
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index b9cc353cace2..363eba09c609 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -140,6 +140,7 @@ struct intel_pt_queue {
u32 flags;
u16 insn_len;
u64 last_insn_cnt;
+   char insn[MAX_INSN];
 };
 
 static void intel_pt_dump(struct intel_pt *pt __maybe_unused,
@@ -817,6 +818,7 @@ static void intel_pt_sample_flags(struct intel_pt_queue 
*ptq)
if (ptq->state->flags & INTEL_PT_IN_TX)
ptq->flags |= PERF_IP_FLAG_IN_TX;
ptq->insn_len = ptq->state->insn_len;
+   memcpy(ptq->insn, ptq->state->insn, MAX_INSN);
}
 }
 
@@ -997,6 +999,7 @@ static int intel_pt_synth_branch_sample(struct 
intel_pt_queue *ptq)
sample.cpu = ptq->cpu;
sample.flags = ptq->flags;
sample.insn_len = ptq->insn_len;
+   memcpy(sample.insn, ptq->insn, MAX_INSN);
 
/*
 * perf report cannot handle events without a branch stack when using
@@ -1058,6 +1061,7 @@ static int intel_pt_synth_instruction_sample(struct 
intel_pt_queue *ptq)
sample.cpu = ptq->cpu;
sample.flags = ptq->flags;
sample.insn_len = ptq->insn_len;
+   memcpy(sample.insn, ptq->insn, MAX_INSN);
 
ptq->last_insn_cnt = ptq->state->tot_insn_cnt;
 
@@ -1120,6 +1124,7 @@ static int intel_pt_synth_transaction_sample(struct 
intel_pt_queue *ptq)
sample.cpu = ptq->cpu;
sample.flags = ptq->flags;
sample.insn_len = ptq->insn_len;
+   memcpy(sample.insn, ptq->insn, MAX_INSN);
 
if (pt->synth_opts.callchain) {
thread_stack__sample(ptq->thread, ptq->chain,
-- 
2.5.5
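
The patch relies on a comment ("Must be >= MAX_INSN") to keep the two buffer
sizes compatible, since the memcpy() copies sizeof(state.insn) bytes out of
the decoder's buffer. One possible way to have the compiler enforce that
relationship is sketched below. This is not part of the patch: the struct
names are hypothetical simplified stand-ins for the real ones, and
_Static_assert assumes a C11 compiler.

```c
#include <string.h>

#define MAX_INSN			16
#define INTEL_PT_INSN_DBG_BUF_SZ	16 /* Must be >= MAX_INSN */

/* Compile-time check of the invariant stated in the comment above. */
_Static_assert(INTEL_PT_INSN_DBG_BUF_SZ >= MAX_INSN,
	       "decoder buffer must hold at least MAX_INSN bytes");

/* Hypothetical, simplified stand-in for the decoded instruction. */
struct fake_insn {
	int length;
	unsigned char buf[INTEL_PT_INSN_DBG_BUF_SZ];
};

/* Hypothetical, simplified stand-in for the decoder state. */
struct fake_state {
	int insn_len;
	char insn[MAX_INSN];
};

/* Copy the length and the raw bytes, mirroring the memcpy in the patch.
 * Safe only because the source buffer is at least MAX_INSN bytes. */
static void copy_insn(struct fake_state *state, const struct fake_insn *insn)
{
	state->insn_len = insn->length;
	memcpy(state->insn, insn->buf, sizeof(state->insn));
}
```

If someone later shrank the source buffer below MAX_INSN, the build would
fail instead of the memcpy() silently reading past the end.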



linux-next: manual merge of the tty tree with the pm tree

2016-09-29 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the tty tree got a conflict in:

  include/linux/acpi.h

between commit:

  058dfc767008 ("ACPI / watchdog: Add support for WDAT hardware watchdog")

from the pm tree and commit:

  ad1696f6f09d ("ACPI: parse SPCR and enable matching console")

from the tty tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc include/linux/acpi.h
index 19e650c940b6,2353827731d2..
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@@ -1093,10 -1074,10 +1093,16 @@@ void acpi_table_upgrade(void)
  static inline void acpi_table_upgrade(void) { }
  #endif
  
 +#if defined(CONFIG_ACPI) && defined(CONFIG_ACPI_WATCHDOG)
 +extern bool acpi_has_watchdog(void);
 +#else
 +static inline bool acpi_has_watchdog(void) { return false; }
 +#endif
 +
+ #ifdef CONFIG_ACPI_SPCR_TABLE
+ int parse_spcr(bool earlycon);
+ #else
+ static inline int parse_spcr(bool earlycon) { return 0; }
+ #endif
+ 
  #endif/*_LINUX_ACPI_H*/


linux-next: manual merge of the tty tree with the arm64 tree

2016-09-29 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the tty tree got conflicts in:

  arch/arm64/Kconfig

between commit:

  1d8f51d41fc7 ("arm/arm64: arch_timer: Use archdata to indicate vdso 
suitability")

from the arm64 tree and commit:

  888125a71298 ("ARM64: ACPI: enable ACPI_SPCR_TABLE")

from the tty tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/arm64/Kconfig
index 17c14a1d9112,11a2d36b27ef..
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@@ -4,7 -4,7 +4,8 @@@ config ARM6
select ACPI_GENERIC_GSI if ACPI
select ACPI_REDUCED_HARDWARE_ONLY if ACPI
select ACPI_MCFG if ACPI
+   select ACPI_SPCR_TABLE if ACPI
 +  select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE


Re: [PATCH 0/4] Add DMA support for ti_am335x_adc driver

2016-09-29 Thread John Syne
I applied your patches to the following kernel:

github.com/RobertCNelson/bb-kernel.git 
branch am33x-v4.8

Changing 
#define REG_DMAENABLE_CLEAR 0x038
to
#define REG_DMAENABLE_CLEAR 0x03C

Also applying the following DTS changes:

target = <>;
__overlay__ {

status = "okay";
adc {
ti,adc-channels = <0 1 2 3 4 5 6>;
ti,chan-step-avg = <0x1 0x1 0x1 0x1 0x1 0x1 
0x1>;
ti,chan-step-opendelay = <0x0 0x0 0x0 0x0 0x0 
0x0 0x0>;
ti,chan-step-sampledelay = <0x0 0x0 0x0 0x0 0x0 
0x0 0x0>;
};
};

Running on a BeagleBoneBlack, I was able to stream samples at 200ksps 
continuously and it looked stable. 

htop showed CPU utilization between 5% and 7%

Thank you for getting this to work.

Regards,
John




> On Sep 29, 2016, at 6:01 AM, Mugunthan V N  wrote:
> 
> On Sunday 25 September 2016 03:11 PM, Jonathan Cameron wrote:
>> On 21/09/16 17:11, Mugunthan V N wrote:
>>> The ADC has a 64 word depth fifo length which holds the ADC data
>>> till the CPU reads. So when a user program needs a large ADC data
>>> to operate on, then it has to do multiple reads to get its
>>> buffer. Currently if the application asks for 4 samples per
>>> channel with all 8 channels enabled, kernel can provide only
>>> 3 samples per channel when all 8 channels are enabled (logs at
>>> [1]). So with DMA support user can request for large number of
>>> samples at a time (logs at [2]).
>>>
>>> Tested the patch on AM437x-gp-evm and AM335x Boneblack with the
>>> patch [3] to enable ADC and pushed a branch for testing [4]
>>>
>>> [1] - http://pastebin.ubuntu.com/23211490/
>>> [2] - http://pastebin.ubuntu.com/23211492/
>>> [3] - http://pastebin.ubuntu.com/23211494/
>>> [4] - git://git.ti.com/~mugunthanvnm/ti-linux-kernel/linux.git iio-dma
>> Just curious.  How fast is the ADC sampling at in these?  Never that
>> obvious for this driver!
>> 
>> I'm also curious as to whether you started to hit the limits of the
>> kfifo based interface.  Might be worth considering adding alternative
>> support for the dma buffers interface which is obviously much lower
>> overhead.
>> 
>> Good to have this work prior to that as the kfifo stuff is somewhat
>> easier to use.
> 
> Currently ADC clock is 3MHz, which can produce a data rate of 225KBps
> per channel with no open delay and no averaging of samples. So when all
> 8 channels are enabled the data rate will be 1.75MBps
> 
> ADC can be operated at 24MHz, which can generate a data rate of 28MBps
> with all 8 channels enabled and no open delay and averaging, but our
> target is to get 800K samples per second per channel which has a data
> rate of 12.5MBps
> 
> I think with this data rate, DMA will be the best option to implement
> without any data loss and less cpu overload to read the ADC samples.
> 
> Regards
> Mugunthan V N
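
The data-rate figures quoted above can be sanity-checked with a small
calculation. The sketch below assumes each sample occupies 2 bytes, which is
an assumption not stated in the thread, and the function name is hypothetical.

```c
#include <stdint.h>

/* Aggregate byte rate: samples per second per channel, times the number
 * of channels, times the storage size of one sample. */
static uint64_t adc_bytes_per_second(uint64_t samples_per_sec_per_chan,
				     unsigned int channels,
				     unsigned int bytes_per_sample)
{
	return samples_per_sec_per_chan * channels * bytes_per_sample;
}
```

Under that assumption, 800K samples per second per channel across 8 channels
gives 800000 * 8 * 2 = 12800000 bytes per second, which is in the same
ballpark as the 12.5MBps target mentioned above (the exact figure depends on
the unit convention used for "MB").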



linux-next: manual merge of the tty tree with the v4l-dvb tree

2016-09-29 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the tty tree got a conflict in:

  MAINTAINERS

between commit:

  71fb2c74287d ("[media] MAINTAINERS: atmel-isc: add entry for Atmel ISC")

from the v4l-dvb tree and commit:

  5615c3715749 ("MAINTAINERS: update entry for atmel_serial driver")

from the tty tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc MAINTAINERS
index 8c86c07409c8,4e2ae17c478e..
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@@ -7848,14 -7745,12 +7843,20 @@@ T:   git git://git.monstr.eu/linux-2.6-mi
  S:Supported
  F:arch/microblaze/
  
+ MICROCHIP / ATMEL AT91 / AT32 SERIAL DRIVER
+ M:Richard Genoud 
+ S:Maintained
+ F:drivers/tty/serial/atmel_serial.c
+ F:include/linux/atmel_serial.h
+ 
 +MICROCHIP / ATMEL ISC DRIVER
 +M:Songjun Wu 
 +L:linux-me...@vger.kernel.org
 +S:Supported
 +F:drivers/media/platform/atmel/atmel-isc.c
 +F:drivers/media/platform/atmel/atmel-isc-regs.h
 +F:devicetree/bindings/media/atmel-isc.txt
 +
  MICROSOFT SURFACE PRO 3 BUTTON DRIVER
  M:Chen Yu 
  L:platform-driver-...@vger.kernel.org


[PATCH v6 7/7] net: Suppress the "Comparison to NULL could be written" warnings

2016-09-29 Thread Jia He
This is to suppress the checkpatch.pl warning "Comparison to NULL
could be written". No functional changes here.

Signed-off-by: Jia He 
---
 net/ipv4/proc.c | 32 
 net/sctp/proc.c |  2 +-
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 9d7a39a..b39faf6 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -357,22 +357,22 @@ static void icmp_put(struct seq_file *seq)
atomic_long_t *ptr = net->mib.icmpmsg_statistics->mibs;
 
seq_puts(seq, "\nIcmp: InMsgs InErrors InCsumErrors");
-   for (i = 0; icmpmibmap[i].name != NULL; i++)
+   for (i = 0; icmpmibmap[i].name; i++)
seq_printf(seq, " In%s", icmpmibmap[i].name);
seq_puts(seq, " OutMsgs OutErrors");
-   for (i = 0; icmpmibmap[i].name != NULL; i++)
+   for (i = 0; icmpmibmap[i].name; i++)
seq_printf(seq, " Out%s", icmpmibmap[i].name);
seq_printf(seq, "\nIcmp: %lu %lu %lu",
snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_INMSGS),
snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_INERRORS),
snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_CSUMERRORS));
-   for (i = 0; icmpmibmap[i].name != NULL; i++)
+   for (i = 0; icmpmibmap[i].name; i++)
seq_printf(seq, " %lu",
   atomic_long_read(ptr + icmpmibmap[i].index));
seq_printf(seq, " %lu %lu",
snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_OUTMSGS),
snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_OUTERRORS));
-   for (i = 0; icmpmibmap[i].name != NULL; i++)
+   for (i = 0; icmpmibmap[i].name; i++)
seq_printf(seq, " %lu",
   atomic_long_read(ptr + (icmpmibmap[i].index | 
0x100)));
 }
@@ -389,7 +389,7 @@ static int snmp_seq_show_ipstats(struct seq_file *seq, void 
*v)
memset(buff64, 0, IPSTATS_MIB_MAX * sizeof(u64));
 
seq_puts(seq, "Ip: Forwarding DefaultTTL");
-   for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
+   for (i = 0; snmp4_ipstats_list[i].name; i++)
seq_printf(seq, " %s", snmp4_ipstats_list[i].name);
 
seq_printf(seq, "\nIp: %d %d",
@@ -400,7 +400,7 @@ static int snmp_seq_show_ipstats(struct seq_file *seq, void 
*v)
snmp_get_cpu_field64_batch(buff64, snmp4_ipstats_list,
   net->mib.ip_statistics,
   offsetof(struct ipstats_mib, syncp));
-   for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
+   for (i = 0; snmp4_ipstats_list[i].name; i++)
seq_printf(seq, " %llu", buff64[i]);
 
return 0;
@@ -415,13 +415,13 @@ static int snmp_seq_show_tcp_udp(struct seq_file *seq, 
void *v)
memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
 
seq_puts(seq, "\nTcp:");
-   for (i = 0; snmp4_tcp_list[i].name != NULL; i++)
+   for (i = 0; snmp4_tcp_list[i].name; i++)
seq_printf(seq, " %s", snmp4_tcp_list[i].name);
 
seq_puts(seq, "\nTcp:");
snmp_get_cpu_field_batch(buff, snmp4_tcp_list,
 net->mib.tcp_statistics);
-   for (i = 0; snmp4_tcp_list[i].name != NULL; i++) {
+   for (i = 0; snmp4_tcp_list[i].name; i++) {
/* MaxConn field is signed, RFC 2012 */
if (snmp4_tcp_list[i].entry == TCP_MIB_MAXCONN)
seq_printf(seq, " %ld", buff[i]);
@@ -434,10 +434,10 @@ static int snmp_seq_show_tcp_udp(struct seq_file *seq, 
void *v)
snmp_get_cpu_field_batch(buff, snmp4_udp_list,
 net->mib.udp_statistics);
seq_puts(seq, "\nUdp:");
-   for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+   for (i = 0; snmp4_udp_list[i].name; i++)
seq_printf(seq, " %s", snmp4_udp_list[i].name);
seq_puts(seq, "\nUdp:");
-   for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+   for (i = 0; snmp4_udp_list[i].name; i++)
seq_printf(seq, " %lu", buff[i]);
 
memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
@@ -446,10 +446,10 @@ static int snmp_seq_show_tcp_udp(struct seq_file *seq, 
void *v)
seq_puts(seq, "\nUdpLite:");
snmp_get_cpu_field_batch(buff, snmp4_udp_list,
 net->mib.udplite_statistics);
-   for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+   for (i = 0; snmp4_udp_list[i].name; i++)
seq_printf(seq, " %s", snmp4_udp_list[i].name);
seq_puts(seq, "\nUdpLite:");
-   for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+   for (i = 0; snmp4_udp_list[i].name; i++)
seq_printf(seq, " %lu", buff[i]);
 
seq_putc(seq, '\n');
@@ -492,21 +492,21 @@ static int netstat_seq_show(struct seq_file *seq, void *v)
struct net *net = seq->private;
 

[PATCH v6 6/7] ipv6: Remove useless parameter in __snmp6_fill_statsdev

2016-09-29 Thread Jia He
The parameter items (which is always ICMP6_MIB_MAX) is useless for
__snmp6_fill_statsdev, so remove it.

Signed-off-by: Jia He 
---
 net/ipv6/addrconf.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 2f1f5d4..35d4baa 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4961,18 +4961,18 @@ static inline size_t inet6_if_nlmsg_size(void)
 }
 
 static inline void __snmp6_fill_statsdev(u64 *stats, atomic_long_t *mib,
- int items, int bytes)
+   int bytes)
 {
int i;
-   int pad = bytes - sizeof(u64) * items;
+   int pad = bytes - sizeof(u64) * ICMP6_MIB_MAX;
BUG_ON(pad < 0);
 
/* Use put_unaligned() because stats may not be aligned for u64. */
-   put_unaligned(items, &stats[0]);
-   for (i = 1; i < items; i++)
+   put_unaligned(ICMP6_MIB_MAX, &stats[0]);
+   for (i = 1; i < ICMP6_MIB_MAX; i++)
put_unaligned(atomic_long_read(&mib[i]), &stats[i]);
 
-   memset(&stats[items], 0, pad);
+   memset(&stats[ICMP6_MIB_MAX], 0, pad);
 }
 
 static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib,
@@ -5005,7 +5005,7 @@ static void snmp6_fill_stats(u64 *stats, struct inet6_dev 
*idev, int attrtype,
 offsetof(struct ipstats_mib, syncp));
break;
case IFLA_INET6_ICMP6STATS:
-   __snmp6_fill_statsdev(stats, idev->stats.icmpv6dev->mibs, 
ICMP6_MIB_MAX, bytes);
+   __snmp6_fill_statsdev(stats, idev->stats.icmpv6dev->mibs, 
bytes);
break;
}
 }
-- 
2.5.5
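
(Not part of the patch: a rough userspace sketch of the fill pattern in
__snmp6_fill_statsdev, assuming memcpy() as a portable stand-in for the
kernel's put_unaligned(). The names store_u64_unaligned,
load_u64_unaligned and fill_stats_demo are made up for illustration.)

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 64-bit value at an address that may not be 8-byte aligned. */
static void store_u64_unaligned(void *dst, uint64_t val)
{
	memcpy(dst, &val, sizeof(val));
}

/* Load a 64-bit value from an address that may not be 8-byte aligned. */
static uint64_t load_u64_unaligned(const void *src)
{
	uint64_t val;

	memcpy(&val, src, sizeof(val));
	return val;
}

/*
 * Same layout as the kernel helper: slot 0 holds the item count,
 * slots 1..items-1 hold the counters, and the rest of `bytes` is zeroed.
 */
static void fill_stats_demo(unsigned char *stats, const long *mib,
			    int items, int bytes)
{
	int pad = bytes - (int)sizeof(uint64_t) * items;
	int i;

	assert(pad >= 0);

	store_u64_unaligned(stats, (uint64_t)items);
	for (i = 1; i < items; i++)
		store_u64_unaligned(stats + i * sizeof(uint64_t),
				    (uint64_t)mib[i]);

	memset(stats + items * sizeof(uint64_t), 0, (size_t)pad);
}
```

Calling fill_stats_demo() on a buffer that starts at an odd address
exercises the unaligned path; mib[0] is never copied because slot 0 is
taken by the item count.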



[PATCH v6 4/7] proc: Reduce cache miss in sctp_snmp_seq_show

2016-09-29 Thread Jia He
This is to use the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.

Signed-off-by: Jia He 
---
 net/sctp/proc.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/sctp/proc.c b/net/sctp/proc.c
index ef8ba77..09e16c2 100644
--- a/net/sctp/proc.c
+++ b/net/sctp/proc.c
@@ -73,13 +73,17 @@ static const struct snmp_mib sctp_snmp_list[] = {
 /* Display sctp snmp mib statistics(/proc/net/sctp/snmp). */
 static int sctp_snmp_seq_show(struct seq_file *seq, void *v)
 {
+   unsigned long buff[SCTP_MIB_MAX];
struct net *net = seq->private;
int i;
 
+   memset(buff, 0, sizeof(unsigned long) * SCTP_MIB_MAX);
+
+   snmp_get_cpu_field_batch(buff, sctp_snmp_list,
+net->sctp.sctp_statistics);
for (i = 0; sctp_snmp_list[i].name != NULL; i++)
seq_printf(seq, "%-32s\t%ld\n", sctp_snmp_list[i].name,
-  snmp_fold_field(net->sctp.sctp_statistics,
- sctp_snmp_list[i].entry));
+   buff[i]);
 
return 0;
 }
-- 
2.5.5



[PATCH v6 5/7] proc: Reduce cache miss in xfrm_statistics_seq_show

2016-09-29 Thread Jia He
This is to use the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.

Signed-off-by: Jia He 
---
 net/xfrm/xfrm_proc.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_proc.c b/net/xfrm/xfrm_proc.c
index 9c4fbd8..ba2b539 100644
--- a/net/xfrm/xfrm_proc.c
+++ b/net/xfrm/xfrm_proc.c
@@ -50,12 +50,18 @@ static const struct snmp_mib xfrm_mib_list[] = {
 
 static int xfrm_statistics_seq_show(struct seq_file *seq, void *v)
 {
+   unsigned long buff[LINUX_MIB_XFRMMAX];
struct net *net = seq->private;
int i;
+
+   memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_XFRMMAX);
+
+   snmp_get_cpu_field_batch(buff, xfrm_mib_list,
+net->mib.xfrm_statistics);
for (i = 0; xfrm_mib_list[i].name; i++)
seq_printf(seq, "%-24s\t%lu\n", xfrm_mib_list[i].name,
-  snmp_fold_field(net->mib.xfrm_statistics,
-  xfrm_mib_list[i].entry));
+   buff[i]);
+
return 0;
 }
 
-- 
2.5.5



[PATCH v6 1/7] net:snmp: Introduce generic interfaces for snmp_get_cpu_field{,64}

2016-09-29 Thread Jia He
This introduces generic batch interfaces built on snmp_get_cpu_field{,64}.
They exchange the two for-loops used to collect the percpu statistics data,
so that the data is aggregated by going through all the items of each cpu
sequentially.

Signed-off-by: Jia He 
Suggested-by: Marcelo Ricardo Leitner 
---
 include/net/ip.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/include/net/ip.h b/include/net/ip.h
index 9742b92..bc43c0f 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -219,6 +219,29 @@ static inline u64 snmp_fold_field64(void __percpu *mib, 
int offt, size_t syncp_o
 }
 #endif
 
+#define snmp_get_cpu_field64_batch(buff64, stats_list, mib_statistic, offset) \
+{ \
+   int i, c; \
+   for_each_possible_cpu(c) { \
+   for (i = 0; stats_list[i].name; i++) \
+   buff64[i] += snmp_get_cpu_field64( \
+   mib_statistic, \
+   c, stats_list[i].entry, \
+   offset); \
+   } \
+}
+
+#define snmp_get_cpu_field_batch(buff, stats_list, mib_statistic) \
+{ \
+   int i, c; \
+   for_each_possible_cpu(c) { \
+   for (i = 0; stats_list[i].name; i++) \
+   buff[i] += snmp_get_cpu_field( \
+   mib_statistic, \
+   c, stats_list[i].entry); \
+   } \
+}
+
 void inet_get_local_port_range(struct net *net, int *low, int *high);
 
 #ifdef CONFIG_SYSCTL
-- 
2.5.5
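
(Not part of the patch: a rough userspace sketch of the loop exchange
described above. NR_CPUS_DEMO, NR_ITEMS_DEMO and the plain 2-D array are
assumptions standing in for the real percpu MIB storage; the two
functions only show the traversal order, not the kernel macros.)

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS_DEMO  4	/* hypothetical CPU count */
#define NR_ITEMS_DEMO 8	/* hypothetical number of MIB items */

/* Old order: for each item, visit every CPU (hops between CPU rows). */
static void fold_item_major(unsigned long percpu[NR_CPUS_DEMO][NR_ITEMS_DEMO],
			    unsigned long out[NR_ITEMS_DEMO])
{
	size_t i, c;

	assert(out != NULL);
	for (i = 0; i < NR_ITEMS_DEMO; i++) {
		out[i] = 0;
		for (c = 0; c < NR_CPUS_DEMO; c++)
			out[i] += percpu[c][i];
	}
}

/* New order: for each CPU, walk all of its items sequentially. */
static void fold_cpu_major(unsigned long percpu[NR_CPUS_DEMO][NR_ITEMS_DEMO],
			   unsigned long out[NR_ITEMS_DEMO])
{
	size_t i, c;

	assert(out != NULL);
	/* The caller-side memset in the patches plays this role. */
	memset(out, 0, sizeof(out[0]) * NR_ITEMS_DEMO);
	for (c = 0; c < NR_CPUS_DEMO; c++)
		for (i = 0; i < NR_ITEMS_DEMO; i++)
			out[i] += percpu[c][i];
}
```

Both functions produce the same sums; only the memory access pattern
differs, which is where the cache-miss reduction is expected to come from.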



[PATCH v6 2/7] proc: Reduce cache miss in snmp_seq_show

2016-09-29 Thread Jia He
This is to use the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.
snmp_seq_show is then split into 2 parts to avoid the build warning about
"the frame size" being larger than 1024.

Signed-off-by: Jia He 
---
 net/ipv4/proc.c | 70 ++---
 1 file changed, 47 insertions(+), 23 deletions(-)

diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 9f665b6..9d7a39a 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -46,6 +46,8 @@
 #include 
 #include 
 
+#define TCPUDP_MIB_MAX max_t(u32, UDP_MIB_MAX, TCP_MIB_MAX)
+
 /*
  * Report socket allocation statistics [m...@utu.fi]
  */
@@ -378,13 +380,15 @@ static void icmp_put(struct seq_file *seq)
 /*
  * Called from the PROCfs module. This outputs /proc/net/snmp.
  */
-static int snmp_seq_show(struct seq_file *seq, void *v)
+static int snmp_seq_show_ipstats(struct seq_file *seq, void *v)
 {
-   int i;
struct net *net = seq->private;
+   u64 buff64[IPSTATS_MIB_MAX];
+   int i;
 
-   seq_puts(seq, "Ip: Forwarding DefaultTTL");
+   memset(buff64, 0, IPSTATS_MIB_MAX * sizeof(u64));
 
+   seq_puts(seq, "Ip: Forwarding DefaultTTL");
for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
seq_printf(seq, " %s", snmp4_ipstats_list[i].name);
 
@@ -393,57 +397,77 @@ static int snmp_seq_show(struct seq_file *seq, void *v)
   net->ipv4.sysctl_ip_default_ttl);
 
BUILD_BUG_ON(offsetof(struct ipstats_mib, mibs) != 0);
+   snmp_get_cpu_field64_batch(buff64, snmp4_ipstats_list,
+  net->mib.ip_statistics,
+  offsetof(struct ipstats_mib, syncp));
for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
-   seq_printf(seq, " %llu",
-  snmp_fold_field64(net->mib.ip_statistics,
-snmp4_ipstats_list[i].entry,
-offsetof(struct ipstats_mib, 
syncp)));
+   seq_printf(seq, " %llu", buff64[i]);
 
-   icmp_put(seq);  /* RFC 2011 compatibility */
-   icmpmsg_put(seq);
+   return 0;
+}
+
+static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
+{
+   unsigned long buff[TCPUDP_MIB_MAX];
+   struct net *net = seq->private;
+   int i;
+
+   memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
 
seq_puts(seq, "\nTcp:");
for (i = 0; snmp4_tcp_list[i].name != NULL; i++)
seq_printf(seq, " %s", snmp4_tcp_list[i].name);
 
seq_puts(seq, "\nTcp:");
+   snmp_get_cpu_field_batch(buff, snmp4_tcp_list,
+net->mib.tcp_statistics);
for (i = 0; snmp4_tcp_list[i].name != NULL; i++) {
/* MaxConn field is signed, RFC 2012 */
if (snmp4_tcp_list[i].entry == TCP_MIB_MAXCONN)
-   seq_printf(seq, " %ld",
-  snmp_fold_field(net->mib.tcp_statistics,
-  snmp4_tcp_list[i].entry));
+   seq_printf(seq, " %ld", buff[i]);
else
-   seq_printf(seq, " %lu",
-  snmp_fold_field(net->mib.tcp_statistics,
-  snmp4_tcp_list[i].entry));
+   seq_printf(seq, " %lu", buff[i]);
}
 
+   memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
+
+   snmp_get_cpu_field_batch(buff, snmp4_udp_list,
+net->mib.udp_statistics);
seq_puts(seq, "\nUdp:");
for (i = 0; snmp4_udp_list[i].name != NULL; i++)
seq_printf(seq, " %s", snmp4_udp_list[i].name);
-
seq_puts(seq, "\nUdp:");
for (i = 0; snmp4_udp_list[i].name != NULL; i++)
-   seq_printf(seq, " %lu",
-  snmp_fold_field(net->mib.udp_statistics,
-  snmp4_udp_list[i].entry));
+   seq_printf(seq, " %lu", buff[i]);
+
+   memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
 
/* the UDP and UDP-Lite MIBs are the same */
seq_puts(seq, "\nUdpLite:");
+   snmp_get_cpu_field_batch(buff, snmp4_udp_list,
+net->mib.udplite_statistics);
for (i = 0; snmp4_udp_list[i].name != NULL; i++)
seq_printf(seq, " %s", snmp4_udp_list[i].name);
-
seq_puts(seq, "\nUdpLite:");
for (i = 0; snmp4_udp_list[i].name != NULL; i++)
-   seq_printf(seq, " %lu",
-  snmp_fold_field(net->mib.udplite_statistics,
-  snmp4_udp_list[i].entry));
+   seq_printf(seq, " %lu", buff[i]);
 
seq_putc(seq, '\n');
return 0;
 }
 
+static int 

[PATCH v6 0/7] Reduce cache miss for snmp_fold_field

2016-09-29 Thread Jia He
On a PowerPC server with a large number of CPUs (160), besides commit
a3a773726c9f ("net: Optimize snmp stat aggregation by walking all
the percpu data at once"), I found several other snmp_fold_field
call sites that cause a high cache-miss rate.

test source code:

My simple test case, which reads from the procfs items endlessly:
/***/
#include 
#include 
#include 
#include 
#include 
#define LINELEN  2560
int main(int argc, char **argv)
{
int i;
int fd = -1 ;
int rdsize = 0;
char buf[LINELEN+1];

buf[LINELEN] = 0;
memset(buf,0,LINELEN);

if(1 >= argc) {
printf("file name empty\n");
return -1;
}

fd = open(argv[1], O_RDWR, 0644);
if(0 > fd){
printf("open error\n");
return -2;
}

for(i=0;i<0x;i++) {
while(0 < (rdsize = read(fd,buf,LINELEN))){
//nothing here
}

lseek(fd, 0, SEEK_SET);
}

close(fd);
return 0;
}
/**/
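
(A self-contained variant of the read loop above, as a sketch only: the
archive dropped the header names and the loop bound, so the header list
and the caller-supplied iteration count are assumptions. It also opens
the file read-only instead of O_RDWR, and reread_file is a made-up name.)

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define LINELEN 2560

/*
 * Read the whole file `iterations` times, rewinding after each pass.
 * Returns the total number of bytes read, or -1 on error.
 */
static long long reread_file(const char *path, unsigned long iterations)
{
	char buf[LINELEN];
	long long total = 0;
	unsigned long i;
	ssize_t n = 0;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0) {
		perror("open");
		return -1;
	}

	for (i = 0; i < iterations; i++) {
		/* Drain the file; the data itself is discarded. */
		while ((n = read(fd, buf, sizeof(buf))) > 0)
			total += n;

		if (n < 0 || lseek(fd, 0, SEEK_SET) < 0) {
			perror("read/lseek");
			close(fd);
			return -1;
		}
	}

	close(fd);
	return total;
}
```

A main() wrapper would pass argv[1] and some iteration count of your
choosing, for example reread_file("/proc/net/snmp", 100000).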

compile and run:

gcc test.c -o test

perf stat -d -e cache-misses ./test /proc/net/snmp
perf stat -d -e cache-misses ./test /proc/net/snmp6
perf stat -d -e cache-misses ./test /proc/net/sctp/snmp
perf stat -d -e cache-misses ./test /proc/net/xfrm_stat

before the patch set:

 Performance counter stats for 'system wide':

  355911097  cache-misses                                            [40.08%]
 2356829300  L1-dcache-loads                                         [60.04%]
  355642645  L1-dcache-load-misses  #  15.09% of all L1-dcache hits  [60.02%]
  346544541  LLC-loads                                               [59.97%]
     389763  LLC-load-misses        #   0.11% of all LL-cache hits   [40.02%]

   6.245162638 seconds time elapsed

After the patch set:
===
 Performance counter stats for 'system wide':

  194992476  cache-misses                                            [40.03%]
 6718051877  L1-dcache-loads                                         [60.07%]
  194871921  L1-dcache-load-misses  #   2.90% of all L1-dcache hits  [60.11%]
  187632232  LLC-loads                                               [60.04%]
     464466  LLC-load-misses        #   0.25% of all LL-cache hits   [39.89%]

   6.868422769 seconds time elapsed
The cache-miss rate can be reduced from 15% to 2.9%
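
(For reference, the 15% and 2.9% figures are consistent with
L1-dcache-load-misses divided by L1-dcache-loads in the two perf runs
above. A small sketch of that arithmetic; miss_rate_percent is a made-up
helper, not something perf provides.)

```c
#include <assert.h>

/* Miss rate in percent: misses / loads * 100. */
static double miss_rate_percent(unsigned long long misses,
				unsigned long long loads)
{
	assert(loads != 0);
	return 100.0 * (double)misses / (double)loads;
}
```

With the counters quoted above, miss_rate_percent(355642645, 2356829300)
gives about 15.09 and miss_rate_percent(194871921, 6718051877) gives
about 2.90.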

changelog
=
v6:
- correct v5 
v5:
- order local variables from longest to shortest line
v4:
- move memset into one block of if statement in snmp6_seq_show_item
- remove the changes in netstat_seq_show, considering the stack usage is too
  large
v3:
- introduce generic interface (suggested by Marcelo Ricardo Leitner)
- use max_t instead of self defined macro (suggested by David Miller)
v2:
- fix bug in udplite statistics.
- snmp_seq_show is split into 2 parts

Jia He (7):
  net:snmp: Introduce generic interfaces for snmp_get_cpu_field{,64}
  proc: Reduce cache miss in snmp_seq_show
  proc: Reduce cache miss in snmp6_seq_show
  proc: Reduce cache miss in sctp_snmp_seq_show
  proc: Reduce cache miss in xfrm_statistics_seq_show
  ipv6: Remove useless parameter in __snmp6_fill_statsdev
  net: Suppress the "Comparison to NULL could be written" warnings

 include/net/ip.h |  23 
 net/ipv4/proc.c  | 102 +++
 net/ipv6/addrconf.c  |  12 +++---
 net/ipv6/proc.c  |  30 +++
 net/sctp/proc.c  |  10 +++--
 net/xfrm/xfrm_proc.c |  10 -
 6 files changed, 129 insertions(+), 58 deletions(-)

-- 
2.5.5



[PATCH v6 3/7] proc: Reduce cache miss in snmp6_seq_show

2016-09-29 Thread Jia He
This is to use the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.

Signed-off-by: Jia He 
---
 net/ipv6/proc.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 679253d0..cc8e3ae 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -30,6 +30,11 @@
 #include 
 #include 
 
+#define MAX4(a, b, c, d) \
+   max_t(u32, max_t(u32, a, b), max_t(u32, c, d))
+#define SNMP_MIB_MAX MAX4(UDP_MIB_MAX, TCP_MIB_MAX, \
+   IPSTATS_MIB_MAX, ICMP_MIB_MAX)
+
 static int sockstat6_seq_show(struct seq_file *seq, void *v)
 {
struct net *net = seq->private;
@@ -191,25 +196,34 @@ static void snmp6_seq_show_item(struct seq_file *seq, 
void __percpu *pcpumib,
atomic_long_t *smib,
const struct snmp_mib *itemlist)
 {
+   unsigned long buff[SNMP_MIB_MAX];
int i;
-   unsigned long val;
 
-   for (i = 0; itemlist[i].name; i++) {
-   val = pcpumib ?
-   snmp_fold_field(pcpumib, itemlist[i].entry) :
-   atomic_long_read(smib + itemlist[i].entry);
-   seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name, val);
+   if (pcpumib) {
+   memset(buff, 0, sizeof(unsigned long) * SNMP_MIB_MAX);
+
+   snmp_get_cpu_field_batch(buff, itemlist, pcpumib);
+   for (i = 0; itemlist[i].name; i++)
+   seq_printf(seq, "%-32s\t%lu\n",
+  itemlist[i].name, buff[i]);
+   } else {
+   for (i = 0; itemlist[i].name; i++)
+   seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name,
+  atomic_long_read(smib + itemlist[i].entry));
}
 }
 
 static void snmp6_seq_show_item64(struct seq_file *seq, void __percpu *mib,
  const struct snmp_mib *itemlist, size_t 
syncpoff)
 {
+   u64 buff64[SNMP_MIB_MAX];
int i;
 
+   memset(buff64, 0, sizeof(u64) * SNMP_MIB_MAX);
+
+   snmp_get_cpu_field64_batch(buff64, itemlist, mib, syncpoff);
for (i = 0; itemlist[i].name; i++)
-   seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name,
-  snmp_fold_field64(mib, itemlist[i].entry, syncpoff));
+   seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name, buff64[i]);
 }
 
 static int snmp6_seq_show(struct seq_file *seq, void *v)
-- 
2.5.5



Re: [PATCH] drm/mediatek: fix a typo

2016-09-29 Thread Bibby Hsieh
On Thu, 2016-09-29 at 10:46 +0200, Matthias Brugger wrote:
> 
> On 29/09/16 06:01, CK Hu wrote:
> > Acked-by: CK Hu 
> >
> > On Thu, 2016-09-29 at 11:22 +0800, Bibby Hsieh wrote:
> >> Fix the typo: OD_RELAYMODE->OD_CFG
> >>
> 

Hi, Matthias
Thanks for your reply.

> Although it is quite clear what the patch does, could you write one 
> sentence to explain what it does. Maybe explain even which effect it 
> has, which error get fixed etc.

Ok, I will do that.

> As we are getting public available boards now, we should take more care 
> about fixes. If you have a fix for a commit introduced in an earlier 
> version of linux and it should be fixed for this version as well (e.g. 
> v4.6 does have the feature but it does not work correctly) then please 
> add these two lines before your Signed-off-by:
> Fixes:  ("")
> Cc: sta...@vger.kernel.org # v4.6+
> 
> Where v4.6+ stands for the oldest version where this should get fixed.
> 

Ok, but the patch hasn't been merged into v4.8 (just in drm-next [1] and
linux-next [2]), how can I mark that?

[1]
https://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-next=7216436420414144646f5d8343d061355fd23483
[2]
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=7216436420414144646f5d8343d061355fd23483


> Thanks a lot,
> Matthias
> 
> >> Signed-off-by: Bibby Hsieh 
> >> ---
> >>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c |2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> >> b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> >> index df33b3c..aa5f20f 100644
> >> --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> >> +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> >> @@ -123,7 +123,7 @@ static void mtk_od_config(struct mtk_ddp_comp *comp, 
> >> unsigned int w,
> >>  unsigned int bpc)
> >>  {
> >>writel(w << 16 | h, comp->regs + DISP_OD_SIZE);
> >> -  writel(OD_RELAYMODE, comp->regs + OD_RELAYMODE);
> >> +  writel(OD_RELAYMODE, comp->regs + OD_CFG);
> >>mtk_dither_set(comp, bpc, DISP_OD_CFG);
> >>  }
> >>
> >
> >

-- 
Bibby



Re: [PATCH v14 2/4] CMDQ: Mediatek CMDQ driver

2016-09-29 Thread CK Hu
Hi, HS:

On Mon, 2016-09-05 at 09:44 +0800, HS Liao wrote:
> This patch is first version of Mediatek Command Queue(CMDQ) driver. The
> CMDQ is used to help write registers with critical time limitation,
> such as updating display configuration during the vblank. It controls
> Global Command Engine (GCE) hardware to achieve this requirement.
> Currently, CMDQ only supports display related hardwares, but we expect
> it can be extended to other hardwares for future requirements.
> 
> Signed-off-by: HS Liao 
> Signed-off-by: CK Hu 
> ---

[snip...]

> +
> +struct cmdq_task {
> + struct cmdq *cmdq;
> + struct list_headlist_entry;
> + void*va_base;
> + dma_addr_t  pa_base;
> + size_t  cmd_buf_size; /* command occupied size */
> + size_t  buf_size; /* real buffer size */
> + boolfinalized;
> + struct cmdq_thread  *thread;

I think the thread info could be removed from cmdq_task. Only
cmdq_task_handle_error() and cmdq_task_insert_into_thread() use
task->thread, and the callers of both functions already have the thread
info. So you could just pass the thread into these two functions and
remove it from cmdq_task.

> + struct cmdq_task_cb cb;

I think this callback is equivalent to the mailbox client's tx_done
callback. It's better to use the already-defined interface than to
create your own.

> +};
> +

[snip...]

> +
> +static int cmdq_suspend(struct device *dev)
> +{
> + struct cmdq *cmdq = dev_get_drvdata(dev);
> + struct cmdq_thread *thread;
> + int i;
> + bool task_running = false;
> +
> + mutex_lock(&cmdq->task_mutex);
> + cmdq->suspended = true;
> + mutex_unlock(&cmdq->task_mutex);
> +
> + for (i = 0; i < ARRAY_SIZE(cmdq->thread); i++) {
> + thread = &cmdq->thread[i];
> + if (!list_empty(&thread->task_busy_list)) {
> + mod_timer(&thread->timeout, jiffies + 1);
> + task_running = true;
> + }
> + }
> +
> + if (task_running) {
> + dev_warn(dev, "exist running task(s) in suspend\n");
> + msleep(20);

Why sleep here? It looks like a recovery attempt, but can 20 ms recover
anything? I think the warning message is enough: when you see the
warning, you fix the bug, so there is no need to recover anything.

> + }
> +
> + clk_unprepare(cmdq->clock);
> + return 0;
> +}
> +

Regards,
CK




Re: [PATCH v4 00/12] re-enable DAX PMD support

2016-09-29 Thread Ross Zwisler
On Fri, Sep 30, 2016 at 09:43:45AM +1000, Dave Chinner wrote:
> Finally: none of the patches in your tree have reviewed-by tags.
> That says to me that none of this code has been reviewed yet.
> Reviewed-by tags are non-negotiable requirement for anything going
> through my trees. I don't have time right now to review this code,
> so you're going to need to chase up other reviewers before merging.
> 
> And, really, this is getting very late in the cycle to be merging
> new code - we're less than one working day away from the merge
> window opening and we've missed the last linux-next build. I'd
> suggest that we might be best served by slipping the PMD
> support code to the next cycle when there's no time pressure for
> review and we can get a decent linux-next soak on the code.

I absolutely support your policy of only sending code to Linus that has passed
peer review.

However, I do feel compelled to point out that this is not new code.  I didn't
just spring it on everyone in the hours before the v4.9 merge window.  I
posted the first version of this patch set on August 15th, *seven weeks ago*:

https://lkml.org/lkml/2016/8/15/613

This was the day after v4.8-rc2 was released.

Since then I have responded promptly to the little review feedback that I've
received.  I've also reviewed and tested other DAX changes, like the struct
iomap changes from Christoph.  Those changes were first posted to the mailing
list on September 9th, four weeks after mine.  Nevertheless, I was happy to
rebase my changes on top of his, which meant a full rewrite of the DAX PMD
fault handler so it would be based on struct iomap.  His changes are going to
be merged for v4.9, and mine are not.

Please, help me understand what I can do to get my code reviewed.  Do I need
to more aggressively ping my patch series, asking people by name for reviews?
Do we need to rework our code flow to Linus so that the DAX changes go through
a filesystem tree like XFS or ext4, and ask the developers of that filesystem
to help with reviews?  Something else?

I'm honestly very frustrated by this because I've done my best to be open to
constructive criticism and I've tried to respond promptly to the feedback that
I've received.  In the end, though, a system where it's a requirement that all
upstreamed code be peer reviewed but in which I can't get any feedback is
essentially a system where I'm not allowed to contribute.


Re: [lkp] [staging] d4f56b47a8: divide error: 0000 [#1] PREEMPT SMP KASAN

2016-09-29 Thread Ye Xiaolong
On 09/30, Viresh Kumar wrote:
>On Fri, Sep 30, 2016 at 7:29 AM, kernel test robot
> wrote:
>>
>>
>> FYI, we noticed the following commit:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>> commit d4f56b47a8fac90b15adfae80a42a2735d6b3213 ("staging: greybus: Add 
>> drivers/staging/greybus to the build")
>>
>> in testcase: trinity
>> with following parameters:
>>
>> runtime: 300s
>>
>>
>> Trinity is a linux system call fuzz tester.
>>
>>
>> on test machine: qemu-system-x86_64 -enable-kvm -m 512M
>>
>> caused below changes:
>>
>>
>> +------------------------------------------------+------------+------------+
>> |                                                | 526dec0642 | d4f56b47a8 |
>> +------------------------------------------------+------------+------------+
>> | boot_successes                                 | 5          | 0          |
>> | boot_failures                                  | 8          | 12         |
>> | calltrace:SyS_open                             | 8          |            |
>> | invoked_oom-killer:gfp_mask=0x                 | 1          |            |
>> | Mem-Info                                       | 1          |            |
>> | IP-Config:Auto-configuration_of_network_failed | 2          |            |
>> | BUG:kernel_hang_in_test_stage                  | 6          |            |
>> | divide_error:#[##]PREEMPT_SMP_KASAN            | 0          | 12         |
>> | RIP:gb_timesync_init                           | 0          | 12         |
>> | calltrace:gb_init                              | 0          | 12         |
>> | Kernel_panic-not_syncing:Fatal_exception       | 0          | 12         |
>> +------------------------------------------------+------------+------------+
>>
>>
>>
>> [   16.795543] FPGA image file name: xlinx_fpga_firmware.bit
>> [   16.796615] GPIO INIT FAIL!!
>> [   16.799462] Unable to find a compatible ARMv7 timer
>> [   16.799948] divide error:  [#1] PREEMPT SMP KASAN
>> [   16.800459] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
>> 4.8.0-rc6-02364-gd4f56b4 #29
>> [   16.801197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
>> Debian-1.8.2-1 04/01/2014
>> [   16.802055] task: 88001a124000 task.stack: 88001a14
>> [   16.802645] RIP: 0010:[]  [] 
>> gb_timesync_init+0x35/0x78
>> [   16.803534] RSP: :88001a147e58  EFLAGS: 00010246
>> [   16.804040] RAX: 00038d7ea4c68000 RBX:  RCX: 
>> 8114ea41
>> [   16.804716] RDX:  RSI:  RDI: 
>> 88001a124c2c
>> [   16.805393] RBP: 88001a147e60 R08: 0001 R09: 
>> 
>> [   16.806066] R10: 88001a147d70 R11: 83cddb35 R12: 
>> 82f67cc6
>> [   16.806744] R13:  R14: 82fbe8b0 R15: 
>> 82fbe8f8
>> [   16.807421] FS:  () GS:88001a40() 
>> knlGS:
>> [   16.808185] CS:  0010 DS:  ES:  CR0: 80050033
>> [   16.808728] CR2:  CR3: 02c0a000 CR4: 
>> 06b0
>> [   16.809405] DR0:  DR1:  DR2: 
>> 
>> [   16.810078] DR3:  DR6: fffe0ff0 DR7: 
>> 0400
>> [   16.810752] Stack:
>> [   16.811058]   88001a147e78 82f67d45 
>> 
>> [   16.811819]  88001a147ee8 82efe339 82b89800 
>> 0012
>> [   16.812576]  88001fa80fe5  82b0495f 
>> 0006
>> [   16.813332] Call Trace:
>> [   16.813577]  [] gb_init+0x7f/0xb3
>> [   16.814045]  [] do_one_initcall+0x9a/0x12c
>> [   16.814588]  [] kernel_init_freeable+0x1b0/0x246
>> [   16.815180]  [] kernel_init+0xc/0x108
>> [   16.815679]  [] ret_from_fork+0x1f/0x40
>> [   16.816197]  [] ? rest_init+0x13c/0x13c
>> [   16.816724] Code: 85 c0 89 c3 74 12 48 c7 c7 64 ae b4 82 31 c0 e8 40 b5 
>> 27 fe 89 d8 eb 53 e8 cb 55 23 ff 31 d2 89 c6 48 b8 00 80 c6 a4 7e 8d 03 00 
>> <48> f7 f6 31 d2 48 c7 c7 84 ae b4 82 48 89 35 de 65 64 01 48 89
>> [   16.819509] RIP  [] gb_timesync_init+0x35/0x78
>> [   16.820094]  RSP 
>> [   16.820548] ---[ end trace c73ba0f929e81492 ]---
>> [   16.821001] Kernel panic - not syncing: Fatal exception
>
>Can you please confirm if below patch fixes it for you ?
>
>https://marc.info/?l=linux-kernel=147490908100954

Sure, I'll verify it and provide the result later.

Thanks,
Xiaolong


[PATCH] aer: function comments cleanup

2016-09-29 Thread Cao jin
Signed-off-by: Cao jin 
---
 drivers/pci/pcie/aer/aerdrv.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/pci/pcie/aer/aerdrv.c b/drivers/pci/pcie/aer/aerdrv.c
index 48d21e0..20eea86 100644
--- a/drivers/pci/pcie/aer/aerdrv.c
+++ b/drivers/pci/pcie/aer/aerdrv.c
@@ -294,7 +294,6 @@ static void aer_remove(struct pcie_device *dev)
 /**
  * aer_probe - initialize resources
  * @dev: pointer to the pcie_dev data structure
- * @id: pointer to the service id data structure
  *
  * Invoked when PCI Express bus loads AER service driver.
  */
-- 
2.1.0





Re: [RFC][PATCH 0/7] printk: use alt_printk to handle printk() recursive calls

2016-09-29 Thread Sergey Senozhatsky
On (09/29/16 15:25), Petr Mladek wrote:
> On Tue 2016-09-27 23:22:30, Sergey Senozhatsky wrote:
> > Hello,
> > 
> > RFC
> > 
> > This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
> 
> I was skeptical but I really like this way now.
>
> The switching of the buffers is a bit hairy in this version but I
> think that we could make it much better.
> 
> Other than that it looks like a big win. It kills a lot of
> printk-related pain points. And it will not be that complicated
> after all.

many thanks for looking at this train wreck.

so, like I said, it addresses printk()-recursion in *ideally* quite
a minimalistic way -- just several alt_printk_enter/exit calls in
printk.c, without ever touching any other parts of the kernel.

gunning down printk deadlocks in general, however, requires much more
effort; or even a completely different approach.

a) a lock-less printk() by default
   um, `#define printk alt_printk'. but this will break printk() from irq.
   and the ordering of messages from per-cpu buffers may be far from correct.

b) combining a DEFERRED_WARN + alt_printk
   DEFERRED_WARN potentially is a never ending thing. we can add some
   lockdep annotations, perhaps, and hope that error handling branches
   that may contain WARN_ONs/printk-s will be executed with prove_locking
   enabled on someone's machine.

c) ...

-ss
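
(a toy, single-threaded userspace sketch of the "switch to an alternative
buffer, flush later" idea discussed above. this is not the patch set's
code: there are no per-cpu buffers, no NMI handling and no locking here,
and every demo_* name is made up for illustration.)

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_BUF_SZ 256

static char main_log[DEMO_BUF_SZ];	/* stands in for the logbuf */
static char alt_log[DEMO_BUF_SZ];	/* stands in for an alt buffer */
static int alt_depth;			/* > 0 inside a deadlock-prone section */

/* Append msg to buf, truncating silently if the buffer is full. */
static void append(char *buf, const char *msg)
{
	size_t used = strlen(buf);

	assert(used < DEMO_BUF_SZ);
	snprintf(buf + used, DEMO_BUF_SZ - used, "%s", msg);
}

static void demo_alt_enter(void)
{
	alt_depth++;
}

static void demo_alt_exit(void)
{
	assert(alt_depth > 0);
	alt_depth--;
}

/* Inside the section, messages go to the alt buffer instead of the log. */
static void demo_printk(const char *msg)
{
	append(alt_depth ? alt_log : main_log, msg);
}

/* Later, from a safe context: move the buffered messages to the log. */
static void demo_alt_flush(void)
{
	assert(alt_depth == 0);
	append(main_log, alt_log);
	alt_log[0] = '\0';
}
```

logging "a;" normally, "b;" inside an enter/exit pair and then "c;"
normally leaves the main log as "a;c;b;" after the flush, which is a
small-scale version of the message ordering concern mentioned in (a).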


Re: [PATCH net v2] L2TP:Adjust intf MTU,factor underlay L3,overlay L2

2016-09-29 Thread R. Parameswaran

Hi James,

On Thu, 29 Sep 2016, James Chapman wrote:

> On 22/09/16 21:52, R. Parameswaran wrote:
> > From ed585bdd6d3d2b3dec58d414f514cd764d89159d Mon Sep 17 00:00:00 2001
> > From: "R. Parameswaran" 
> > Date: Thu, 22 Sep 2016 13:19:25 -0700
> > Subject: [PATCH] L2TP:Adjust intf MTU,factor underlay L3,overlay L2
> >
> > Take into account all of the tunnel encapsulation headers when setting
> > up the MTU on the L2TP logical interface device. Otherwise, packets
> > created by the applications on top of the L2TP layer are larger
> > than they ought to be, relative to the underlay MTU, leading to
> > needless fragmentation once the outer IP encap is added.
> >
> > Specifically, take into account the (outer, underlay) IP header
> > imposed on the encapsulated L2TP packet, and the Layer 2 header
> > imposed on the inner IP packet prior to L2TP encapsulation.
> >
> > Do not assume an Ethernet (non-jumbo) underlay. Use the PMTU mechanism
> > and the dst entry in the L2TP tunnel socket to directly pull up
> > the underlay MTU (as the baseline number on top of which the
> > encapsulation headers are factored in).  Fall back to Ethernet MTU
> > if this fails.
> >
> > Signed-off-by: R. Parameswaran 
> >
> > Reviewed-by: "N. Prachanda" ,
> > Reviewed-by: "R. Shearman" ,
> > Reviewed-by: "D. Fawcus" 
> > ---
> >  net/l2tp/l2tp_eth.c | 48 
> >  1 file changed, 44 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c
> > index 57fc5a4..dbcd6bd 100644
> > --- a/net/l2tp/l2tp_eth.c
> > +++ b/net/l2tp/l2tp_eth.c
> > @@ -30,6 +30,9 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> > +#include 
> >  
> >  #include "l2tp_core.h"
> >  
> > @@ -206,6 +209,46 @@ static void l2tp_eth_show(struct seq_file *m, void 
> > *arg)
> >  }
> >  #endif
> >  
> > +static void l2tp_eth_adjust_mtu(struct l2tp_tunnel *tunnel,
> > +   struct l2tp_session *session,
> > +   struct net_device *dev)
> > +{
> > +   unsigned int overhead = 0;
> > +   struct dst_entry *dst;
> > +
> > +   if (session->mtu != 0) {
> > +   dev->mtu = session->mtu;
> > +   dev->needed_headroom += session->hdr_len;
> > +   if (tunnel->encap == L2TP_ENCAPTYPE_UDP)
> > +   dev->needed_headroom += sizeof(struct udphdr);
> > +   return;
> > +   }
> > +   overhead = session->hdr_len;
> > +   /* Adjust MTU, factor overhead - underlay L3 hdr, overlay L2 hdr*/
> > +   if (tunnel->sock->sk_family == AF_INET)
> > +   overhead += (ETH_HLEN + sizeof(struct iphdr));
> > +   else if (tunnel->sock->sk_family == AF_INET6)
> > +   overhead += (ETH_HLEN + sizeof(struct ipv6hdr));
> What about options in the IP header? If certain options are set on the
> socket, the IP header may be larger.
> 

Thanks for the reply. It looks like IP options can only be
enabled through setsockopt on an application's socket (if there's any
other way to turn on IP options, please let me know; I didn't see any
sysctl setting for transmit). This scenario would come
into the picture when an application opens a raw IP or UDP socket such that it
routes into the L2TP logical interface.

If you take the case of a plain IP (Ethernet) interface, even if an
application opened a socket turning on IP options, it would not change
the MTU of the underlying interface, and it would not affect other
applications transacting packets on the same interface. I know it's not an
exact parallel to this case, but since the IP option control is per
application, we probably should not factor it into the L2TP logical interface.
We cannot affect other applications/processes running on the same L2TP
tunnel. Also, since the application using IP options knows that it has turned
on IP options, maybe we can count on it to factor the size of the options
into the size of the payload it sends into the socket, or set the MTU on the
L2TP interface through config?
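
To put rough numbers on the fixed headers under discussion, here is a
minimal sketch (not the patch code). It assumes no IP options and no IPv6
extension headers, assumes the device MTU ends up as the underlay MTU minus
the overhead, and uses helper names that are made up for illustration:

```c
/* Fixed header sizes in bytes (no IP options, no IPv6 extension headers). */
enum {
	SKETCH_ETH_HLEN  = 14,	/* inner (overlay) Ethernet header */
	SKETCH_IPV4_HLEN = 20,	/* outer IPv4 header without options */
	SKETCH_IPV6_HLEN = 40,	/* outer IPv6 fixed header */
	SKETCH_UDP_HLEN  = 8	/* outer UDP header, UDP encap only */
};

/* Hypothetical helper: total encapsulation overhead for one L2TP session. */
unsigned int sketch_l2tp_overhead(int is_ipv6, int udp_encap,
				  unsigned int session_hdr_len)
{
	unsigned int overhead = session_hdr_len + SKETCH_ETH_HLEN;

	overhead += is_ipv6 ? SKETCH_IPV6_HLEN : SKETCH_IPV4_HLEN;
	if (udp_encap)
		overhead += SKETCH_UDP_HLEN;
	return overhead;
}

/* Hypothetical helper: MTU left for the L2TP device; 0 if nothing is left. */
unsigned int sketch_l2tp_dev_mtu(unsigned int underlay_mtu,
				 unsigned int overhead)
{
	return underlay_mtu > overhead ? underlay_mtu - overhead : 0;
}
```

With IPv4, UDP encapsulation and a hypothetical 8-byte session header, this
gives 8 + 14 + 20 + 8 = 50 bytes of overhead, so a 1500-byte underlay would
leave 1450 bytes for the L2TP device. Any IP options set by an application
would come on top of that, which is the case discussed above.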

Other than this, I don't see keepalives or anything else in which the 
kernel will source its own packet into the L2TP interface, outside of 
an application injected packet - if there is something like that, please
let me know. The user space L2TP daemon would probably fall in the 
category of applications.

thanks,

Ramkumar 


> > +   /* Additionally, if the encap is UDP, account for UDP header size */
> > +   if (tunnel->encap == L2TP_ENCAPTYPE_UDP)
> > +   overhead += sizeof(struct udphdr);
> > +   /* If PMTU discovery was enabled, use discovered MTU on L2TP device */
> > +   dst = sk_dst_get(tunnel->sock);
> > +   if (dst) {
> > +   u32 pmtu = dst_mtu(dst);
> > +
> > +   if (pmtu != 0)
> > +   dev->mtu = pmtu;
> > +   dst_release(dst);
> > +   }
> > +   /* else (no PMTUD) L2TP dev MTU defaulted 

[git pull] drm fixes for 4.8 final

2016-09-29 Thread Dave Airlie
Hi Linus,

One big regression fix for udl, along with two amdgpu fixes and two
nouveau fixes.

All seems pretty safe and useful.

Dave.

The following changes since commit 08895a8b6b06ed2323cd97a36ee40a116b3db8ed:

  Linux 4.8-rc8 (2016-09-25 18:47:13 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~airlied/linux tags/drm-fixes-for-v4.8-final

for you to fetch changes up to 90fd68dcf9a763f7e575c8467415bd8a66d073f4:

  drm/udl: fix line iterator in damage handling (2016-09-28 13:29:18 +1000)


drm fixes for final 4.8, udl, amdgpu/radeon and nouveau


Alex Deucher (1):
  drm/radeon/si/dpm: add workaround for for Jet parts

Dave Airlie (2):
  Merge branch 'drm-fixes-4.8' of
git://people.freedesktop.org/~agd5f/linux into drm-fixes
  Merge branch 'linux-4.8' of git://github.com/skeggsb/linux into drm-fixes

David Herrmann (1):
  drm/udl: fix line iterator in damage handling

Grazvydas Ignotas (1):
  drm/amdgpu: disable CRTCs before teardown

Ilia Mirkin (1):
  drm/nouveau/fifo/nv04: avoid ramht race against cookie insertion

Karol Herbst (1):
  drm/nouveau: Revert "bus: remove cpu_coherent flag"

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 drivers/gpu/drm/nouveau/include/nvkm/core/device.h | 1 +
 drivers/gpu/drm/nouveau/nouveau_bo.c   | 3 ++-
 drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c   | 1 +
 drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 1 +
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/dmanv04.c | 3 +++
 drivers/gpu/drm/radeon/si_dpm.c| 6 ++
 drivers/gpu/drm/udl/udl_fb.c   | 2 +-
 8 files changed, 16 insertions(+), 3 deletions(-)


[PATCH v15 09/15] selftests/powerpc: Add ptrace tests for TAR, PPR, DSCR in suspended TM

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for TAR, PPR, DSCR
registers inside suspended TM context.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   2 +-
 .../selftests/powerpc/ptrace/ptrace-tm-spd-tar.c   | 174 +
 2 files changed, 175 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index 9af9ad5..19e4a7c 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,5 +1,5 @@
 TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
-ptrace-tar ptrace-tm-tar
+ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c
new file mode 100644
index 000..b3c061d
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c
@@ -0,0 +1,174 @@
+/*
+ * Ptrace test for TAR, PPR, DSCR registers in the TM Suspend context
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "tm.h"
+#include "ptrace-tar.h"
+
+int shm_id;
+int *cptr, *pptr;
+
+__attribute__((used)) void wait_parent(void)
+{
+   cptr[2] = 1;
+   while (!cptr[1])
+   asm volatile("" : : : "memory");
+}
+
+void tm_spd_tar(void)
+{
+   unsigned long result, texasr;
+   unsigned long regs[3];
+   int ret;
+
+   cptr = (int *)shmat(shm_id, NULL, 0);
+
+trans:
+   cptr[2] = 0;
+   asm __volatile__(
+   "li 4, %[tar_1];"
+   "mtspr %[sprn_tar],  4;"/* TAR_1 */
+   "li 4, %[dscr_1];"
+   "mtspr %[sprn_dscr], 4;"/* DSCR_1 */
+   "or 31,31,31;"  /* PPR_1*/
+
+   "1: ;"
+   "tbegin.;"
+   "beq 2f;"
+
+   "li 4, %[tar_2];"
+   "mtspr %[sprn_tar],  4;"/* TAR_2 */
+   "li 4, %[dscr_2];"
+   "mtspr %[sprn_dscr], 4;"/* DSCR_2 */
+   "or 1,1,1;" /* PPR_2 */
+
+   "tsuspend.;"
+   "li 4, %[tar_3];"
+   "mtspr %[sprn_tar],  4;"/* TAR_3 */
+   "li 4, %[dscr_3];"
+   "mtspr %[sprn_dscr], 4;"/* DSCR_3 */
+   "or 6,6,6;" /* PPR_3 */
+   "bl wait_parent;"
+   "tresume.;"
+
+   "tend.;"
+   "li 0, 0;"
+   "ori %[res], 0, 0;"
+   "b 3f;"
+
+   /* Transaction abort handler */
+   "2: ;"
+   "li 0, 1;"
+   "ori %[res], 0, 0;"
+   "mfspr %[texasr], %[sprn_texasr];"
+
+   "3: ;"
+
+   : [res] "=r" (result), [texasr] "=r" (texasr)
+   : [val] "r" (cptr[1]), [sprn_dscr]"i"(SPRN_DSCR),
+   [sprn_tar]"i"(SPRN_TAR), [sprn_ppr]"i"(SPRN_PPR),
+   [sprn_texasr]"i"(SPRN_TEXASR), [tar_1]"i"(TAR_1),
+   [dscr_1]"i"(DSCR_1), [tar_2]"i"(TAR_2), [dscr_2]"i"(DSCR_2),
+   [tar_3]"i"(TAR_3), [dscr_3]"i"(DSCR_3)
+   : "memory", "r0", "r1", "r3", "r4", "r5", "r6"
+   );
+
+   /* TM failed, analyse */
+   if (result) {
+   if (!cptr[0])
+   goto trans;
+
+   regs[0] = mfspr(SPRN_TAR);
+   regs[1] = mfspr(SPRN_PPR);
+   regs[2] = mfspr(SPRN_DSCR);
+
+   shmdt();
+   printf("%-30s TAR: %lu PPR: %lx DSCR: %lu\n",
+   user_read, regs[0], regs[1], regs[2]);
+
+   ret = validate_tar_registers(regs, TAR_4, PPR_4, DSCR_4);
+   if (ret)
+   exit(1);
+   exit(0);
+   }
+   shmdt();
+   exit(1);
+}
+
+int trace_tm_spd_tar(pid_t child)
+{
+   unsigned long regs[3];
+
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_tar_registers(child, regs));
+   printf("%-30s TAR: %lu PPR: %lx DSCR: %lu\n",
+   ptrace_read_running, regs[0], regs[1], regs[2]);
+
+   FAIL_IF(validate_tar_registers(regs, TAR_3, PPR_3, DSCR_3));
+   FAIL_IF(show_tm_checkpointed_state(child, regs));
+   printf("%-30s TAR: %lu PPR: %lx DSCR: %lu\n",
+   ptrace_read_ckpt, 

[PATCH v15 15/15] selftests/powerpc: Fix a build issue

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

Fixes the following build failure -

cp_abort.c:90:3: error: ‘for’ loop initial declarations are only
allowed in C99 or C11 mode
   for (int i = 0; i < NUM_LOOPS; i++) {
   ^
cp_abort.c:90:3: note: use option -std=c99, -std=gnu99, -std=c11 or
-std=gnu11 to compile your code
cp_abort.c:97:3: error: ‘for’ loop initial declarations are only
allowed in C99 or C11 mode
   for (int i = 0; i < NUM_LOOPS; i++) {

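The change is the usual one for compilers defaulting to a pre-C99 mode:
declare the loop counter together with the other locals. A minimal sketch of
that pattern follows; the function name and loop bound are illustrative only,
not the selftest's:

```c
#define SKETCH_NUM_LOOPS 4	/* illustrative bound, not the selftest's NUM_LOOPS */

/*
 * The counter is declared at the top of the block, so the loop
 * also compiles in C89/gnu89 mode.
 */
int sketch_count_loops(void)
{
	int i, n = 0;

	for (i = 0; i < SKETCH_NUM_LOOPS; i++)
		n++;
	return n;
}
```
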
Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
Reviewed-by: Cyril Bur 
---
 tools/testing/selftests/powerpc/context_switch/cp_abort.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/powerpc/context_switch/cp_abort.c 
b/tools/testing/selftests/powerpc/context_switch/cp_abort.c
index 5a5b55a..1ce7dce 100644
--- a/tools/testing/selftests/powerpc/context_switch/cp_abort.c
+++ b/tools/testing/selftests/powerpc/context_switch/cp_abort.c
@@ -67,7 +67,7 @@ int test_cp_abort(void)
/* 128 bytes for a full cache line */
char buf[128] __cacheline_aligned;
cpu_set_t cpuset;
-   int fd1[2], fd2[2], pid;
+   int fd1[2], fd2[2], pid, i;
char c;
 
/* only run this test on a P9 or later */
@@ -87,14 +87,14 @@ int test_cp_abort(void)
FAIL_IF(pid < 0);
 
if (!pid) {
-   for (int i = 0; i < NUM_LOOPS; i++) {
+   for (i = 0; i < NUM_LOOPS; i++) {
FAIL_IF((write(fd1[WRITE_FD], &c, 1)) != 1);
FAIL_IF((read(fd2[READ_FD], &c, 1)) != 1);
/* A paste succeeds if CR0 EQ bit is set */
FAIL_IF(paste(buf) & 0x2000);
}
} else {
-   for (int i = 0; i < NUM_LOOPS; i++) {
+   for (i = 0; i < NUM_LOOPS; i++) {
FAIL_IF((read(fd1[READ_FD], &c, 1)) != 1);
copy(buf);
FAIL_IF((write(fd2[WRITE_FD], &c, 1) != 1));
-- 
1.8.3.1



[PATCH v15 12/15] selftests/powerpc: Add ptrace tests for VSX, VMX registers in suspended TM

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for VSX, VMX registers
inside suspended TM context.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   3 +-
 .../selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c   | 185 +
 2 files changed, 187 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index a518fbd..b5b097a 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,5 +1,6 @@
 TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
-ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx
+ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx \
+ptrace-tm-spd-vsx
 
 
 include ../../lib.mk
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c
new file mode 100644
index 000..0df3c23
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c
@@ -0,0 +1,185 @@
+/*
+ * Ptrace test for VMX/VSX registers in the TM Suspend context
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "tm.h"
+#include "ptrace-vsx.h"
+
+int shm_id;
+int *cptr, *pptr;
+
+unsigned long fp_load[VEC_MAX];
+unsigned long fp_load_new[VEC_MAX];
+unsigned long fp_store[VEC_MAX];
+unsigned long fp_load_ckpt[VEC_MAX];
+unsigned long fp_load_ckpt_new[VEC_MAX];
+
+__attribute__((used)) void load_vsx(void)
+{
+   loadvsx(fp_load, 0);
+}
+
+__attribute__((used)) void load_vsx_new(void)
+{
+   loadvsx(fp_load_new, 0);
+}
+
+__attribute__((used)) void load_vsx_ckpt(void)
+{
+   loadvsx(fp_load_ckpt, 0);
+}
+
+__attribute__((used)) void wait_parent(void)
+{
+   cptr[2] = 1;
+   while (!cptr[1])
+   asm volatile("" : : : "memory");
+}
+
+void tm_spd_vsx(void)
+{
+   unsigned long result, texasr;
+   int ret;
+
+   cptr = (int *)shmat(shm_id, NULL, 0);
+
+trans:
+   cptr[2] = 0;
+   asm __volatile__(
+   "bl load_vsx_ckpt;"
+
+   "1: ;"
+   "tbegin.;"
+   "beq 2f;"
+
+   "bl load_vsx_new;"
+   "tsuspend.;"
+   "bl load_vsx;"
+   "bl wait_parent;"
+   "tresume.;"
+
+   "tend.;"
+   "li 0, 0;"
+   "ori %[res], 0, 0;"
+   "b 3f;"
+
+   "2: ;"
+   "li 0, 1;"
+   "ori %[res], 0, 0;"
+   "mfspr %[texasr], %[sprn_texasr];"
+
+   "3: ;"
+   : [res] "=r" (result), [texasr] "=r" (texasr)
+   : [fp_load] "r" (fp_load), [fp_load_ckpt] "r" (fp_load_ckpt),
+   [sprn_texasr] "i"  (SPRN_TEXASR)
+   : "memory", "r0", "r1", "r2", "r3", "r4",
+   "r8", "r9", "r10", "r11"
+   );
+
+   if (result) {
+   if (!cptr[0])
+   goto trans;
+   shmdt((void *)cptr);
+
+   storevsx(fp_store, 0);
+   ret = compare_vsx_vmx(fp_store, fp_load_ckpt_new);
+   if (ret)
+   exit(1);
+   exit(0);
+   }
+   shmdt((void *)cptr);
+   exit(1);
+}
+
+int trace_tm_spd_vsx(pid_t child)
+{
+   unsigned long vsx[VSX_MAX];
+   unsigned long vmx[VMX_MAX + 2][2];
+
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_vsx(child, vsx));
+   FAIL_IF(validate_vsx(vsx, fp_load));
+   FAIL_IF(show_vmx(child, vmx));
+   FAIL_IF(validate_vmx(vmx, fp_load));
+   FAIL_IF(show_vsx_ckpt(child, vsx));
+   FAIL_IF(validate_vsx(vsx, fp_load_ckpt));
+   FAIL_IF(show_vmx_ckpt(child, vmx));
+   FAIL_IF(validate_vmx(vmx, fp_load_ckpt));
+
+   memset(vsx, 0, sizeof(vsx));
+   memset(vmx, 0, sizeof(vmx));
+
+   load_vsx_vmx(fp_load_ckpt_new, vsx, vmx);
+
+   FAIL_IF(write_vsx_ckpt(child, vsx));
+   FAIL_IF(write_vmx_ckpt(child, vmx));
+
+   pptr[0] = 1;
+   pptr[1] = 1;
+   FAIL_IF(stop_trace(child));
+
+   return TEST_PASS;
+}
+
+int ptrace_tm_spd_vsx(void)
+{
+   pid_t pid;
+   int ret, status, i;
+
+   SKIP_IF(!have_htm());
+   shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT);
+
+   for (i = 0; i < 128; i++) {
+   fp_load[i] = 1 + rand();
+   fp_load_new[i] = 1 + 2 * rand();

Re: [PATCH v1 00/12] THP migration support

2016-09-29 Thread Zi Yan
>
> Thanks for helping,

:)

>
> I think that you seem to do some testing with these patches on powerpc,
> which shows that thp migration can be enabled relatively easily for
> non-x86_64. This is good news to me.

Right. I did some THP migration tests on both x86_64 and IBM ppc64.

You can use the code here to test THP migration
and compare the migration time between 512 base pages and 1 THP:
https://github.com/x-y-z/thp-migration-bench

A NUMA (or fake NUMA) setup and libnuma are needed, since the benchmark
simply tries to migrate pages from node 0 to node 1.

"make bench" should give you a result like:

THP Migration
Total time: 676.870346 us
Test successful.
---
Base Page Migration
Total time: 2340.078354 us
Test successful.
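
For a rough sense of scale, the two totals above differ by a factor of about
3.5. A minimal sketch of that arithmetic (the function name is only
illustrative and is not part of the benchmark):

```c
/* Ratio of base-page migration time to THP migration time. */
double sketch_migration_speedup(double base_page_us, double thp_us)
{
	/* Guard against a zero or negative THP time. */
	if (thp_us <= 0.0)
		return 0.0;
	return base_page_us / thp_us;
}
```

Feeding in the numbers printed above (2340.078354 us and 676.870346 us) gives
a ratio of roughly 3.46 in favour of migrating one THP.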

>
> And I apologize for my slow development of this patchset.
> My previous post was about 5 months ago, and I've not done ver.2 due to
> many interruptions. Someone also privately asked me about the progress
> of this work, so I promised ver.2 will be posted in a few weeks.
> Your patch 12/12 will come with it.

Looking forward to it. :)

--
Best Regards,
Yan Zi



[PATCH v15 13/15] selftests/powerpc: Add ptrace tests for TM SPR registers

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for TM SPR registers. This
also adds ptrace interface based helper functions related to TM
SPR registers access.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   3 +-
 .../selftests/powerpc/ptrace/ptrace-tm-spr.c   | 168 +
 tools/testing/selftests/powerpc/ptrace/ptrace.h|  35 +
 3 files changed, 204 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index b5b097a..ec2a9b0 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,7 +1,6 @@
 TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
 ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx \
-ptrace-tm-spd-vsx
-
+ptrace-tm-spd-vsx ptrace-tm-spr
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
new file mode 100644
index 000..94e57cb
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
@@ -0,0 +1,168 @@
+/*
+ * Ptrace test TM SPR registers
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "tm.h"
+
+/* Tracee and tracer shared data */
+struct shared {
+   int flag;
+   struct tm_spr_regs regs;
+};
+unsigned long tfhar;
+
+int shm_id;
+struct shared *cptr, *pptr;
+
+int shm_id1;
+int *cptr1, *pptr1;
+
+#define TM_KVM_SCHED   0xe001ac01
+int validate_tm_spr(struct tm_spr_regs *regs)
+{
+   FAIL_IF(regs->tm_tfhar != tfhar);
+   FAIL_IF((regs->tm_texasr == TM_KVM_SCHED) && (regs->tm_tfiar != 0));
+
+   return TEST_PASS;
+}
+
+void tm_spr(void)
+{
+   unsigned long result, texasr;
+   int ret;
+
+   cptr = (struct shared *)shmat(shm_id, NULL, 0);
+   cptr1 = (int *)shmat(shm_id1, NULL, 0);
+
+trans:
+   cptr1[0] = 0;
+   asm __volatile__(
+   "1: ;"
+   /* TM failover handler should follow "tbegin.;" */
+   "mflr 31;"
+   "bl 4f;"/* $ = TFHAR - 12 */
+   "4: ;"
+   "mflr %[tfhar];"
+   "mtlr 31;"
+
+   "tbegin.;"
+   "beq 2f;"
+
+   "tsuspend.;"
+   "li 8, 1;"
+   "sth 8, 0(%[cptr1]);"
+   "tresume.;"
+   "b .;"
+
+   "tend.;"
+   "li 0, 0;"
+   "ori %[res], 0, 0;"
+   "b 3f;"
+
+   "2: ;"
+
+   "li 0, 1;"
+   "ori %[res], 0, 0;"
+   "mfspr %[texasr], %[sprn_texasr];"
+
+   "3: ;"
+   : [tfhar] "=r" (tfhar), [res] "=r" (result),
+   [texasr] "=r" (texasr), [cptr1] "=r" (cptr1)
+   : [sprn_texasr] "i"  (SPRN_TEXASR)
+   : "memory", "r0", "r1", "r2", "r3", "r4",
+   "r8", "r9", "r10", "r11", "r31"
+   );
+
+   /* There are 2 32bit instructions before tbegin. */
+   tfhar += 12;
+
+   if (result) {
+   if (!cptr->flag)
+   goto trans;
+
+   ret = validate_tm_spr((struct tm_spr_regs *)&cptr->regs);
+   shmdt((void *)cptr);
+   shmdt((void *)cptr1);
+   if (ret)
+   exit(1);
+   exit(0);
+   }
+   shmdt((void *)cptr);
+   shmdt((void *)cptr1);
+   exit(1);
+}
+
+int trace_tm_spr(pid_t child)
+{
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_tm_spr(child, (struct tm_spr_regs *)&pptr->regs));
+
+   printf("TFHAR: %lx TEXASR: %lx TFIAR: %lx\n", pptr->regs.tm_tfhar,
+   pptr->regs.tm_texasr, pptr->regs.tm_tfiar);
+
+   pptr->flag = 1;
+   FAIL_IF(stop_trace(child));
+
+   return TEST_PASS;
+}
+
+int ptrace_tm_spr(void)
+{
+   pid_t pid;
+   int ret, status;
+
+   SKIP_IF(!have_htm());
+   shm_id = shmget(IPC_PRIVATE, sizeof(struct shared), 0777|IPC_CREAT);
+   shm_id1 = shmget(IPC_PRIVATE, sizeof(int), 0777|IPC_CREAT);
+   pid = fork();
+   if (pid < 0) {
+   perror("fork() failed");
+   return TEST_FAIL;
+   }
+
+   if (pid == 0)
+   tm_spr();
+
+   if (pid) {
+   pptr = (struct shared *)shmat(shm_id, NULL, 0);
+   pptr1 = (int *)shmat(shm_id1, NULL, 0);
+

Re: [PATCH] mm: exclude isolated non-lru pages from NR_ISOLATED_ANON or NR_ISOLATED_FILE.

2016-09-29 Thread Minchan Kim
Hello,

On Wed, Sep 28, 2016 at 05:31:03PM +0800, ming.ling wrote:
> Non-lru pages don't belong to any lru, so accounting them to
> NR_ISOLATED_ANON or NR_ISOLATED_FILE doesn't make any sense.
> It may misguide functions such as pgdat_reclaimable_pages and
> too_many_isolated.

I agree with this part. It would be even better if you described any problem
you actually suffered from this. Even if you have none, that's okay, because
you are correcting a clearly wrong part. Thanks. :)

> 
> This patch adds NR_ISOLATED_NONLRU to vmstat and moves isolated non-lru
> pages from NR_ISOLATED_ANON or NR_ISOLATED_FILE to NR_ISOLATED_NONLRU.
> And with non-lru pages in vmstat, it helps to optimize algorithm of
> function too_many_isolated oneday.

This needs more justification for adding a new vmstat item, because once we
add it, it's really hard to change or remove it (i.e., a maintenance burden).
So I want to add it when it would really be helpful, not now.

Could you resend the patch without the part adding the new vmstat item?
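
As a rough user-space model of that idea (a sketch only: the struct and
function names are made up, and the two flags merely stand in for PageLRU()
and page_is_file_cache()), non-LRU pages would simply be skipped when the
isolated anon/file counts are computed, rather than getting a third counter:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct page; not the kernel's definition. */
struct sketch_page {
	bool on_lru;		/* models PageLRU(page) */
	bool file_backed;	/* models page_is_file_cache(page) */
};

/*
 * count[0] receives isolated anon pages, count[1] isolated file pages.
 * Non-LRU pages are not accounted at all.
 */
void sketch_acct_isolated(const struct sketch_page *pages, size_t n,
			  unsigned int count[2])
{
	size_t i;

	count[0] = 0;
	count[1] = 0;
	for (i = 0; i < n; i++) {
		if (!pages[i].on_lru)
			continue;
		count[pages[i].file_backed ? 1 : 0]++;
	}
}
```

The same check would be needed wherever the counters are decremented again,
so that increments and decrements stay balanced.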

Thanks.

> 
> Signed-off-by: ming.ling 
> ---
>  include/linux/mmzone.h |  1 +
>  mm/compaction.c| 12 +---
>  mm/migrate.c   | 14 ++
>  3 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 7f2ae99..dc0adba 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -169,6 +169,7 @@ enum node_stat_item {
>   NR_VMSCAN_IMMEDIATE,/* Prioritise for reclaim when writeback ends */
>   NR_DIRTIED, /* page dirtyings since bootup */
>   NR_WRITTEN, /* page writings since bootup */
> + NR_ISOLATED_NONLRU, /* Temporary isolated pages from non-lru */
>   NR_VM_NODE_STAT_ITEMS
>  };
>  
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 9affb29..8da1dca 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -638,16 +638,21 @@ isolate_freepages_range(struct compact_control *cc,
>  static void acct_isolated(struct zone *zone, struct compact_control *cc)
>  {
>   struct page *page;
> - unsigned int count[2] = { 0, };
> + unsigned int count[3] = { 0, };
>  
>   if (list_empty(&cc->migratepages))
>   return;
>  
> - list_for_each_entry(page, &cc->migratepages, lru)
> - count[!!page_is_file_cache(page)]++;
> + list_for_each_entry(page, &cc->migratepages, lru) {
> + if (PageLRU(page))
> + count[!!page_is_file_cache(page)]++;
> + else
> + count[2]++;
> + }
>  
>   mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON, count[0]);
>   mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, count[1]);
> + mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_NONLRU, count[2]);
>  }
>  
>  /* Similar to reclaim, but different enough that they don't share logic */
> @@ -659,6 +664,7 @@ static bool too_many_isolated(struct zone *zone)
>   node_page_state(zone->zone_pgdat, NR_INACTIVE_ANON);
>   active = node_page_state(zone->zone_pgdat, NR_ACTIVE_FILE) +
>   node_page_state(zone->zone_pgdat, NR_ACTIVE_ANON);
> + /* Is it necessary to add NR_ISOLATED_NONLRU?? */
>   isolated = node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE) +
>   node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON);
>  
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f7ee04a..cd5abb2 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -168,8 +168,11 @@ void putback_movable_pages(struct list_head *l)
>   continue;
>   }
>   list_del(&page->lru);
> - dec_node_page_state(page, NR_ISOLATED_ANON +
> - page_is_file_cache(page));
> + if (PageLRU(page))
> + dec_node_page_state(page, NR_ISOLATED_ANON +
> + page_is_file_cache(page));
> + else
> + dec_node_page_state(page, NR_ISOLATED_NONLRU);
>   /*
>* We isolated non-lru movable page so here we can use
>* __PageMovable because LRU page's mapping cannot have
> @@ -1121,8 +1124,11 @@ out:
>* restored.
>*/
>   list_del(&page->lru);
> - dec_node_page_state(page, NR_ISOLATED_ANON +
> - page_is_file_cache(page));
> + if (PageLRU(page))
> + dec_node_page_state(page, NR_ISOLATED_ANON +
> + page_is_file_cache(page));
> + else
> + dec_node_page_state(page, NR_ISOLATED_NONLRU);
>   }
>  
>   /*
> -- 
> 1.9.1


[PATCH v15 14/15] selftests/powerpc: Add .gitignore file for ptrace executables

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds a .gitignore file for all the executables in
the ptrace test directory, thus making them invisible to a git
status query.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/.gitignore | 11 +++
 1 file changed, 11 insertions(+)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/.gitignore

diff --git a/tools/testing/selftests/powerpc/ptrace/.gitignore 
b/tools/testing/selftests/powerpc/ptrace/.gitignore
new file mode 100644
index 000..bdf3566
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/.gitignore
@@ -0,0 +1,11 @@
+ptrace-ebb
+ptrace-gpr
+ptrace-tm-gpr
+ptrace-tm-spd-gpr
+ptrace-tar
+ptrace-tm-tar
+ptrace-tm-spd-tar
+ptrace-vsx
+ptrace-tm-vsx
+ptrace-tm-spd-vsx
+ptrace-tm-spr
-- 
1.8.3.1



[PATCH v15 04/15] selftests/powerpc: Add ptrace tests for GPR/FPR registers

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for GPR/FPR registers.
This adds ptrace interface based helper functions related to
GPR/FPR access and some assembly helper functions related to
GPR/FPR registers.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/math/fpu_asm.S |  42 +---
 tools/testing/selftests/powerpc/ptrace/Makefile|   4 +-
 .../testing/selftests/powerpc/ptrace/ptrace-gpr.c  | 123 
 .../testing/selftests/powerpc/ptrace/ptrace-gpr.h  |  74 
 tools/testing/selftests/powerpc/ptrace/ptrace.h| 211 +
 tools/testing/selftests/powerpc/utility/reg.S  | 132 +
 tools/testing/selftests/powerpc/utility/reg.h  | 101 ++
 7 files changed, 645 insertions(+), 42 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-gpr.h
 create mode 100644 tools/testing/selftests/powerpc/utility/reg.S

diff --git a/tools/testing/selftests/powerpc/math/fpu_asm.S 
b/tools/testing/selftests/powerpc/math/fpu_asm.S
index 8d4eb96..6e423fa 100644
--- a/tools/testing/selftests/powerpc/math/fpu_asm.S
+++ b/tools/testing/selftests/powerpc/math/fpu_asm.S
@@ -8,49 +8,11 @@
  */
 
 #include "basic_asm.h"
-
-#define PUSH_FPU(pos) \
-   stfdf14,pos(sp); \
-   stfdf15,pos+8(sp); \
-   stfdf16,pos+16(sp); \
-   stfdf17,pos+24(sp); \
-   stfdf18,pos+32(sp); \
-   stfdf19,pos+40(sp); \
-   stfdf20,pos+48(sp); \
-   stfdf21,pos+56(sp); \
-   stfdf22,pos+64(sp); \
-   stfdf23,pos+72(sp); \
-   stfdf24,pos+80(sp); \
-   stfdf25,pos+88(sp); \
-   stfdf26,pos+96(sp); \
-   stfdf27,pos+104(sp); \
-   stfdf28,pos+112(sp); \
-   stfdf29,pos+120(sp); \
-   stfdf30,pos+128(sp); \
-   stfdf31,pos+136(sp);
-
-#define POP_FPU(pos) \
-   lfd f14,pos(sp); \
-   lfd f15,pos+8(sp); \
-   lfd f16,pos+16(sp); \
-   lfd f17,pos+24(sp); \
-   lfd f18,pos+32(sp); \
-   lfd f19,pos+40(sp); \
-   lfd f20,pos+48(sp); \
-   lfd f21,pos+56(sp); \
-   lfd f22,pos+64(sp); \
-   lfd f23,pos+72(sp); \
-   lfd f24,pos+80(sp); \
-   lfd f25,pos+88(sp); \
-   lfd f26,pos+96(sp); \
-   lfd f27,pos+104(sp); \
-   lfd f28,pos+112(sp); \
-   lfd f29,pos+120(sp); \
-   lfd f30,pos+128(sp); \
-   lfd f31,pos+136(sp);
+#include "reg.h"
 
 # Careful calling this, it will 'clobber' fpu (by design)
 # Don't call this from C
+# double precision
 FUNC_START(load_fpu)
lfd f14,0(r3)
lfd f15,8(r3)
diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index 84c1c01..e9b8e7d 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,10 +1,10 @@
-TEST_PROGS := ptrace-ebb
+TEST_PROGS := ptrace-ebb ptrace-gpr
 
 include ../../lib.mk
 
 all: $(TEST_PROGS)
 CFLAGS += -m64
-$(TEST_PROGS): ../harness.c ../utility/utils.c ptrace.h
+$(TEST_PROGS): ../harness.c ../utility/reg.S ../utility/utils.c ptrace.h
 ptrace-ebb: ../pmu/event.c ../pmu/lib.c ../pmu/ebb/ebb_handler.S 
../pmu/ebb/busy_loop.S
 ptrace-ebb: CFLAGS += -I../pmu/ebb
 clean:
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c
new file mode 100644
index 000..0b4ebcc
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c
@@ -0,0 +1,123 @@
+/*
+ * Ptrace test for GPR/FPR registers
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "ptrace-gpr.h"
+#include "reg.h"
+
+/* Tracer and Tracee Shared Data */
+int shm_id;
+int *cptr, *pptr;
+
+float a = FPR_1;
+float b = FPR_2;
+float c = FPR_3;
+
+void gpr(void)
+{
+   unsigned long gpr_buf[18];
+   float fpr_buf[32];
+
+   cptr = (int *)shmat(shm_id, NULL, 0);
+
+   asm __volatile__(
+   ASM_LOAD_GPR_IMMED(gpr_1)
+   ASM_LOAD_FPR_SINGLE_PRECISION(flt_1)
+   :
+   : [gpr_1]"i"(GPR_1), [flt_1] "r" (&a)
+   : "memory", "r6", "r7", "r8", "r9", "r10",
+   "r11", "r12", "r13", "r14", "r15", "r16", "r17",
+   "r18", "r19", "r20", "r21", "r22", "r23", "r24",
+   "r25", "r26", "r27", "r28", "r29", "r30", "r31"
+   );
+
+   cptr[1] = 1;
+
+   

[PATCH v15 11/15] selftests/powerpc: Add ptrace tests for VSX, VMX registers in TM

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for VSX, VMX registers
inside TM context. This also adds ptrace interface based helper
functions related to checkpointed VSX, VMX registers access.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   3 +-
 .../selftests/powerpc/ptrace/ptrace-tm-vsx.c   | 168 +
 2 files changed, 170 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tm-vsx.c

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index 9d9f658..a518fbd 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,5 +1,6 @@
 TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
-ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx
+ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx
+
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-vsx.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-vsx.c
new file mode 100644
index 000..b4081e2
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-vsx.c
@@ -0,0 +1,168 @@
+/*
+ * Ptrace test for VMX/VSX registers in the TM context
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "tm.h"
+#include "ptrace-vsx.h"
+
+int shm_id;
+unsigned long *cptr, *pptr;
+
+unsigned long fp_load[VEC_MAX];
+unsigned long fp_store[VEC_MAX];
+unsigned long fp_load_ckpt[VEC_MAX];
+unsigned long fp_load_ckpt_new[VEC_MAX];
+
+__attribute__((used)) void load_vsx(void)
+{
+   loadvsx(fp_load, 0);
+}
+
+__attribute__((used)) void load_vsx_ckpt(void)
+{
+   loadvsx(fp_load_ckpt, 0);
+}
+
+void tm_vsx(void)
+{
+   unsigned long result, texasr;
+   int ret;
+
+   cptr = (unsigned long *)shmat(shm_id, NULL, 0);
+
+trans:
+   cptr[1] = 0;
+   asm __volatile__(
+   "bl load_vsx_ckpt;"
+
+   "1: ;"
+   "tbegin.;"
+   "beq 2f;"
+
+   "bl load_vsx;"
+   "tsuspend.;"
+   "li 7, 1;"
+   "stw 7, 0(%[cptr1]);"
+   "tresume.;"
+   "b .;"
+
+   "tend.;"
+   "li 0, 0;"
+   "ori %[res], 0, 0;"
+   "b 3f;"
+
+   "2: ;"
+   "li 0, 1;"
+   "ori %[res], 0, 0;"
+   "mfspr %[texasr], %[sprn_texasr];"
+
+   "3: ;"
+   : [res] "=r" (result), [texasr] "=r" (texasr)
+   : [fp_load] "r" (fp_load), [fp_load_ckpt] "r" (fp_load_ckpt),
+   [sprn_texasr] "i"  (SPRN_TEXASR), [cptr1] "r" (&cptr[1])
+   : "memory", "r0", "r1", "r2", "r3", "r4",
+   "r7", "r8", "r9", "r10", "r11"
+   );
+
+   if (result) {
+   if (!cptr[0])
+   goto trans;
+
+   shmdt((void *)cptr);
+   storevsx(fp_store, 0);
+   ret = compare_vsx_vmx(fp_store, fp_load_ckpt_new);
+   if (ret)
+   exit(1);
+   exit(0);
+   }
+   shmdt((void *)cptr);
+   exit(1);
+}
+
+int trace_tm_vsx(pid_t child)
+{
+   unsigned long vsx[VSX_MAX];
+   unsigned long vmx[VMX_MAX + 2][2];
+
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_vsx(child, vsx));
+   FAIL_IF(validate_vsx(vsx, fp_load));
+   FAIL_IF(show_vmx(child, vmx));
+   FAIL_IF(validate_vmx(vmx, fp_load));
+   FAIL_IF(show_vsx_ckpt(child, vsx));
+   FAIL_IF(validate_vsx(vsx, fp_load_ckpt));
+   FAIL_IF(show_vmx_ckpt(child, vmx));
+   FAIL_IF(validate_vmx(vmx, fp_load_ckpt));
+   memset(vsx, 0, sizeof(vsx));
+   memset(vmx, 0, sizeof(vmx));
+
+   load_vsx_vmx(fp_load_ckpt_new, vsx, vmx);
+
+   FAIL_IF(write_vsx_ckpt(child, vsx));
+   FAIL_IF(write_vmx_ckpt(child, vmx));
+   pptr[0] = 1;
+   FAIL_IF(stop_trace(child));
+   return TEST_PASS;
+}
+
+int ptrace_tm_vsx(void)
+{
+   pid_t pid;
+   int ret, status, i;
+
+   SKIP_IF(!have_htm());
+   shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0777|IPC_CREAT);
+
+   for (i = 0; i < 128; i++) {
+   fp_load[i] = 1 + rand();
+   fp_load_ckpt[i] = 1 + 2 * rand();
+   fp_load_ckpt_new[i] = 1 + 3 * rand();
+   }
+
+   pid = fork();
+   if (pid < 0) {
+   perror("fork() failed");
+   return TEST_FAIL;
+   }
+
+   if 

[PATCH v15 06/15] selftests/powerpc: Add ptrace tests for GPR/FPR registers in suspended TM

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for GPR/FPR registers
inside suspended TM context.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   2 +-
 .../selftests/powerpc/ptrace/ptrace-tm-spd-gpr.c   | 169 +
 2 files changed, 170 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-gpr.c

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index bb958a8..9f3ed2b 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,4 +1,4 @@
-TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr
+TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-gpr.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-gpr.c
new file mode 100644
index 000..327fa94
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-gpr.c
@@ -0,0 +1,169 @@
+/*
+ * Ptrace test for GPR/FPR registers in TM Suspend context
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "ptrace-gpr.h"
+#include "tm.h"
+
+/* Tracer and Tracee Shared Data */
+int shm_id;
+int *cptr, *pptr;
+
+float a = FPR_1;
+float b = FPR_2;
+float c = FPR_3;
+float d = FPR_4;
+
+__attribute__((used)) void wait_parent(void)
+{
+   cptr[2] = 1;
+   while (!cptr[1])
+   asm volatile("" : : : "memory");
+}
+
+void tm_spd_gpr(void)
+{
+   unsigned long gpr_buf[18];
+   unsigned long result, texasr;
+   float fpr_buf[32];
+
+   cptr = (int *)shmat(shm_id, NULL, 0);
+
+trans:
+   cptr[2] = 0;
+   asm __volatile__(
+   ASM_LOAD_GPR_IMMED(gpr_1)
+   ASM_LOAD_FPR_SINGLE_PRECISION(flt_1)
+
+   "1: ;"
+   "tbegin.;"
+   "beq 2f;"
+
+   ASM_LOAD_GPR_IMMED(gpr_2)
+   "tsuspend.;"
+   ASM_LOAD_GPR_IMMED(gpr_4)
+   ASM_LOAD_FPR_SINGLE_PRECISION(flt_4)
+
+   "bl wait_parent;"
+   "tresume.;"
+   "tend.;"
+   "li 0, 0;"
+   "ori %[res], 0, 0;"
+   "b 3f;"
+
+   /* Transaction abort handler */
+   "2: ;"
+   "li 0, 1;"
+   "ori %[res], 0, 0;"
+   "mfspr %[texasr], %[sprn_texasr];"
+
+   "3: ;"
+   : [res] "=r" (result), [texasr] "=r" (texasr)
+   : [gpr_1]"i"(GPR_1), [gpr_2]"i"(GPR_2), [gpr_4]"i"(GPR_4),
+   [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "r" (&a),
+   [flt_2] "r" (&b), [flt_4] "r" (&d)
+   : "memory", "r5", "r6", "r7",
+   "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
+   "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23",
+   "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31"
+   );
+
+   if (result) {
+   if (!cptr[0])
+   goto trans;
+
+   shmdt((void *)cptr);
+   store_gpr(gpr_buf);
+   store_fpr_single_precision(fpr_buf);
+
+   if (validate_gpr(gpr_buf, GPR_3))
+   exit(1);
+
+   if (validate_fpr_float(fpr_buf, c))
+   exit(1);
+   exit(0);
+   }
+   shmdt((void *)cptr);
+   exit(1);
+}
+
+int trace_tm_spd_gpr(pid_t child)
+{
+   unsigned long gpr[18];
+   unsigned long fpr[32];
+
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_gpr(child, gpr));
+   FAIL_IF(validate_gpr(gpr, GPR_4));
+   FAIL_IF(show_fpr(child, fpr));
+   FAIL_IF(validate_fpr(fpr, FPR_4_REP));
+   FAIL_IF(show_ckpt_fpr(child, fpr));
+   FAIL_IF(validate_fpr(fpr, FPR_1_REP));
+   FAIL_IF(show_ckpt_gpr(child, gpr));
+   FAIL_IF(validate_gpr(gpr, GPR_1));
+   FAIL_IF(write_ckpt_gpr(child, GPR_3));
+   FAIL_IF(write_ckpt_fpr(child, FPR_3_REP));
+
+   pptr[0] = 1;
+   pptr[1] = 1;
+   FAIL_IF(stop_trace(child));
+   return TEST_PASS;
+}
+
+int ptrace_tm_spd_gpr(void)
+{
+   pid_t pid;
+   int ret, status;
+
+   SKIP_IF(!have_htm());
+   shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT);
+   pid = fork();
+   if (pid < 0) {
+   perror("fork() failed");
+   return TEST_FAIL;
+   }
+
+   if (pid == 0)
+   tm_spd_gpr();
+
+   

[PATCH v15 05/15] selftests/powerpc: Add ptrace tests for GPR/FPR registers in TM

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for GPR/FPR registers
inside TM context. This adds ptrace interface based helper
functions related to checkpointed GPR/FPR access.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   5 +-
 .../selftests/powerpc/ptrace/ptrace-tm-gpr.c   | 158 +
 2 files changed, 161 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index e9b8e7d..bb958a8 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,11 +1,12 @@
-TEST_PROGS := ptrace-ebb ptrace-gpr
+TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr
 
 include ../../lib.mk
 
 all: $(TEST_PROGS)
-CFLAGS += -m64
+CFLAGS += -m64 -I../tm -mhtm
 $(TEST_PROGS): ../harness.c ../utility/reg.S ../utility/utils.c ptrace.h
 ptrace-ebb: ../pmu/event.c ../pmu/lib.c ../pmu/ebb/ebb_handler.S 
../pmu/ebb/busy_loop.S
 ptrace-ebb: CFLAGS += -I../pmu/ebb
+
 clean:
rm -f $(TEST_PROGS) *.o
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c
new file mode 100644
index 000..59206b9
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c
@@ -0,0 +1,158 @@
+/*
+ * Ptrace test for GPR/FPR registers in TM context
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "ptrace-gpr.h"
+#include "tm.h"
+
+/* Tracer and Tracee Shared Data */
+int shm_id;
+unsigned long *cptr, *pptr;
+
+float a = FPR_1;
+float b = FPR_2;
+float c = FPR_3;
+
+void tm_gpr(void)
+{
+   unsigned long gpr_buf[18];
+   unsigned long result, texasr;
+   float fpr_buf[32];
+
+   printf("Starting the child\n");
+   cptr = (unsigned long *)shmat(shm_id, NULL, 0);
+
+trans:
+   cptr[1] = 0;
+   asm __volatile__(
+   ASM_LOAD_GPR_IMMED(gpr_1)
+   ASM_LOAD_FPR_SINGLE_PRECISION(flt_1)
+   "1: ;"
+   "tbegin.;"
+   "beq 2f;"
+   ASM_LOAD_GPR_IMMED(gpr_2)
+   ASM_LOAD_FPR_SINGLE_PRECISION(flt_2)
+   "tsuspend.;"
+   "li 7, 1;"
+   "stw 7, 0(%[cptr1]);"
+   "tresume.;"
+   "b .;"
+
+   "tend.;"
+   "li 0, 0;"
+   "ori %[res], 0, 0;"
+   "b 3f;"
+
+   /* Transaction abort handler */
+   "2: ;"
+   "li 0, 1;"
+   "ori %[res], 0, 0;"
+   "mfspr %[texasr], %[sprn_texasr];"
+
+   "3: ;"
+   : [res] "=r" (result), [texasr] "=r" (texasr)
+   : [gpr_1]"i"(GPR_1), [gpr_2]"i"(GPR_2),
+   [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "r" (&a),
+   [flt_2] "r" (&b), [cptr1] "r" (&cptr[1])
+   : "memory", "r7", "r8", "r9", "r10",
+   "r11", "r12", "r13", "r14", "r15", "r16",
+   "r17", "r18", "r19", "r20", "r21", "r22",
+   "r23", "r24", "r25", "r26", "r27", "r28",
+   "r29", "r30", "r31"
+   );
+
+   if (result) {
+   if (!cptr[0])
+   goto trans;
+
+   shmdt((void *)cptr);
+   store_gpr(gpr_buf);
+   store_fpr_single_precision(fpr_buf);
+
+   if (validate_gpr(gpr_buf, GPR_3))
+   exit(1);
+
+   if (validate_fpr_float(fpr_buf, c))
+   exit(1);
+
+   exit(0);
+   }
+   shmdt((void *)cptr);
+   exit(1);
+}
+
+int trace_tm_gpr(pid_t child)
+{
+   unsigned long gpr[18];
+   unsigned long fpr[32];
+
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_gpr(child, gpr));
+   FAIL_IF(validate_gpr(gpr, GPR_2));
+   FAIL_IF(show_fpr(child, fpr));
+   FAIL_IF(validate_fpr(fpr, FPR_2_REP));
+   FAIL_IF(show_ckpt_fpr(child, fpr));
+   FAIL_IF(validate_fpr(fpr, FPR_1_REP));
+   FAIL_IF(show_ckpt_gpr(child, gpr));
+   FAIL_IF(validate_gpr(gpr, GPR_1));
+   FAIL_IF(write_ckpt_gpr(child, GPR_3));
+   FAIL_IF(write_ckpt_fpr(child, FPR_3_REP));
+
+   pptr[0] = 1;
+   FAIL_IF(stop_trace(child));
+
+   return TEST_PASS;
+}
+
+int ptrace_tm_gpr(void)
+{
+   pid_t pid;
+   int ret, status;
+
+   SKIP_IF(!have_htm());
+   shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 

[PATCH v15 08/15] selftests/powerpc: Add ptrace tests for TAR, PPR, DSCR in TM

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for TAR, PPR, DSCR
registers inside TM context. This also adds ptrace
interface based helper functions related to checkpointed
TAR, PPR, DSCR register access.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   2 +-
 .../selftests/powerpc/ptrace/ptrace-tm-tar.c   | 160 +
 2 files changed, 161 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar.c

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index dfb0847..9af9ad5 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,5 +1,5 @@
 TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
-ptrace-tar
+ptrace-tar ptrace-tm-tar
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar.c
new file mode 100644
index 000..48b462f
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar.c
@@ -0,0 +1,160 @@
+/*
+ * Ptrace test for TAR, PPR, DSCR registers in the TM context
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "tm.h"
+#include "ptrace-tar.h"
+
+int shm_id;
+unsigned long *cptr, *pptr;
+
+
+void tm_tar(void)
+{
+   unsigned long result, texasr;
+   unsigned long regs[3];
+   int ret;
+
+   cptr = (unsigned long *)shmat(shm_id, NULL, 0);
+
+trans:
+   cptr[1] = 0;
+   asm __volatile__(
+   "li 4, %[tar_1];"
+   "mtspr %[sprn_tar],  4;"/* TAR_1 */
+   "li 4, %[dscr_1];"
+   "mtspr %[sprn_dscr], 4;"/* DSCR_1 */
+   "or 31,31,31;"  /* PPR_1*/
+
+   "1: ;"
+   "tbegin.;"
+   "beq 2f;"
+
+   "li 4, %[tar_2];"
+   "mtspr %[sprn_tar],  4;"/* TAR_2 */
+   "li 4, %[dscr_2];"
+   "mtspr %[sprn_dscr], 4;"/* DSCR_2 */
+   "or 1,1,1;" /* PPR_2 */
+   "tsuspend.;"
+   "li 0, 1;"
+   "stw 0, 0(%[cptr1]);"
+   "tresume.;"
+   "b .;"
+
+   "tend.;"
+   "li 0, 0;"
+   "ori %[res], 0, 0;"
+   "b 3f;"
+
+   /* Transaction abort handler */
+   "2: ;"
+   "li 0, 1;"
+   "ori %[res], 0, 0;"
+   "mfspr %[texasr], %[sprn_texasr];"
+
+   "3: ;"
+
+   : [res] "=r" (result), [texasr] "=r" (texasr)
+   : [sprn_dscr]"i"(SPRN_DSCR), [sprn_tar]"i"(SPRN_TAR),
+   [sprn_ppr]"i"(SPRN_PPR), [sprn_texasr]"i"(SPRN_TEXASR),
+   [tar_1]"i"(TAR_1), [dscr_1]"i"(DSCR_1), [tar_2]"i"(TAR_2),
+   [dscr_2]"i"(DSCR_2), [cptr1] "r" (&cptr[1])
+   : "memory", "r0", "r1", "r3", "r4", "r5", "r6"
+   );
+
+   /* TM failed, analyse */
+   if (result) {
+   if (!cptr[0])
+   goto trans;
+
+   regs[0] = mfspr(SPRN_TAR);
+   regs[1] = mfspr(SPRN_PPR);
+   regs[2] = mfspr(SPRN_DSCR);
+
+   shmdt();
+   printf("%-30s TAR: %lu PPR: %lx DSCR: %lu\n",
+   user_read, regs[0], regs[1], regs[2]);
+
+   ret = validate_tar_registers(regs, TAR_4, PPR_4, DSCR_4);
+   if (ret)
+   exit(1);
+   exit(0);
+   }
+   shmdt();
+   exit(1);
+}
+
+int trace_tm_tar(pid_t child)
+{
+   unsigned long regs[3];
+
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_tar_registers(child, regs));
+   printf("%-30s TAR: %lu PPR: %lx DSCR: %lu\n",
+   ptrace_read_running, regs[0], regs[1], regs[2]);
+
+   FAIL_IF(validate_tar_registers(regs, TAR_2, PPR_2, DSCR_2));
+   FAIL_IF(show_tm_checkpointed_state(child, regs));
+   printf("%-30s TAR: %lu PPR: %lx DSCR: %lu\n",
+   ptrace_read_ckpt, regs[0], regs[1], regs[2]);
+
+   FAIL_IF(validate_tar_registers(regs, TAR_1, PPR_1, DSCR_1));
+   FAIL_IF(write_ckpt_tar_registers(child, TAR_4, PPR_4, DSCR_4));
+   printf("%-30s TAR: %u PPR: %lx DSCR: %u\n",
+   ptrace_write_ckpt, TAR_4, PPR_4, DSCR_4);
+
+   pptr[0] = 1;
+   FAIL_IF(stop_trace(child));
+ 

[PATCH v15 10/15] selftests/powerpc: Add ptrace tests for VSX, VMX registers

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for VSX, VMX registers.
This also adds ptrace interface based helper functions related
to VSX, VMX registers access. This also adds some assembly
helper functions related to VSX and VMX registers.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   2 +-
 .../testing/selftests/powerpc/ptrace/ptrace-vsx.c  | 117 +
 .../testing/selftests/powerpc/ptrace/ptrace-vsx.h  | 127 ++
 tools/testing/selftests/powerpc/ptrace/ptrace.h| 119 +
 tools/testing/selftests/powerpc/utility/reg.S  | 265 +
 5 files changed, 629 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-vsx.c
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-vsx.h

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index 19e4a7c..9d9f658 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,5 +1,5 @@
 TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
-ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar
+ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-vsx.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-vsx.c
new file mode 100644
index 000..04084ee
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-vsx.c
@@ -0,0 +1,117 @@
+/*
+ * Ptrace test for VMX/VSX registers
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "ptrace-vsx.h"
+
+/* Tracer and Tracee Shared Data */
+int shm_id;
+int *cptr, *pptr;
+
+unsigned long fp_load[VEC_MAX];
+unsigned long fp_load_new[VEC_MAX];
+unsigned long fp_store[VEC_MAX];
+
+void vsx(void)
+{
+   int ret;
+
+   cptr = (int *)shmat(shm_id, NULL, 0);
+   loadvsx(fp_load, 0);
+   cptr[1] = 1;
+
+   while (!cptr[0])
+   asm volatile("" : : : "memory");
+   shmdt((void *) cptr);
+
+   storevsx(fp_store, 0);
+   ret = compare_vsx_vmx(fp_store, fp_load_new);
+   if (ret)
+   exit(1);
+   exit(0);
+}
+
+int trace_vsx(pid_t child)
+{
+   unsigned long vsx[VSX_MAX];
+   unsigned long vmx[VMX_MAX + 2][2];
+
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_vsx(child, vsx));
+   FAIL_IF(validate_vsx(vsx, fp_load));
+   FAIL_IF(show_vmx(child, vmx));
+   FAIL_IF(validate_vmx(vmx, fp_load));
+
+   memset(vsx, 0, sizeof(vsx));
+   memset(vmx, 0, sizeof(vmx));
+   load_vsx_vmx(fp_load_new, vsx, vmx);
+
+   FAIL_IF(write_vsx(child, vsx));
+   FAIL_IF(write_vmx(child, vmx));
+   FAIL_IF(stop_trace(child));
+
+   return TEST_PASS;
+}
+
+int ptrace_vsx(void)
+{
+   pid_t pid;
+   int ret, status, i;
+
+   shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0777|IPC_CREAT);
+
+   for (i = 0; i < VEC_MAX; i++)
+   fp_load[i] = i + rand();
+
+   for (i = 0; i < VEC_MAX; i++)
+   fp_load_new[i] = i + 2 * rand();
+
+   pid = fork();
+   if (pid < 0) {
+   perror("fork() failed");
+   return TEST_FAIL;
+   }
+
+   if (pid == 0)
+   vsx();
+
+   if (pid) {
+   pptr = (int *)shmat(shm_id, NULL, 0);
+   while (!pptr[1])
+   asm volatile("" : : : "memory");
+
+   ret = trace_vsx(pid);
+   if (ret) {
+   kill(pid, SIGTERM);
+   shmdt((void *)pptr);
+   shmctl(shm_id, IPC_RMID, NULL);
+   return TEST_FAIL;
+   }
+
+   pptr[0] = 1;
+   shmdt((void *)pptr);
+
+   ret = wait(&status);
+   shmctl(shm_id, IPC_RMID, NULL);
+   if (ret != pid) {
+   printf("Child's exit status not captured\n");
+   return TEST_FAIL;
+   }
+
+   return (WIFEXITED(status) && WEXITSTATUS(status)) ? TEST_FAIL :
+   TEST_PASS;
+   }
+   return TEST_PASS;
+}
+
+int main(int argc, char *argv[])
+{
+   return test_harness(ptrace_vsx, "ptrace_vsx");
+}
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-vsx.h 
b/tools/testing/selftests/powerpc/ptrace/ptrace-vsx.h
new file mode 100644
index 000..f4e4b42
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-vsx.h
@@ -0,0 +1,127 @@
+/*
+ 

[PATCH v15 07/15] selftests/powerpc: Add ptrace tests for TAR, PPR, DSCR registers

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for TAR, PPR, DSCR
registers. This also adds ptrace interface based helper
functions related to TAR, PPR, DSCR register access.
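
For readers following the thread, the tests in this series synchronise
tracer and tracee through flags in a SysV shared memory segment. The
block below is a minimal sketch of that handshake only, with no ptrace
calls. The function name run_handshake() and the two-flag layout are
assumptions made for illustration and are not taken from the patch,
which uses its own flag indices and helpers.

```c
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch.
 * flags[1]: set by the child once it is ready to be inspected.
 * flags[0]: set by the parent to let the child continue and exit.
 * Returns the child's exit status, or -1 on any setup failure.
 */
int run_handshake(void)
{
	volatile int *flags;
	int shm_id, status = 0;
	pid_t pid, ret;
	void *p;

	/* New SysV segments are zero-filled, so both flags start at 0. */
	shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0600 | IPC_CREAT);
	if (shm_id < 0)
		return -1;

	/* Attach before fork() so the child inherits the mapping. */
	p = shmat(shm_id, NULL, 0);
	if (p == (void *)-1) {
		shmctl(shm_id, IPC_RMID, NULL);
		return -1;
	}
	flags = p;

	pid = fork();
	if (pid < 0) {
		shmdt(p);
		shmctl(shm_id, IPC_RMID, NULL);
		return -1;
	}

	if (pid == 0) {
		flags[1] = 1;		/* tell the parent we are ready */
		while (!flags[0])	/* spin until the parent releases us */
			;
		shmdt(p);
		_exit(0);
	}

	while (!flags[1])		/* wait for the child to be ready */
		;
	/* A tracer would attach and inspect registers at this point. */
	flags[0] = 1;			/* unblock the child */
	shmdt(p);

	ret = waitpid(pid, &status, 0);
	shmctl(shm_id, IPC_RMID, NULL);
	if (ret != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling run_handshake() from a test body and treating a non-zero
return as failure mirrors how the tests in this series map the child's
exit status to a test result.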

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/ptrace/Makefile|   3 +-
 .../testing/selftests/powerpc/ptrace/ptrace-tar.c  | 135 +++
 .../testing/selftests/powerpc/ptrace/ptrace-tar.h  |  50 ++
 tools/testing/selftests/powerpc/ptrace/ptrace.h| 181 +
 4 files changed, 368 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tar.c
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tar.h

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index 9f3ed2b..dfb0847 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,4 +1,5 @@
-TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr
+TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
+ptrace-tar
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tar.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tar.c
new file mode 100644
index 000..f9b5069
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tar.c
@@ -0,0 +1,135 @@
+/*
+ * Ptrace test for TAR, PPR, DSCR registers
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ptrace.h"
+#include "ptrace-tar.h"
+
+/* Tracer and Tracee Shared Data */
+int shm_id;
+int *cptr;
+int *pptr;
+
+void tar(void)
+{
+   unsigned long reg[3];
+   int ret;
+
+   cptr = (int *)shmat(shm_id, NULL, 0);
+   printf("%-30s TAR: %u PPR: %lx DSCR: %u\n",
+   user_write, TAR_1, PPR_1, DSCR_1);
+
+   mtspr(SPRN_TAR, TAR_1);
+   mtspr(SPRN_PPR, PPR_1);
+   mtspr(SPRN_DSCR, DSCR_1);
+
+   cptr[2] = 1;
+
+   /* Wait on parent */
+   while (!cptr[0])
+   asm volatile("" : : : "memory");
+
+   reg[0] = mfspr(SPRN_TAR);
+   reg[1] = mfspr(SPRN_PPR);
+   reg[2] = mfspr(SPRN_DSCR);
+
+   printf("%-30s TAR: %lu PPR: %lx DSCR: %lu\n",
+   user_read, reg[0], reg[1], reg[2]);
+
+   /* Unblock the parent now */
+   cptr[1] = 1;
+   shmdt((int *)cptr);
+
+   ret = validate_tar_registers(reg, TAR_2, PPR_2, DSCR_2);
+   if (ret)
+   exit(1);
+   exit(0);
+}
+
+int trace_tar(pid_t child)
+{
+   unsigned long reg[3];
+
+   FAIL_IF(start_trace(child));
+   FAIL_IF(show_tar_registers(child, reg));
+   printf("%-30s TAR: %lu PPR: %lx DSCR: %lu\n",
+   ptrace_read_running, reg[0], reg[1], reg[2]);
+
+   FAIL_IF(validate_tar_registers(reg, TAR_1, PPR_1, DSCR_1));
+   FAIL_IF(stop_trace(child));
+   return TEST_PASS;
+}
+
+int trace_tar_write(pid_t child)
+{
+   FAIL_IF(start_trace(child));
+   FAIL_IF(write_tar_registers(child, TAR_2, PPR_2, DSCR_2));
+   printf("%-30s TAR: %u PPR: %lx DSCR: %u\n",
+   ptrace_write_running, TAR_2, PPR_2, DSCR_2);
+
+   FAIL_IF(stop_trace(child));
+   return TEST_PASS;
+}
+
+int ptrace_tar(void)
+{
+   pid_t pid;
+   int ret, status;
+
+   shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT);
+   pid = fork();
+   if (pid < 0) {
+   perror("fork() failed");
+   return TEST_FAIL;
+   }
+
+   if (pid == 0)
+   tar();
+
+   if (pid) {
+   pptr = (int *)shmat(shm_id, NULL, 0);
+   pptr[0] = 0;
+   pptr[1] = 0;
+
+   while (!pptr[2])
+   asm volatile("" : : : "memory");
+   ret = trace_tar(pid);
+   if (ret)
+   return ret;
+
+   ret = trace_tar_write(pid);
+   if (ret)
+   return ret;
+
+   /* Unblock the child now */
+   pptr[0] = 1;
+
+   /* Wait on child */
+   while (!pptr[1])
+   asm volatile("" : : : "memory");
+
+   shmdt((int *)pptr);
+
+   ret = wait(&status);
+   shmctl(shm_id, IPC_RMID, NULL);
+   if (ret != pid) {
+   printf("Child's exit status not captured\n");
+   return TEST_PASS;
+   }
+
+   return (WIFEXITED(status) && WEXITSTATUS(status)) ? TEST_FAIL :
+   

[PATCH v15 02/15] selftests/powerpc: move shared utility files into new utility/ dir

2016-09-29 Thread wei . guo . simon
From: Simon Guo 

There are some functions, especially register related, which can
be shared across multiple selftests/powerpc test directories.

This patch creates a new utility directory to store those shared
functionality, so that the file layout becomes neater.

Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/Makefile   |  2 +-
 tools/testing/selftests/powerpc/alignment/Makefile |  2 +-
 tools/testing/selftests/powerpc/basic_asm.h| 70 -
 .../testing/selftests/powerpc/benchmarks/Makefile  |  2 +-
 .../selftests/powerpc/benchmarks/context_switch.c  |  2 +-
 .../selftests/powerpc/context_switch/Makefile  |  2 +-
 .../testing/selftests/powerpc/copyloops/validate.c |  2 +-
 tools/testing/selftests/powerpc/instructions.h | 68 -
 tools/testing/selftests/powerpc/math/fpu_asm.S |  2 +-
 tools/testing/selftests/powerpc/math/vmx_asm.S |  2 +-
 tools/testing/selftests/powerpc/mm/Makefile|  2 +-
 tools/testing/selftests/powerpc/pmu/Makefile   |  4 +-
 tools/testing/selftests/powerpc/pmu/ebb/Makefile   |  2 +-
 tools/testing/selftests/powerpc/reg.h  | 89 --
 .../testing/selftests/powerpc/stringloops/memcmp.c |  2 +-
 tools/testing/selftests/powerpc/tm/Makefile|  2 +-
 tools/testing/selftests/powerpc/tm/tm.h|  2 +-
 .../testing/selftests/powerpc/utility/basic_asm.h  | 73 ++
 .../selftests/powerpc/utility/instructions.h   | 68 +
 tools/testing/selftests/powerpc/utility/reg.h  | 89 ++
 tools/testing/selftests/powerpc/utility/utils.c| 87 +
 tools/testing/selftests/powerpc/utility/utils.h| 70 +
 tools/testing/selftests/powerpc/utils.c| 87 -
 tools/testing/selftests/powerpc/utils.h| 70 -
 24 files changed, 402 insertions(+), 399 deletions(-)
 delete mode 100644 tools/testing/selftests/powerpc/basic_asm.h
 delete mode 100644 tools/testing/selftests/powerpc/instructions.h
 delete mode 100644 tools/testing/selftests/powerpc/reg.h
 create mode 100644 tools/testing/selftests/powerpc/utility/basic_asm.h
 create mode 100644 tools/testing/selftests/powerpc/utility/instructions.h
 create mode 100644 tools/testing/selftests/powerpc/utility/reg.h
 create mode 100644 tools/testing/selftests/powerpc/utility/utils.c
 create mode 100644 tools/testing/selftests/powerpc/utility/utils.h
 delete mode 100644 tools/testing/selftests/powerpc/utils.c
 delete mode 100644 tools/testing/selftests/powerpc/utils.h

diff --git a/tools/testing/selftests/powerpc/Makefile 
b/tools/testing/selftests/powerpc/Makefile
index 1cc6d64..b6eb817 100644
--- a/tools/testing/selftests/powerpc/Makefile
+++ b/tools/testing/selftests/powerpc/Makefile
@@ -8,7 +8,7 @@ ifeq ($(ARCH),powerpc)
 
 GIT_VERSION = $(shell git describe --always --long --dirty || echo "unknown")
 
-CFLAGS := -std=gnu99 -Wall -O2 -Wall -Werror -DGIT_VERSION='"$(GIT_VERSION)"' 
-I$(CURDIR) $(CFLAGS)
+CFLAGS := -std=gnu99 -Wall -O2 -Wall -Werror -DGIT_VERSION='"$(GIT_VERSION)"' 
-I$(CURDIR) -I$(CURDIR)/utility $(CFLAGS)
 
 export CFLAGS
 
diff --git a/tools/testing/selftests/powerpc/alignment/Makefile 
b/tools/testing/selftests/powerpc/alignment/Makefile
index ad6a4e4..b61e5e7 100644
--- a/tools/testing/selftests/powerpc/alignment/Makefile
+++ b/tools/testing/selftests/powerpc/alignment/Makefile
@@ -2,7 +2,7 @@ TEST_PROGS := copy_unaligned copy_first_unaligned 
paste_unaligned paste_last_una
 
 all: $(TEST_PROGS)
 
-$(TEST_PROGS): ../harness.c ../utils.c copy_paste_unaligned_common.c
+$(TEST_PROGS): ../harness.c ../utility/utils.c copy_paste_unaligned_common.c
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/basic_asm.h 
b/tools/testing/selftests/powerpc/basic_asm.h
deleted file mode 100644
index 3349a07..000
--- a/tools/testing/selftests/powerpc/basic_asm.h
+++ /dev/null
@@ -1,70 +0,0 @@
-#ifndef _SELFTESTS_POWERPC_BASIC_ASM_H
-#define _SELFTESTS_POWERPC_BASIC_ASM_H
-
-#include 
-#include 
-
-#define LOAD_REG_IMMEDIATE(reg,expr) \
-   lis reg,(expr)@highest; \
-   ori reg,reg,(expr)@higher;  \
-   rldicr  reg,reg,32,31;  \
-   orisreg,reg,(expr)@high;\
-   ori reg,reg,(expr)@l;
-
-/*
- * Note: These macros assume that variables being stored on the stack are
- * doublewords, while this is usually the case it may not always be the
- * case for each use case.
- */
-#if defined(_CALL_ELF) && _CALL_ELF == 2
-#define STACK_FRAME_MIN_SIZE 32
-#define STACK_FRAME_TOC_POS  24
-#define __STACK_FRAME_PARAM(_param)  (32 + ((_param)*8))
-#define __STACK_FRAME_LOCAL(_num_params,_var_num)  
((STACK_FRAME_PARAM(_num_params)) + ((_var_num)*8))
-#else
-#define STACK_FRAME_MIN_SIZE 112
-#define STACK_FRAME_TOC_POS  40
-#define __STACK_FRAME_PARAM(i)  (48 + ((i)*8))
-
-/*
- * Caveat: if a 

[PATCH v15 03/15] selftests/powerpc: Add ptrace tests for EBB

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds ptrace interface test for EBB/PMU specific
registers. This also adds some generic ptrace interface
based helper functions to be used by other patches later
on in the series.
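
As a rough illustration of what a generic attach/detach pair built on
the ptrace interface can look like, here is a minimal sketch. It
assumes a Linux host that permits tracing a direct child. The function
name attach_and_detach_once() is hypothetical; the series' own
start_trace() and stop_trace() helpers are defined in ptrace.h and may
be implemented differently.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only, not taken from the patch.
 * Returns 0 if the child was attached, seen stopped, and detached.
 */
int attach_and_detach_once(void)
{
	pid_t child;
	int status, ret = -1;

	child = fork();
	if (child < 0)
		return -1;

	if (child == 0) {
		/* Tracee: idle until the tracer kills it. */
		for (;;)
			pause();
	}

	/* PTRACE_ATTACH sends SIGSTOP; wait until the child has stopped. */
	if (ptrace(PTRACE_ATTACH, child, NULL, NULL) == 0) {
		if (waitpid(child, &status, 0) == child && WIFSTOPPED(status)) {
			/* Register reads and writes would go here. */
			if (ptrace(PTRACE_DETACH, child, NULL, NULL) == 0)
				ret = 0;
		}
	}

	/* Clean up the tracee regardless of the outcome. */
	kill(child, SIGKILL);
	waitpid(child, &status, 0);
	return ret;
}
```

The register-specific tests would slot their read, validate, and write
steps between the stop and the detach.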

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/Makefile   |   3 +-
 tools/testing/selftests/powerpc/ptrace/Makefile|  11 +
 .../testing/selftests/powerpc/ptrace/ptrace-ebb.c  | 187 +
 .../testing/selftests/powerpc/ptrace/ptrace-ebb.h  |  99 +
 tools/testing/selftests/powerpc/ptrace/ptrace.h| 225 +
 5 files changed, 524 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/Makefile
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-ebb.c
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-ebb.h
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace.h

diff --git a/tools/testing/selftests/powerpc/Makefile 
b/tools/testing/selftests/powerpc/Makefile
index b6eb817..2fe383c 100644
--- a/tools/testing/selftests/powerpc/Makefile
+++ b/tools/testing/selftests/powerpc/Makefile
@@ -25,7 +25,8 @@ SUB_DIRS = alignment  \
   syscalls \
   tm   \
   vphn \
-  math
+  math \
+  ptrace
 
 endif
 
diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
new file mode 100644
index 000..84c1c01
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -0,0 +1,11 @@
+TEST_PROGS := ptrace-ebb
+
+include ../../lib.mk
+
+all: $(TEST_PROGS)
+CFLAGS += -m64
+$(TEST_PROGS): ../harness.c ../utility/utils.c ptrace.h
+ptrace-ebb: ../pmu/event.c ../pmu/lib.c ../pmu/ebb/ebb_handler.S 
../pmu/ebb/busy_loop.S
+ptrace-ebb: CFLAGS += -I../pmu/ebb
+clean:
+   rm -f $(TEST_PROGS) *.o
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-ebb.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-ebb.c
new file mode 100644
index 000..1ec4a6b
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-ebb.c
@@ -0,0 +1,187 @@
+/*
+ * Ptrace interface test for EBB
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include "ebb.h"
+#include "ptrace.h"
+#include "ptrace-ebb.h"
+
+/* Tracer and Tracee Shared Data */
+int shm_id;
+int *cptr, *pptr;
+
+void ebb(void)
+{
+   struct event event;
+
+   cptr = (int *)shmat(shm_id, NULL, 0);
+
+   event_init_named(&event, 0x1001e, "cycles");
+   event.attr.config |= (1ull << 63);
+   event.attr.exclusive = 1;
+   event.attr.pinned = 1;
+   event.attr.exclude_kernel = 1;
+   event.attr.exclude_hv = 1;
+   event.attr.exclude_idle = 1;
+
+   if (event_open(&event)) {
+   perror("event_open() failed");
+   exit(1);
+   }
+
+   setup_ebb_handler(standard_ebb_callee);
+   mtspr(SPRN_BESCR, 0x8001ull);
+
+   /*
+    * Make sure BESCR has been set before continuing.
+    */
+   mb();
+
+   if (ebb_event_enable(&event)) {
+   perror("ebb_event_handler() failed");
+   exit(1);
+   }
+
+   mtspr(SPRN_PMC1, pmc_sample_period(SAMPLE_PERIOD));
+   core_busy_loop();
+   cptr[0] = 1;
+   while (1)
+   asm volatile("" : : : "memory");
+
+   exit(0);
+}
+
+int validate_ebb(struct ebb_regs *regs)
+{
+   #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   struct opd *opd = (struct opd *) ebb_handler;
+   #endif
+
+   printf("EBBRR: %lx\n", regs->ebbrr);
+   #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   printf("EBBHR: %lx; expected: %lx\n",
+   regs->ebbhr, (unsigned long)opd->entry);
+   #else
+   printf("EBBHR: %lx; expected: %lx\n",
+   regs->ebbhr, (unsigned long)ebb_handler);
+   #endif
+   printf("BESCR: %lx\n", regs->bescr);
+
+   #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   if (regs->ebbhr != opd->entry)
+   return TEST_FAIL;
+   #else
+   if (regs->ebbhr != (unsigned long) ebb_handler)
+   return TEST_FAIL;
+   #endif
+
+   return TEST_PASS;
+}
+
+int validate_pmu(struct pmu_regs *regs)
+{
+   printf("SIAR:  %lx\n", regs->siar);
+   printf("SDAR:  %lx\n", regs->sdar);
+   printf("SIER:  %lx; expected: %lx\n",
+   regs->sier, (unsigned long)SIER_EXP);
+   printf("MMCR2: %lx; expected: %lx\n",
+   regs->mmcr2, (unsigned long)MMCR2_EXP);
+   printf("MMCR0: %lx; 

[PATCH v15 01/15] selftests/powerpc: Add more SPR numbers, TM & VMX instructions to 'reg.h'/'instructions.h'

2016-09-29 Thread wei . guo . simon
From: Anshuman Khandual 

This patch adds SPR numbers for the TAR, PPR and DSCR special
purpose registers. It also adds TM, VSX and VMX related
instructions which will then be used by patches later
in the series.

Now that the new DSCR register definitions (SPRN_DSCR_PRIV and
SPRN_DSCR) are defined outside this directory, use them instead.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 
---
 tools/testing/selftests/powerpc/dscr/dscr.h | 10 -
 tools/testing/selftests/powerpc/reg.h   | 35 ++---
 2 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/powerpc/dscr/dscr.h b/tools/testing/selftests/powerpc/dscr/dscr.h
index a36af1b..18ea223b 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr.h
+++ b/tools/testing/selftests/powerpc/dscr/dscr.h
@@ -28,8 +28,6 @@
 
 #include "utils.h"
 
-#define SPRN_DSCR  0x11/* Privilege state SPR */
-#define SPRN_DSCR_USR  0x03/* Problem state SPR */
 #define THREADS    100 /* Max threads */
 #define COUNT  100 /* Max iterations */
 #define DSCR_MAX   16  /* Max DSCR value */
@@ -48,14 +46,14 @@ inline unsigned long get_dscr(void)
 {
unsigned long ret;
 
-   asm volatile("mfspr %0,%1" : "=r" (ret): "i" (SPRN_DSCR));
+   asm volatile("mfspr %0,%1" : "=r" (ret) : "i" (SPRN_DSCR_PRIV));
 
return ret;
 }
 
 inline void set_dscr(unsigned long val)
 {
-   asm volatile("mtspr %1,%0" : : "r" (val), "i" (SPRN_DSCR));
+   asm volatile("mtspr %1,%0" : : "r" (val), "i" (SPRN_DSCR_PRIV));
 }
 
 /* Problem state DSCR access */
@@ -63,14 +61,14 @@ inline unsigned long get_dscr_usr(void)
 {
unsigned long ret;
 
-   asm volatile("mfspr %0,%1" : "=r" (ret): "i" (SPRN_DSCR_USR));
+   asm volatile("mfspr %0,%1" : "=r" (ret) : "i" (SPRN_DSCR));
 
return ret;
 }
 
 inline void set_dscr_usr(unsigned long val)
 {
-   asm volatile("mtspr %1,%0" : : "r" (val), "i" (SPRN_DSCR_USR));
+   asm volatile("mtspr %1,%0" : : "r" (val), "i" (SPRN_DSCR));
 }
 
 /* Default DSCR access */
diff --git a/tools/testing/selftests/powerpc/reg.h b/tools/testing/selftests/powerpc/reg.h
index fddf368..f5d33db 100644
--- a/tools/testing/selftests/powerpc/reg.h
+++ b/tools/testing/selftests/powerpc/reg.h
@@ -51,10 +51,39 @@
 #define SPRN_SDAR  781
 #define SPRN_SIER  768
 
-#define SPRN_TEXASR 0x82
+#define SPRN_TEXASR 0x82/* Transaction Exception and Status Register */
 #define SPRN_TFIAR  0x81/* Transaction Failure Inst Addr*/
 #define SPRN_TFHAR  0x80/* Transaction Failure Handler Addr */
-#define TEXASR_FS   0x0800
-#define SPRN_TAR    0x32f
+#define SPRN_TAR    0x32f  /* Target Address Register */
+
+#define SPRN_DSCR_PRIV 0x11/* Privilege State DSCR */
+#define SPRN_DSCR  0x03/* Data Stream Control Register */
+#define SPRN_PPR   896 /* Program Priority Register */
+
+/* TEXASR register bits */
+#define TEXASR_FC  0xFE00
+#define TEXASR_FP  0x0100
+#define TEXASR_DA  0x0080
+#define TEXASR_NO  0x0040
+#define TEXASR_FO  0x0020
+#define TEXASR_SIC 0x0010
+#define TEXASR_NTC 0x0008
+#define TEXASR_TC  0x0004
+#define TEXASR_TIC 0x0002
+#define TEXASR_IC  0x0001
+#define TEXASR_IFC 0x8000
+#define TEXASR_ABT 0x0001
+#define TEXASR_SPD 0x8000
+#define TEXASR_HV  0x2000
+#define TEXASR_PR  0x1000
+#define TEXASR_FS  0x0800
+#define TEXASR_TE  0x0400
+#define TEXASR_ROT 0x0200
+
+/* Vector Instructions */
+#define VSX_XX1(xs, ra, rb)    (((xs) & 0x1f) << 21 | ((ra) << 16) |  \
+                                ((rb) << 11) | (((xs) >> 5)))
+#define STXVD2X(xs, ra, rb)    .long (0x7c000798 | VSX_XX1((xs), (ra), (rb)))
+#define LXVD2X(xs, ra, rb)     .long (0x7c000698 | VSX_XX1((xs), (ra), (rb)))
 
 #endif /* _SELFTESTS_POWERPC_REG_H */
-- 
1.8.3.1
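
For readers who want to check the VSX_XX1() packing from the patch above on
any host, here is a minimal sketch that restates the macros as plain C
functions. The sketch_* function names are hypothetical and not part of the
series; only the bit layout and the two opcode constants are taken from the
patch.

```c
#include <stdint.h>

/*
 * Same packing as VSX_XX1() in the patch: the low five bits of xs go to
 * bits 21..25, ra to bits 16..20, rb to bits 11..15, and the sixth bit
 * of xs lands in bit 0.
 */
static uint32_t sketch_vsx_xx1(uint32_t xs, uint32_t ra, uint32_t rb)
{
	return ((xs & 0x1f) << 21) | (ra << 16) | (rb << 11) | (xs >> 5);
}

/* Instruction word that STXVD2X(xs, ra, rb) would emit via .long */
static uint32_t sketch_stxvd2x(uint32_t xs, uint32_t ra, uint32_t rb)
{
	return 0x7c000798u | sketch_vsx_xx1(xs, ra, rb);
}

/* Instruction word that LXVD2X(xs, ra, rb) would emit via .long */
static uint32_t sketch_lxvd2x(uint32_t xs, uint32_t ra, uint32_t rb)
{
	return 0x7c000698u | sketch_vsx_xx1(xs, ra, rb);
}
```

For example, sketch_stxvd2x(33, 0, 3) gives 0x7c201f99: register number 33
does not fit in five bits, so its top bit is carried in bit 0 of the word.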



[PATCH v15 00/15] selftests/powerpc: Add ptrace tests for ppc registers

2016-09-29 Thread wei . guo . simon
From: Simon Guo 

This selftest suite is for PPC register ptrace functionality. It
is also useful for Transactional Memory functionality verification.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Simon Guo 

Test Result (All tests pass on both BE and LE) 
-- 
ptrace-ebb  PASS 
ptrace-gpr  PASS 
ptrace-tm-gpr   PASS 
ptrace-tm-spd-gpr   PASS 
ptrace-tar  PASS 
ptrace-tm-tar   PASS 
ptrace-tm-spd-tar   PASS 
ptrace-vsx  PASS 
ptrace-tm-vsx   PASS 
ptrace-tm-spd-vsx   PASS 
ptrace-tm-spr   PASS 

Previous versions: 
== 
RFC: https://lkml.org/lkml/2014/4/1/292
V1:  https://lkml.org/lkml/2014/4/2/43
V2:  https://lkml.org/lkml/2014/5/5/88
V3:  https://lkml.org/lkml/2014/5/23/486
V4:  https://lkml.org/lkml/2014/11/11/6
V5:  https://lkml.org/lkml/2014/11/25/134
V6:  https://lkml.org/lkml/2014/12/2/98
V7:  https://lkml.org/lkml/2015/1/14/19
V8:  https://lkml.org/lkml/2015/5/19/700
V9:  https://lkml.org/lkml/2015/10/8/522
V10: https://lkml.org/lkml/2016/2/16/219
V11: https://lkml.org/lkml/2016/7/16/231
V12: https://lkml.org/lkml/2016/7/27/134
V13: https://lkml.org/lkml/2016/7/27/656
V14: https://lkml.org/lkml/2016/9/12/57

Changes in V15:
---
- Squash patch 1 and 2 to avoid compile error after patch 1.
- Reorganize some code across patch 3 and 4 to avoid compile error
- Created a new directory, utility, under tools/testing/selftests/powerpc
to organize common APIs across selftests.
- Use "tbegin." instead of TBEGIN macro. The same for other TM instructions.
- Correct while(ptr); loop without memory barrier.
- Remove an invalid checking on TEXASR in tm-spd-spr.c.
- Use FAIL_IF() where possible so that a failure reports the failing line.
- Consolidate some asm code on GPR/FPR load/save into reg.h/reg.S
- rebased to recent ppc git tree.

Changes in V14: 
--- 
- Remove duplicated NT_PPC_xxx register macro in 
tools/testing/selftests/powerpc/ptrace/ptrace.h
- Clean some coding style warning

Changes in V13: 
--- 
- Remove Cc lines from changelog
- Add more Signed-off-by lines of Simon Guo

Changes in V12: 
--- 
- Revert the change which tried to incorporate the following patch:
  [PATCH 3/5] powerpc: tm: Always use fp_state and vr_state to store live 
registers
- Release share memory resource in all self test cases
- Optimize tfhar usage in ptrace-tm-spr.c

Changes in V11: 
--- 
- Rework based on following patch:
  [PATCH 3/5] powerpc: tm: Always use fp_state and vr_state to store live 
registers
- Split EBB/PMU register ptrace implementation.
- Clean some coding style warning
- Added more shared memory based sync between parent and child during TM tests
- Reworded some of the commit messages and cleaned them up
- selftests/powerpc/ebb/reg.h has already moved as selftests/powerpc/reg.h
  Dropped the previous patch doing the same thing
- Combined the definitions of SPRN_DSCR from dscr/ test cases
- Fixed dscr/ test cases for new SPRN_DSCR_PRIV definition available

Changes in V10: 
--- 
- Rebased against the latest mainline 
- Fixed couple of build failures in the test cases related to aux vector 

Changes in V9: 
-- 
- Fixed static build check failure after tm_orig_msr got dropped 
- Fixed asm volatile construct for used registers set 
- Fixed EBB, VSX, VMX tests for LE 
- Fixed TAR test which was failing because of system calls 
- Added checks for PPC_FEATURE2_HTM aux feature in the tests 
- Fixed copyright statements 

Changes in V8: 
-- 
- Split the misc register set into individual ELF core notes 
- Implemented support for VSX register set (on and off TM) 
- Implemented support for EBB register set 
- Implemented review comments on previous versions 
- Some code re-arrangements, re-writes and documentation 
- Added comprehensive list of test cases into selftests 

Changes in V7: 
-- 
- Fixed a config directive in the MISC code 
- Merged the two gitignore patches into a single one 

Changes in V6: 
-- 
- Added two git ignore patches for powerpc selftests 
- Re-formatted all in-code function definitions in kernel-doc format 

Changes in V5: 
-- 
- Changed flush_tmregs_to_thread, so not to take into account self tracing 
- Dropped the 3rd patch in the series which had merged two functions 
- Fixed one build problem for the misc debug register patch 
- Accommodated almost all the review comments from Suka on the 6th patch 
- Minor changes to the self test program 
- Changed commit messages for some of the patches 

Changes in V4: 
-- 
- Added one test program into the powerpc selftest bucket in this regard 
- Split the 2nd patch in the previous series into four different patches 
- Accommodated most of the review comments on the previous patch series 
- Added a patch to merge functions __switch_to_tm and tm_reclaim_task 


Re: [lkp] [staging] d4f56b47a8: divide error: 0000 [#1] PREEMPT SMP KASAN

2016-09-29 Thread Viresh Kumar
On Fri, Sep 30, 2016 at 7:29 AM, kernel test robot
 wrote:
>
>
> FYI, we noticed the following commit:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> commit d4f56b47a8fac90b15adfae80a42a2735d6b3213 ("staging: greybus: Add 
> drivers/staging/greybus to the build")
>
> in testcase: trinity
> with following parameters:
>
> runtime: 300s
>
>
> Trinity is a linux system call fuzz tester.
>
>
> on test machine: qemu-system-x86_64 -enable-kvm -m 512M
>
> caused below changes:
>
>
> +------------------------------------------------+------------+------------+
> |                                                | 526dec0642 | d4f56b47a8 |
> +------------------------------------------------+------------+------------+
> | boot_successes                                 | 5          | 0          |
> | boot_failures                                  | 8          | 12         |
> | calltrace:SyS_open                             | 8          |            |
> | invoked_oom-killer:gfp_mask=0x                 | 1          |            |
> | Mem-Info                                       | 1          |            |
> | IP-Config:Auto-configuration_of_network_failed | 2          |            |
> | BUG:kernel_hang_in_test_stage                  | 6          |            |
> | divide_error:#[##]PREEMPT_SMP_KASAN            | 0          | 12         |
> | RIP:gb_timesync_init                           | 0          | 12         |
> | calltrace:gb_init                              | 0          | 12         |
> | Kernel_panic-not_syncing:Fatal_exception       | 0          | 12         |
> +------------------------------------------------+------------+------------+
>
>
>
> [   16.795543] FPGA image file name: xlinx_fpga_firmware.bit
> [   16.796615] GPIO INIT FAIL!!
> [   16.799462] Unable to find a compatible ARMv7 timer
> [   16.799948] divide error: 0000 [#1] PREEMPT SMP KASAN
> [   16.800459] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 4.8.0-rc6-02364-gd4f56b4 #29
> [   16.801197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> Debian-1.8.2-1 04/01/2014
> [   16.802055] task: 88001a124000 task.stack: 88001a14
> [   16.802645] RIP: 0010:[]  [] 
> gb_timesync_init+0x35/0x78
> [   16.803534] RSP: :88001a147e58  EFLAGS: 00010246
> [   16.804040] RAX: 00038d7ea4c68000 RBX:  RCX: 
> 8114ea41
> [   16.804716] RDX:  RSI:  RDI: 
> 88001a124c2c
> [   16.805393] RBP: 88001a147e60 R08: 0001 R09: 
> 
> [   16.806066] R10: 88001a147d70 R11: 83cddb35 R12: 
> 82f67cc6
> [   16.806744] R13:  R14: 82fbe8b0 R15: 
> 82fbe8f8
> [   16.807421] FS:  () GS:88001a40() 
> knlGS:
> [   16.808185] CS:  0010 DS:  ES:  CR0: 80050033
> [   16.808728] CR2:  CR3: 02c0a000 CR4: 
> 06b0
> [   16.809405] DR0:  DR1:  DR2: 
> 
> [   16.810078] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [   16.810752] Stack:
> [   16.811058]   88001a147e78 82f67d45 
> 
> [   16.811819]  88001a147ee8 82efe339 82b89800 
> 0012
> [   16.812576]  88001fa80fe5  82b0495f 
> 0006
> [   16.813332] Call Trace:
> [   16.813577]  [] gb_init+0x7f/0xb3
> [   16.814045]  [] do_one_initcall+0x9a/0x12c
> [   16.814588]  [] kernel_init_freeable+0x1b0/0x246
> [   16.815180]  [] kernel_init+0xc/0x108
> [   16.815679]  [] ret_from_fork+0x1f/0x40
> [   16.816197]  [] ? rest_init+0x13c/0x13c
> [   16.816724] Code: 85 c0 89 c3 74 12 48 c7 c7 64 ae b4 82 31 c0 e8 40 b5 27 
> fe 89 d8 eb 53 e8 cb 55 23 ff 31 d2 89 c6 48 b8 00 80 c6 a4 7e 8d 03 00 <48> 
> f7 f6 31 d2 48 c7 c7 84 ae b4 82 48 89 35 de 65 64 01 48 89
> [   16.819509] RIP  [] gb_timesync_init+0x35/0x78
> [   16.820094]  RSP 
> [   16.820548] ---[ end trace c73ba0f929e81492 ]---
> [   16.821001] Kernel panic - not syncing: Fatal exception

Can you please confirm whether the patch below fixes it for you?

https://marc.info/?l=linux-kernel&m=147490908100954
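
A note for readers following the trace: the Code: bytes around the faulting
instruction decode to "xor %edx,%edx; mov %eax,%esi; movabs
$0x38d7ea4c68000,%rax; div %rsi". RAX therefore holds 10^15 and the divisor
is the return value of the preceding call, so the fault is consistent with a
clock rate of zero after "Unable to find a compatible ARMv7 timer". Below is
a minimal userspace sketch of guarding such a division. The names
SKETCH_FSEC_PER_SEC and sketch_fs_per_clock() are hypothetical, and this is
not claimed to be the fix in the patch linked above.

```c
#include <stdint.h>

/* 10^15, the dividend seen in RAX; the constant name is illustrative. */
#define SKETCH_FSEC_PER_SEC 1000000000000000ULL

/*
 * Compute femtoseconds per clock tick.
 * Returns 0 and stores the result in *out on success,
 * or -1 when clock_rate is zero (no usable timer was found).
 */
static int sketch_fs_per_clock(uint64_t clock_rate, uint64_t *out)
{
	if (clock_rate == 0)
		return -1;

	*out = SKETCH_FSEC_PER_SEC / clock_rate;
	return 0;
}
```

A caller would treat the -1 return as "no timer available" and bail out of
initialisation instead of dividing.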


Re: [PATCH] pinctrl: freescale: avoid overwriting pin config when freeing GPIO

2016-09-29 Thread Viresh Kumar
On 29-09-16, 15:16, Vladimir Zapolskiy wrote:
> If you look at the top I agree that this solution may be only one platform
> specific, but it fixes the broken driver of i.MX I2C bus controller.

Yeah, I saw that..

> Why do you get an impression that it looks like a hack?

Because we have to reorder things to make it work on a platform. This may break
things on other platforms and we don't know about it yet.

> Why pinctrl_select_state() is not done in gpio_request_one()? Because
> the first function gets pin mux/config setting and the second does not.
> How do you intend to get pin mux/config setting in gpio_request_one()?

Let's see what Linus has to say on this..

> Anyway I don't see any problems in pinctrl or gpio subsystems, the bugs
> must be addressed and fixed in i2c.

I think it can be a GPIO-driver-specific thing as well, and not really a
subsystem-level one.

-- 
viresh


Re: [PATCH v14 13/15] selftests/powerpc: Add ptrace tests for TM SPR registers

2016-09-29 Thread Simon Guo
Hi Cyril,
On Wed, Sep 14, 2016 at 03:04:12PM +1000, Cyril Bur wrote:
> On Mon, 2016-09-12 at 15:33 +0800, wei.guo.si...@gmail.com wrote:
> > From: Anshuman Khandual 
> > 
> > This patch adds ptrace interface test for TM SPR registers. This
> > also adds ptrace interface based helper functions related to TM
> > SPR registers access.
> > 
> 
> I'm seeing this one fail a lot, it does occasionally succeed but fails
> a lot on my test setup.
> 
> I use qemu on a power8 for most of my testing:
> qemu-system-ppc64 --enable-kvm -machine pseries,accel=kvm,usb=off -m
> 4096 -realtime mlock=off -smp 4,sockets=1,cores=2,threads=2 -nographic
> -vga none
> 
> 
> > Signed-off-by: Anshuman Khandual 
> > Signed-off-by: Simon Guo 
> > ---
> >  tools/testing/selftests/powerpc/ptrace/Makefile|   3 +-
> >  .../selftests/powerpc/ptrace/ptrace-tm-spr.c   | 186
> > +
> >  tools/testing/selftests/powerpc/ptrace/ptrace.h|  35 
> >  3 files changed, 223 insertions(+), 1 deletion(-)
> >  create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-tm-
> > spr.c
> > 
> > diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile
> > b/tools/testing/selftests/powerpc/ptrace/Makefile
> > index 797840a..f34670e 100644
> > --- a/tools/testing/selftests/powerpc/ptrace/Makefile
> > +++ b/tools/testing/selftests/powerpc/ptrace/Makefile
> > @@ -1,7 +1,8 @@
> >  TEST_PROGS := ptrace-ebb ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr
> > \
> >  ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx
> > \
> > -ptrace-tm-spd-vsx
> > +ptrace-tm-spd-vsx ptrace-tm-spr
> >  
> > +include ../../lib.mk
> >  
> >  all: $(TEST_PROGS)
> >  CFLAGS += -m64
> > diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
> > b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
> > new file mode 100644
> > index 000..2863070
> > --- /dev/null
> > +++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
> > @@ -0,0 +1,186 @@
> > +/*
> > + * Ptrace test TM SPR registers
> > + *
> > + * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> > +#include "ptrace.h"
> > +
> > +/* Tracee and tracer shared data */
> > +struct shared {
> > +   int flag;
> > +   struct tm_spr_regs regs;
> > +};
> > +unsigned long tfhar;
> > +
> > +int shm_id;
> > +volatile struct shared *cptr, *pptr;
> > +
> > +int shm_id1;
> > +volatile int *cptr1, *pptr1;
> > +
> > +#define TM_SCHED   0xde018c01
> > +#define TM_KVM_SCHED   0xe001ac01
> > +
> > +int validate_tm_spr(struct tm_spr_regs *regs)
> > +{
> > +   if (regs->tm_tfhar != tfhar)
> > +   return TEST_FAIL;
> > +
> > +   if ((regs->tm_texasr != TM_SCHED) && (regs->tm_texasr !=
> > TM_KVM_SCHED))
> > +   return TEST_FAIL;
> 
> The above condition fails, should this test try again if this condition
> is true, rather than fail?
> 

I reproduced the failure with your configuration. Besides treclaim, there are
many other reasons that may lead to a transaction failure according to the ISA.
I observed at least the following TEXASR values:
  11801
  18801
  1a801


I noticed that some FIAR values point into IPI-handling code. My previous
configuration was 4 sockets, each with only 1 core and 1 thread. That is
probably why the test always passed in the old configuration (per my
understanding, IPIs are limited to within threads). So I removed the check
on the specified TEXASR value.

And I think I have reworked all your other comments.

I will send out v15 soon. Again thanks for your code inspection.

BR,
- Simon


Re: [PATCH] pinctrl: freescale: avoid overwriting pin config when freeing GPIO

2016-09-29 Thread Viresh Kumar
On 29-09-16, 09:33, Stefan Agner wrote:
> You need to differentiate between Vybrid and i.MX:
> 
> Vybrid muxes a pin to GPIO on gpio_request_one (via .gpio_request_enable
> callback)
> i.MX does not mux a pin as GPIO on its own, but needs to be muxed
> explicitly. That has been always the case...
> 
> I don't know what behavior is right, it is just "different"...

Hmm, I think what Vybrid and Tegra have done is better, but it would be better
to get input from Linus, which you already asked for :)

-- 
viresh


Re: [PATCH 3/3] arm64: dump: Add checking for writable and exectuable pages

2016-09-29 Thread Mark Rutland
Hi,

On Thu, Sep 29, 2016 at 02:32:57PM -0700, Laura Abbott wrote:
> Page mappings with full RWX permissions are a security risk. x86
> has an option to walk the page tables and dump any bad pages.
> (See e1a58320a38d ("x86/mm: Warn on W^X mappings")). Add a similar
> implementation for arm64.
> 
> Signed-off-by: Laura Abbott 

> @@ -31,6 +32,8 @@ struct ptdump_info {
>   const struct addr_marker*markers;
>   unsigned long   base_addr;
>   unsigned long   max_addr;

(unrelated aside: it looks like max_addr is never used or even assigned to;
care to delete it in a prep patch?)

> + /* Internal, do not touch */
> + struct list_headnode;
>  };

> +static LIST_HEAD(dump_info);

With the EFI runtime map tables it's unfortunately valid (and very likely with
64K pages) that there will be RWX mappings, at least with contemporary versions
of the UEFI spec. Luckily, those are only installed rarely and transiently.

Given that (and other potential ptdump users), I don't think we should have a
dynamic list of ptdump_infos for W^X checks, and should instead have
ptdump_check_wx() explicitly check the tables we care about. More comments
below on that.

I think we only care about the swapper and hyp tables, as nothing else is
permanent. Does that sound sane?

>  struct prot_bits {
> @@ -219,6 +223,15 @@ static void note_page(struct pg_state *st, unsigned long 
> addr, unsigned level,
>   unsigned long delta;
>  
>   if (st->current_prot) {
> + if (st->check_wx &&
> + ((st->current_prot & PTE_RDONLY) != PTE_RDONLY) &&
> + ((st->current_prot & PTE_PXN) != PTE_PXN)) {
> + WARN_ONCE(1, "arm64/mm: Found insecure W+X 
> mapping at address %p/%pS\n",
> +  (void *)st->start_address,
> +  (void *)st->start_address);
> + st->wx_pages += (addr - st->start_address) / 
> PAGE_SIZE;
> + }
> +

Currently note_page() is painful to read due to the indentation and logic.
Rather than adding to that, could we factor this into a helper? e.g.

static void note_prot_wx(struct pg_state *st, unsigned long addr)
{
	if (!st->check_wx)
		return;
	if ((st->current_prot & PTE_RDONLY) == PTE_RDONLY)
		return;
	if ((st->current_prot & PTE_PXN) == PTE_PXN)
		return;

	WARN_ONCE(1, "arm64/mm: Found insecure W+X mapping at address %p/%pS\n",
		  (void *)st->start_address, (void *)st->start_address);

	st->wx_pages += (addr - st->start_address) / PAGE_SIZE;
}

> +void ptdump_check_wx(void)
> +{
> + struct ptdump_info *info;
> +
> + list_for_each_entry(info, &dump_info, node) {
> + struct pg_state st = {
> + .seq = NULL,
> + .marker = info->markers,
> + .check_wx = true,
> + };
> +
> + __walk_pgd(&st, info->mm, info->base_addr);
> + note_page(&st, 0, 0, 0);
> + if (st.wx_pages)
> + pr_info("Checked W+X mappings (%p): FAILED, %lu W+X 
> pages found\n",
> + info->mm,
> + st.wx_pages);
> + else
> + pr_info("Checked W+X mappings (%p): passed, no W+X 
> pages found\n", info->mm);
> + }
> +}

As above, I don't think we should use a list of arbitrary ptdump_infos.

Given we won't log addresses in the walking code, I think that we can make up a
trivial marker array, and then just use init_mm direct, e.g. (never even
compile-tested):

void ptdump_check_wx(void)
{
	struct pg_state st = {
		.seq = NULL,
		.marker = (struct addr_marker[]) {
			{ -1, NULL },
		},
		.check_wx = true,
	};

	__walk_pgd(&st, &init_mm, 0);
	note_page(&st, 0, 0, 0);
	if (st.wx_pages)
		pr_info("Checked W+X mappings (%p): FAILED, %lu W+X pages found\n",
			&init_mm, st.wx_pages);
	else
		pr_info("Checked W+X mappings (%p): passed, no W+X pages found\n",
			&init_mm);
}

Otherwise, this looks good to me. Thanks for putting this together!

Mark.
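
The W^X test discussed above can be exercised outside the kernel. The
following is a minimal userspace sketch of the same predicate and page
accounting over a list of address ranges. Everything prefixed sk_ or SK_ is
hypothetical: the bit positions and the 4K page size are assumptions made
for the sketch, not values taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SK_PTE_RDONLY	(UINT64_C(1) << 7)
#define SK_PTE_PXN	(UINT64_C(1) << 53)
#define SK_PAGE_SIZE	UINT64_C(4096)

/* One run of pages sharing the same protection bits. */
struct sk_range {
	uint64_t start;
	uint64_t end;
	uint64_t prot;
};

/* Writable (not read-only) and executable (no PXN) at the same time. */
static bool sk_is_wx(uint64_t prot)
{
	return !(prot & SK_PTE_RDONLY) && !(prot & SK_PTE_PXN);
}

/* Count the pages in all ranges that are both writable and executable. */
static unsigned long sk_count_wx_pages(const struct sk_range *r, size_t n)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (sk_is_wx(r[i].prot))
			pages += (r[i].end - r[i].start) / SK_PAGE_SIZE;

	return pages;
}
```

This mirrors the early-return structure of the suggested helper: a range is
skipped as soon as either the read-only or the PXN bit is set.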


Re: [RFC][PATCH 7/7] printk: new printk() recursion detection

2016-09-29 Thread Sergey Senozhatsky
On (09/29/16 15:19), Petr Mladek wrote:
> I am sorry but I do not understand this much. printk() should set the
> alternative implementation in the critical section by default.
> Why do we need to handle this so specially?
> 
> Is it because of flushing in NMI context when panicking? I would call
> vprintk_emit() directly from the flush_line() function in this case.
> Then all other possible error printk's will get redirected to the
> NMI buffer which is good enough.

I'm going to re-do the entire thing. I had some cases in mind, like
WARN from vsnprintf from printk from alt_printk_flushing from panic.
or something like this. perhaps too complicated, will re-think it.

-ss


[lkp] [staging] d4f56b47a8: divide error: 0000 [#1] PREEMPT SMP KASAN

2016-09-29 Thread kernel test robot

FYI, we noticed the following commit:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit d4f56b47a8fac90b15adfae80a42a2735d6b3213 ("staging: greybus: Add 
drivers/staging/greybus to the build")

in testcase: trinity
with following parameters:

runtime: 300s


Trinity is a linux system call fuzz tester.


on test machine: qemu-system-x86_64 -enable-kvm -m 512M

caused below changes:


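+------------------------------------------------+------------+------------+
|                                                | 526dec0642 | d4f56b47a8 |
+------------------------------------------------+------------+------------+
| boot_successes                                 | 5          | 0          |
| boot_failures                                  | 8          | 12         |
| calltrace:SyS_open                             | 8          |            |
| invoked_oom-killer:gfp_mask=0x                 | 1          |            |
| Mem-Info                                       | 1          |            |
| IP-Config:Auto-configuration_of_network_failed | 2          |            |
| BUG:kernel_hang_in_test_stage                  | 6          |            |
| divide_error:#[##]PREEMPT_SMP_KASAN            | 0          | 12         |
| RIP:gb_timesync_init                           | 0          | 12         |
| calltrace:gb_init                              | 0          | 12         |
| Kernel_panic-not_syncing:Fatal_exception       | 0          | 12         |
+------------------------------------------------+------------+------------+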



[   16.795543] FPGA image file name: xlinx_fpga_firmware.bit
[   16.796615] GPIO INIT FAIL!!
[   16.799462] Unable to find a compatible ARMv7 timer
[   16.799948] divide error: 0000 [#1] PREEMPT SMP KASAN
[   16.800459] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.8.0-rc6-02364-gd4f56b4 #29
[   16.801197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Debian-1.8.2-1 04/01/2014
[   16.802055] task: 88001a124000 task.stack: 88001a14
[   16.802645] RIP: 0010:[]  [] 
gb_timesync_init+0x35/0x78
[   16.803534] RSP: :88001a147e58  EFLAGS: 00010246
[   16.804040] RAX: 00038d7ea4c68000 RBX:  RCX: 8114ea41
[   16.804716] RDX:  RSI:  RDI: 88001a124c2c
[   16.805393] RBP: 88001a147e60 R08: 0001 R09: 
[   16.806066] R10: 88001a147d70 R11: 83cddb35 R12: 82f67cc6
[   16.806744] R13:  R14: 82fbe8b0 R15: 82fbe8f8
[   16.807421] FS:  () GS:88001a40() 
knlGS:
[   16.808185] CS:  0010 DS:  ES:  CR0: 80050033
[   16.808728] CR2:  CR3: 02c0a000 CR4: 06b0
[   16.809405] DR0:  DR1:  DR2: 
[   16.810078] DR3:  DR6: fffe0ff0 DR7: 0400
[   16.810752] Stack:
[   16.811058]   88001a147e78 82f67d45 

[   16.811819]  88001a147ee8 82efe339 82b89800 
0012
[   16.812576]  88001fa80fe5  82b0495f 
0006
[   16.813332] Call Trace:
[   16.813577]  [] gb_init+0x7f/0xb3
[   16.814045]  [] do_one_initcall+0x9a/0x12c
[   16.814588]  [] kernel_init_freeable+0x1b0/0x246
[   16.815180]  [] kernel_init+0xc/0x108
[   16.815679]  [] ret_from_fork+0x1f/0x40
[   16.816197]  [] ? rest_init+0x13c/0x13c
[   16.816724] Code: 85 c0 89 c3 74 12 48 c7 c7 64 ae b4 82 31 c0 e8 40 b5 27 
fe 89 d8 eb 53 e8 cb 55 23 ff 31 d2 89 c6 48 b8 00 80 c6 a4 7e 8d 03 00 <48> f7 
f6 31 d2 48 c7 c7 84 ae b4 82 48 89 35 de 65 64 01 48 89 
[   16.819509] RIP  [] gb_timesync_init+0x35/0x78
[   16.820094]  RSP 
[   16.820548] ---[ end trace c73ba0f929e81492 ]---
[   16.821001] Kernel panic - not syncing: Fatal exception


To reproduce:

git clone 
git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml  # job file is attached in this email
bin/lkp run job.yaml



Thanks,
Xiaolong
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.8.0-rc6 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y

Re: BUG: scheduling while atomic in f_fs when gadget remove driver

2016-09-29 Thread Chen Yu
Hi Michal,

Thanks for the patch.

On 2016/9/29 5:38, Michal Nazarewicz wrote:
> On Wed, Sep 28 2016, Michal Nazarewicz wrote:
>> With that done, the only thing which needs a mutex is
>> epfile->read_buffer.
> 
> Perhaps this would do:
> 

I tested the patch on a HiKey board with the adb function on Android, and it
does fix the problem.

thanks
Chen Yu

>  >8 -- -
>>From 6416a1065203a39328311f6c58083089efe169aa Mon Sep 17 00:00:00 2001
> From: Michal Nazarewicz 
> Date: Wed, 28 Sep 2016 23:36:56 +0200
> Subject: [RFC] usb: gadget: f_fs: stop sleeping in ffs_func_eps_disable
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> ffs_func_eps_disable is called from atomic context so it cannot sleep
> thus cannot grab a mutex.  Change the handling of epfile->read_buffer
> to use non-sleeping synchronisation method.
> 
> Reported-by: Chen Yu 
> Signed-off-by: Michał Nazarewicz 
> Fixes: 9353afbbfa7b ("buffer data from ‘oversized’ OUT requests")
> ---
>  drivers/usb/gadget/function/f_fs.c | 89 
> +++---
>  1 file changed, 73 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/usb/gadget/function/f_fs.c 
> b/drivers/usb/gadget/function/f_fs.c
> index 759f5d4..8db53da 100644
> --- a/drivers/usb/gadget/function/f_fs.c
> +++ b/drivers/usb/gadget/function/f_fs.c
> @@ -136,8 +136,50 @@ struct ffs_epfile {
>   /*
>* Buffer for holding data from partial reads which may happen since
>* we’re rounding user read requests to a multiple of a max packet size.
> +  *
> +  * The pointer starts with NULL value and may be initialised to other
> +  * value by __ffs_epfile_read_data function which may need to allocate
> +  * the temporary buffer.
> +  *
> +  * In normal operation, subsequent calls to __ffs_epfile_read_buffered
> +  * will consume data from the buffer and eventually free it.
> +  * Importantly, while the function is using the buffer, it sets the
> +  * pointer to NULL.  This is all right since __ffs_epfile_read_data and
> +  * __ffs_epfile_read_buffered can never run concurrently (as they are
> +  * protected by epfile->mutex) so the latter will not assign a new value
> +  * to the buffer.
> +  *
> +  * Meanwhile __ffs_func_eps_disable frees the buffer (if the pointer is
> +  * valid) and sets the pointer to READ_BUFFER_DROP value.  This special
> +  * value is crux of the synchronisation between __ffs_func_eps_disable
> +  * and __ffs_epfile_read_data.
> +  *
> +  * Once __ffs_epfile_read_data is about to finish it will try to set the
> +  * pointer back to its old value (as described above), but seeing as the
> +  * pointer is not-NULL (namely READ_BUFFER_DROP) it will instead free
> +  * the buffer.
> +  *
> +  * This is how concurrent calls to the two functions would look (‘<->’
> +  * denotes xchg operation):
> +  *
> +  *   read_buffer = some buffer
> +  *
> +  *  THREAD A THREAD B
> +  *   __ffs_epfile_read_data:
> +  *  buf = NULL
> +  *  buf <-> read_buffer
> +  *  … do stuff on buf …
> +  *__ffs_func_eps_disable:
> +  *buf = READ_BUFFER_DROP
> +  *buf <-> read_buffer
> +  *kfree(buf);
> +  *
> +  *  old = cmpxchg(read_buffer, NULL, buf)
> +  *  if (old)
> +  *  kfree(buf)
>*/
> - struct ffs_buffer   *read_buffer;   /* P: epfile->mutex */
> + struct ffs_buffer   *read_buffer;
> +#define READ_BUFFER_DROP ((struct ffs_buffer *)ERR_PTR(-ESHUTDOWN))
>  
>   charname[5];
>  
> @@ -740,21 +782,31 @@ static void ffs_epfile_async_io_complete(struct usb_ep 
> *_ep,
>  static ssize_t __ffs_epfile_read_buffered(struct ffs_epfile *epfile,
> struct iov_iter *iter)
>  {
> - struct ffs_buffer *buf = epfile->read_buffer;
> + /*
> +  * Null out epfile->read_buffer so ffs_func_eps_disable does not free
> +  * the buffer while we are using it.
> +  */
> + struct ffs_buffer *buf = xchg(&epfile->read_buffer, NULL);
>   ssize_t ret;
> - if (!buf)
> + if (!buf || buf == READ_BUFFER_DROP)
>   return 0;
>  
>   ret = copy_to_iter(buf->data, buf->length, iter);
>   if (buf->length == ret) {
>   kfree(buf);
> - epfile->read_buffer = NULL;
> - } else if (unlikely(iov_iter_count(iter))) {
> + return ret;
> + }
> +
> + if (unlikely(iov_iter_count(iter))) {
>   ret = -EFAULT;
>   } else {
>   buf->length -= ret;
>   buf->data += ret;
>   }
> +
> +  
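
The buffer handoff described in the comment above can be sketched in user space. This is only an illustration under stated assumptions, not the driver code: C11 atomic_exchange() and atomic_compare_exchange_strong() stand in for the kernel's xchg() and cmpxchg(), and struct buffer, BUFFER_DROP, reader_take(), reader_put_back(), disable() and the frees counter are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ffs_buffer. */
struct buffer {
	size_t length;
	char data[16];
};

/* Sentinel meaning "endpoint disabled"; it is compared, never dereferenced. */
#define BUFFER_DROP ((struct buffer *)(uintptr_t)-1)

static _Atomic(struct buffer *) read_buffer;
static int frees;	/* counts frees so leaks and double frees are visible */

static void buffer_free(struct buffer *buf)
{
	free(buf);
	frees++;
}

/* Reader, step 1: take the buffer out of the shared slot. */
static struct buffer *reader_take(void)
{
	struct buffer *buf = atomic_exchange(&read_buffer, (struct buffer *)NULL);

	return buf == BUFFER_DROP ? NULL : buf;
}

/* Reader, step 2: put the buffer back only if the slot is still NULL. */
static void reader_put_back(struct buffer *buf)
{
	struct buffer *expected = NULL;

	/* A non-NULL slot means disable() ran meanwhile, so we free. */
	if (!atomic_compare_exchange_strong(&read_buffer, &expected, buf))
		buffer_free(buf);
}

/* Disable path: mark the slot as dropped and free whatever was parked. */
static void disable(void)
{
	struct buffer *buf = atomic_exchange(&read_buffer, BUFFER_DROP);

	if (buf && buf != BUFFER_DROP)
		buffer_free(buf);
}
```

Running reader_take(), then disable(), then reader_put_back() reproduces the interleaving from the diagram: disable() finds the slot empty and frees nothing, and the reader frees the buffer exactly once when its compare-and-exchange fails.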

Re: [PATCH v2 0/3] g_NCR5380: Modernization

2016-09-29 Thread Martin K. Petersen
> "Ondrej" == Ondrej Zary  writes:

Ondrej> This small patch series removes deprecated code from g_NCR5380
Ondrej> driver and converts it from scsi_module.c to scsi_add_host().

Applied to 4.9/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH 2/2] mmc: sdhci-of-arasan: mark sdhci_arasan_reset() static

2016-09-29 Thread Baoyou Xie
We get 1 warning when building kernel with W=1:
drivers/mmc/host/sdhci-of-arasan.c:253:6: warning: no previous prototype for 
'sdhci_arasan_reset' [-Wmissing-prototypes]

In fact, this function is only used in the file in which it is
defined, so it doesn't need a declaration and can be made static.
So this patch marks it 'static'.

Signed-off-by: Baoyou Xie 
---
 drivers/mmc/host/sdhci-of-arasan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-of-arasan.c 
b/drivers/mmc/host/sdhci-of-arasan.c
index da8e40a..e263671 100644
--- a/drivers/mmc/host/sdhci-of-arasan.c
+++ b/drivers/mmc/host/sdhci-of-arasan.c
@@ -250,7 +250,7 @@ static void sdhci_arasan_hs400_enhanced_strobe(struct 
mmc_host *mmc,
writel(vendor, host->ioaddr + SDHCI_ARASAN_VENDOR_REGISTER);
 }
 
-void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
+static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
 {
u8 ctrl;
struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
-- 
2.7.4



[PATCH 1/2] mmc: block: add missing header dependencies

2016-09-29 Thread Baoyou Xie
We get 1 warning when building kernel with W=1:
drivers/mmc/card/block.c:2147:5: warning: no previous prototype for 
'mmc_blk_issue_rq' [-Wmissing-prototypes]

In fact, this function is declared in drivers/mmc/card/block.h,
so this patch adds missing header dependencies.

Signed-off-by: Baoyou Xie 
---
 drivers/mmc/card/block.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index c333511..0f2cc9f2 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -46,6 +46,7 @@
 #include 
 
 #include "queue.h"
+#include "block.h"
 
 MODULE_ALIAS("mmc:block");
 #ifdef MODULE_PARAM_PREFIX
-- 
2.7.4



Re: [RFC PATCH v1 0/2] printk: Shared kernel logging

2016-09-29 Thread Kees Cook
On Thu, Sep 29, 2016 at 5:55 PM, Sean Hudson  wrote:
> This patch set is based on Linus' v4.8-rc8 tag.
>
> This debug feature allows the kernel to use an external buffer and control
> block for kernel log messages. The feature is controlled by an optional
> command line parameter. The existing buffer and control block can contain
> existing log messages from previous boot cycles and/or the bootloader. The
> command line parameter was chosen for flexibility, cross arch portability,
> and the ability to dynamically enable/disable this feature. The parameter
> specifies the address of a control block used to replace the default log
> buffer. Existing bootloader and kernel log messages are kept, in order,
> inside the new buffer. After a boot that preserves the buffer contents, a
> bootloader can display both kernel and bootloader log entries from multiple,
> previous boots. It also allows the kernel to display bootloader log entries
> along with its own messages.
>
> This feature is intended for debug purposes and has no effect unless the
> command line parameter is specified. Further, it validates the passed
> control block carefully and if any checks fail, it falls back to the default
> behaviour. As such, it can be left enabled by default.
>
> Memory Reservation
>
> This feature expects the bootloader to reserve/preserve the shared buffer
> memory. This reservation needs to prevent the kernel from overwriting the
> external log control block and log entries. In my testing, I've used the
> 'fdt' commands in uboot to dynamically inject reserved memory regions via
> the DT to the kernel.

Interesting! I wonder if this can be adjusted to incorporate the
existing console logging feature in the pstore which does a similar
thing? Though pstore doesn't know about bootloader logs, really, it's
just storing kernel logs in a ring buffer. Maybe this can provide a
backend to pstore or something, especially since pstore initialization
happens "too late" for this to really be very sensible. It just seems
like it'd be nice to have a single persistent console memory region...

-Kees

>
> Sean Hudson (2):
>   printk: collect offsets into replaceable structure
>   printk: external log buffer (CONFIG_LOGBUFFER)
>
>  init/Kconfig   |  12 +
>  init/main.c|   2 +
>  kernel/printk/printk.c | 598 
> +++--
>  3 files changed, 445 insertions(+), 167 deletions(-)
>
> --
> 1.9.1
>
>



-- 
Kees Cook
Nexus Security


Re: [PATCH v2] UFS: Date Segment only need for WRITE DESCRIPTOR

2016-09-29 Thread Martin K. Petersen
> "Kiwoong" == Kiwoong Kim  writes:

Kiwoong> I think that the patch is correct.  UFS spec says "The Data
Kiwoong> Segment area is empty" for Read Descriptor.  I have been using
Kiwoong> similar code and it works.  It has already been applied in
Kiwoong> the Android kernel.

That's fine. Just checking.

Applied to 4.9/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH] drivers: video: console: bitblit.c

2016-09-29 Thread Nahom
Fixed an indentation coding style issue.

Signed-off-by: Nahom 
---
 drivers/video/console/bitblit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/console/bitblit.c b/drivers/video/console/bitblit.c
index dbfe4ee..4e7d0e3 100644
--- a/drivers/video/console/bitblit.c
+++ b/drivers/video/console/bitblit.c
@@ -256,7 +256,7 @@ static void bit_cursor(struct vc_data *vc, struct fb_info 
*info, int mode,
y += softback_lines;
}
 
-   c = scr_readw((u16 *) vc->vc_pos);
+   c = scr_readw((u16 *) vc->vc_pos);
attribute = get_attribute(info, c);
src = vc->vc_font.data + ((c & charmask) * (w * vc->vc_font.height));
 
-- 
2.7.4



Re: [kernel-hardening] [PATCH 0/3] WX Checking for arm64

2016-09-29 Thread Kees Cook
On Thu, Sep 29, 2016 at 2:32 PM, Laura Abbott  wrote:
>
> Hi,
>
> This is an implementation to check for writable and executable pages on arm64.
> This is heavily based on the x86 version which uses the existing page table
> dumping code to do the checking. Some notes:
>
> - The W^X checking is important so this option should become default
>   eventually.
>   To make this feasible, the debugfs functionality has been split out as a
>   separate option. I didn't see a good way to make it modular like x86 but
>   an option should be good enough.
> - This checks all page tables registered with ptdump_register. I don't see
>   this being called elsewhere right now though.
> - Once this is merged, I'd like to see about moving DEBUG_WX to the top level
>   instead of having each arch call it in mark_rodata.

Awesome!

Yeah, I think we should take a look at refactoring x86, arm, and arm64
to use a common infrastructure with callbacks. That way other
architectures can gain all these features with just a few callbacks
implemented.

-Kees

>
> Laura Abbott (3):
>   arm64: dump: Make ptdump debugfs a separate option
>   arm64: dump: Make the page table dumping seq_file optional
>   arm64: dump: Add checking for writable and exectuable pages
>
>  arch/arm64/Kconfig.debug| 34 ++-
>  arch/arm64/include/asm/ptdump.h | 25 ++-
>  arch/arm64/mm/Makefile  |  3 +-
>  arch/arm64/mm/dump.c| 92 
> -
>  arch/arm64/mm/mmu.c |  2 +
>  arch/arm64/mm/ptdump_debugfs.c  | 33 +++
>  6 files changed, 157 insertions(+), 32 deletions(-)
>  create mode 100644 arch/arm64/mm/ptdump_debugfs.c
>
> --
> 2.10.0
>



-- 
Kees Cook
Nexus Security


Re: [PATCH 1/3] arm64: dump: Make ptdump debugfs a separate option

2016-09-29 Thread Mark Rutland
On Thu, Sep 29, 2016 at 06:11:44PM -0700, Laura Abbott wrote:
> On 09/29/2016 05:48 PM, Mark Rutland wrote:
> >On Thu, Sep 29, 2016 at 05:31:09PM -0700, Laura Abbott wrote:
> >>On 09/29/2016 05:13 PM, Mark Rutland wrote:
> >>>On Thu, Sep 29, 2016 at 02:32:55PM -0700, Laura Abbott wrote:
> +int ptdump_register(struct ptdump_info *info, const char *name)
> +{
> + ptdump_initialize(info);
> + return ptdump_debugfs_create(info, name);
> }

> >I meant moving ptdump_register into ptdump_debugfs.c, perhaps renamed to
> >make it clear it's debugfs-specific.
> >
> >We could instead update existing users to call ptdump_debugfs_create()
> >directly, and have that call ptdump_initialize(), which could itself become a
> >static inline in a header.
> 
> Ah okay, I see what you are suggesting. ptdump_initialize should still
> happen regardless of debugfs status though so I guess ptdump_debugfs_create
> would just get turned into just ptdump_initialize
> which seems a little unclear. I'll come up with some other shed
> colors^W^Wfunction names.

Cheers!

FWIW, given ptdump_initialize() is only going to be called within the ptdump core
and debugfs code, I'm not all that concerned by what it's called. A few leading
underscores is about the only thing that comes to mind, but even as-is I think
it should be fine.

Thanks,
Mark.


[PATCH v2] MIPS: loongson32: Remove several RTC-related macros

2016-09-29 Thread Yang Ling
Add regs-rtc.h to replace the redundant macros.

Signed-off-by: Yang Ling 

---
V2:
  Add the header file regs-rtc.h in loongson1.h.
---
 arch/mips/include/asm/mach-loongson32/loongson1.h |  1 +
 arch/mips/include/asm/mach-loongson32/regs-rtc.h  | 23 +++
 arch/mips/loongson32/common/platform.c| 22 +-
 3 files changed, 33 insertions(+), 13 deletions(-)
 create mode 100644 arch/mips/include/asm/mach-loongson32/regs-rtc.h

diff --git a/arch/mips/include/asm/mach-loongson32/loongson1.h 
b/arch/mips/include/asm/mach-loongson32/loongson1.h
index 3584c40..a4cacda 100644
--- a/arch/mips/include/asm/mach-loongson32/loongson1.h
+++ b/arch/mips/include/asm/mach-loongson32/loongson1.h
@@ -52,6 +52,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #endif /* __ASM_MACH_LOONGSON32_LOONGSON1_H */
diff --git a/arch/mips/include/asm/mach-loongson32/regs-rtc.h 
b/arch/mips/include/asm/mach-loongson32/regs-rtc.h
new file mode 100644
index 000..1fe724b
--- /dev/null
+++ b/arch/mips/include/asm/mach-loongson32/regs-rtc.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (c) 2016 Yang Ling 
+ *
+ * Loongson 1 RTC timer Register Definitions.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under  the terms of the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#ifndef __ASM_MACH_LOONGSON32_REGS_RTC_H
+#define __ASM_MACH_LOONGSON32_REGS_RTC_H
+
+#define LS1X_RTC_REG(x) \
+   ((void __iomem *)KSEG1ADDR(LS1X_RTC_BASE + (x)))
+
+#define LS1X_RTC_CTRL  LS1X_RTC_REG(0x40)
+
+#define RTC_EXTCLK_OK  (BIT(5) | BIT(8))
+#define RTC_EXTCLK_EN  BIT(8)
+
+#endif /* __ASM_MACH_LOONGSON32_REGS_RTC_H */
diff --git a/arch/mips/loongson32/common/platform.c 
b/arch/mips/loongson32/common/platform.c
index beff085..4e28e0f 100644
--- a/arch/mips/loongson32/common/platform.c
+++ b/arch/mips/loongson32/common/platform.c
@@ -23,10 +23,6 @@
 #include 
 #include 
 
-#define LS1X_RTC_CTRL  ((void __iomem *)KSEG1ADDR(LS1X_RTC_BASE + 0x40))
-#define RTC_EXTCLK_OK  (BIT(5) | BIT(8))
-#define RTC_EXTCLK_EN  BIT(8)
-
 /* 8250/16550 compatible UART */
 #define LS1X_UART(_id) \
{   \
@@ -70,15 +66,6 @@ void __init ls1x_serial_set_uartclk(struct platform_device 
*pdev)
p->uartclk = clk_get_rate(clk);
 }
 
-void __init ls1x_rtc_set_extclk(struct platform_device *pdev)
-{
-   u32 val;
-
-   val = __raw_readl(LS1X_RTC_CTRL);
-   if (!(val & RTC_EXTCLK_OK))
-   __raw_writel(val | RTC_EXTCLK_EN, LS1X_RTC_CTRL);
-}
-
 /* CPUFreq */
 static struct plat_ls1x_cpufreq ls1x_cpufreq_pdata = {
.clk_name   = "cpu_clk",
@@ -357,6 +344,15 @@ struct platform_device ls1x_ehci_pdev = {
 };
 
 /* Real Time Clock */
+void __init ls1x_rtc_set_extclk(struct platform_device *pdev)
+{
+   u32 val;
+
+   val = __raw_readl(LS1X_RTC_CTRL);
+   if (!(val & RTC_EXTCLK_OK))
+   __raw_writel(val | RTC_EXTCLK_EN, LS1X_RTC_CTRL);
+}
+
 struct platform_device ls1x_rtc_pdev = {
.name   = "ls1x-rtc",
.id = -1,
-- 
1.9.1
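
The check in ls1x_rtc_set_extclk() can be exercised outside the kernel as a pure function. The sketch below is an illustration only: BIT() is redefined locally, rtc_ctrl_next() is a hypothetical helper that returns the value the control register would hold afterwards, and no real MMIO access takes place.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)		(1u << (n))
#define RTC_EXTCLK_OK	(BIT(5) | BIT(8))
#define RTC_EXTCLK_EN	BIT(8)

/*
 * Hypothetical helper: given the current control value, return the value
 * the register holds after the check. The enable bit is only set when
 * neither bit 5 nor bit 8 is already set.
 */
static uint32_t rtc_ctrl_next(uint32_t val)
{
	if (!(val & RTC_EXTCLK_OK))
		val |= RTC_EXTCLK_EN;
	return val;
}
```

Note that the test uses a two-bit mask, so a value with only bit 5 set is left untouched.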



Re: [RFC][PATCH 6/7] printk: use alternative printk buffers

2016-09-29 Thread Sergey Senozhatsky
On (09/29/16 15:00), Petr Mladek wrote:
[..]
> > @@ -1791,7 +1791,7 @@ asmlinkage int vprintk_emit(int facility, int level,
> > zap_locks();
> > }
> >  
> > -   lockdep_off();
> > +   alt_printk_enter();
> 
> IMHO, we can no longer enter vprintk_emit() recursively. The same
> section that was guarded by logbuf_cpu is guarded by
> alt_printk_enter()/exit() now.

you might be very right here. I'll take a look.

> IMHO, we could remove all the logic around the recursion. Then we
> could even disable/enable irqs inside alt_printk_enter()/exit().

I was thinking of doing something like this; but that would require
storing 'unsigned long' flags in per-cpu data

alt_enter()
{
unsigned long flags;

local_irq_save(flags);
ctx = this_cpu_ptr();
ctx->flags = flags;
...
}

alt_exit()
{
ctx = this_cpu_ptr();
...
local_irq_restore(ctx->flags);
}


and the decision was to keep `unsigned long flags' on stack in the
alt_enter/exit caller. besides in most of the cases we already have
it (in vprintk_emit() and console_unlock()).

but I can certainly hide these details in alt_enter/exit.


> And to correct myself from the previous mail. It is enough to disable
> IRQs. It is enough to make sure that we will not preempt and will
> stay on the same CPU.

ah, no prob.

> > @@ -2479,7 +2490,9 @@ void console_unlock(void)
> >  */
> > raw_spin_lock(&logbuf_lock);
> > retry = console_seq != log_next_seq;
> > -   raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> > +   raw_spin_unlock(&logbuf_lock);
> > +   alt_printk_exit();
> > +   local_irq_restore(flags);
> 
> We should mention that this patch makes an obsolete artefact from
> printk_deferred(). It opens the door for another big cleanup and
> relief.

do you mean that, once alt_printk is done properly, we can drop
printk_deferred()? I was thinking of it, but decided not to
mention/touch it in this patch set.

-ss


Re: [PATCH 1/3] arm64: dump: Make ptdump debugfs a separate option

2016-09-29 Thread Laura Abbott

On 09/29/2016 05:48 PM, Mark Rutland wrote:

On Thu, Sep 29, 2016 at 05:31:09PM -0700, Laura Abbott wrote:

On 09/29/2016 05:13 PM, Mark Rutland wrote:

On Thu, Sep 29, 2016 at 02:32:55PM -0700, Laura Abbott wrote:

+int ptdump_register(struct ptdump_info *info, const char *name)
+{
+   ptdump_initialize(info);
+   return ptdump_debugfs_create(info, name);
}


It feels like a layering violation to have the core ptdump code call the
debugfs ptdump code. Is there some reason this has to live here?


Which 'this' are you referring to here? Are you suggesting moving
the ptdump_register elsewhere or moving the debugfs create elsewhere?


Sorry, I should have worded that better.

I meant moving ptdump_register into ptdump_debugfs.c, perhaps renamed to make it
clear it's debugfs-specific.

We could instead update existing users to call ptdump_debugfs_create()
directly, and have that call ptdump_initialize(), which could itself become a
static inline in a header.


Ah okay, I see what you are suggesting. ptdump_initialize should still
happen regardless of debugfs status though so I guess 
ptdump_debugfs_create would just get turned into just ptdump_initialize
which seems a little unclear. I'll come up with some other shed 
colors^W^Wfunction names.


Thanks,
Laura



[PATCH] cpufreq: intel_pstate: Clarify comment in get_target_pstate_use_performance()

2016-09-29 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Make the comment explaining the meaning of the perf_scaled variable
in get_target_pstate_use_performance() more straightforward.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpufreq/intel_pstate.c |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -1251,10 +1251,11 @@ static inline int32_t get_target_pstate_
u64 duration_ns;
 
/*
-* perf_scaled is the average performance during the last sampling
-* period scaled by the ratio of the maximum P-state to the P-state
-* requested last time (in percent).  That measures the system's
-* response to the previous P-state selection.
+* perf_scaled is the ratio of the average P-state during the last
+* sampling period to the P-state requested last time (in percent).
+*
+* That measures the system's response to the previous P-state
+* selection.
 */
max_pstate = cpu->pstate.max_pstate_physical;
current_pstate = cpu->pstate.current_pstate;



Re: [RFC][PATCH 3/7] printk: introduce per-cpu alt_print seq buffer

2016-09-29 Thread Sergey Senozhatsky
On (09/29/16 14:26), Petr Mladek wrote:
[..]
> >  printk()
> >   local_irq_save()
> >   alt_printk_enter()
> 
> We need to make sure that exit() is called on the same CPU.
> Therefore we need to disable preemption as well.

local_irq_save() does this for us, we can't get sched tick or
re-sched IPI, and even more - we eliminate race conditions on
this CPU. only one path can touch alt_printk related stuff,
NMI works with its own buffer.

[..]
> What do you think about my approach with the printk_context per-CPU
> value from the WARN_DEFERRED() patchset? The main idea is that
> the entry()/exit() functions manipulate preempt_count-like per-CPU
> variable. The printk() function selects the safe implementation
> according to the current state.

I'll take a look.

hm, what I was thinking of... you are right, this all smells a bit
bad. I'll revisit it.

thanks!

-ss


Re: [PATCH 1/2] f2fs: use crc and cp version to determine roll-forward recovery

2016-09-29 Thread Chao Yu
On 2016/9/30 8:53, Jaegeuk Kim wrote:
> On Thu, Sep 29, 2016 at 08:01:32PM +0800, Chao Yu wrote:
>> On 2016/9/20 10:55, Jaegeuk Kim wrote:
>>> Previously, we used cp_version only to detect recoverable dnodes.
>>> In order to avoid same garbage cp_version, we needed to truncate the next
>>> dnode during checkpoint, resulting in additional discard or data write.
>>> If we can distinguish this by using crc in addition to cp_version, we can
>>> remove this overhead.
>>>
>>> There is backward compatibility concern where it changes node_footer layout.
>>> But, it only affects the direct nodes written after the last checkpoint.
>>> We simply expect that user would change kernel versions back and forth after
>>> stable checkpoint.
>>
>> Seems with new released v4.8 f2fs, old image with recoverable data could be
>> mounted successfully, but meanwhile all fsynced data which needs to be
>> recovered will be lost w/o any hints?
>>
>> Could we release a new version mkfs paired with new kernel module, so we can
>> tag image as a new layout one, then new kernel module can recognize the image
>> layout and adjust version suited comparing method with old or new image?
> 
> Hmm, how about adding a checkpoint flag like CP_CRC_RECOVERY_FLAG?
> Then, we can proceed crc|cp_ver, if the last checkpoint has this flag.
> 
> Any thought?

Ah, that's better. :)

Thanks,

> 
>>
>> Thanks,
>>
>>
> 
> .
> 
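
The crc|cp_ver idea discussed above can be sketched as follows. The thread does not say how the two values are combined or what value the flag has, so everything here is an assumption for illustration: the CRC is placed in the upper 32 bits over the low 32 bits of the checkpoint version, the bit value of CP_CRC_RECOVERY_FLAG is made up, and recovery_tag() and is_recoverable() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CP_CRC_RECOVERY_FLAG 0x40u	/* hypothetical bit value */

/*
 * Value a direct node would have to carry to count as recoverable.
 * Without the flag (old layout) only the checkpoint version is compared;
 * with the flag the checkpoint CRC is mixed in as well.
 */
static uint64_t recovery_tag(uint64_t cp_ver, uint32_t crc, uint32_t ckpt_flags)
{
	if (ckpt_flags & CP_CRC_RECOVERY_FLAG)
		return ((uint64_t)crc << 32) | (cp_ver & 0xffffffffu);
	return cp_ver;
}

static bool is_recoverable(uint64_t node_tag, uint64_t cp_ver, uint32_t crc,
			   uint32_t ckpt_flags)
{
	return node_tag == recovery_tag(cp_ver, crc, ckpt_flags);
}
```

With the flag set, a stale node that happens to carry the same version number but not the matching CRC is rejected, which is the property the original patch description relies on to drop the truncation of the next dnode.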



[PATCH v2] x86/entry/64: Fix context tracking state warning when load_gs_index fails

2016-09-29 Thread Wanpeng Li
From: Wanpeng Li 

 WARNING: CPU: 0 PID: 3331 at arch/x86/entry/common.c:45 enter_from_user_mode+0x32/0x50
 CPU: 0 PID: 3331 Comm: ldt_gdt_64 Not tainted 4.8.0-rc7+ #13
 Call Trace:
  dump_stack+0x99/0xd0
  __warn+0xd1/0xf0
  warn_slowpath_null+0x1d/0x20
  enter_from_user_mode+0x32/0x50
  error_entry+0x6d/0xc0
  ? general_protection+0x12/0x30
  ? native_load_gs_index+0xd/0x20
  ? do_set_thread_area+0x19c/0x1f0
  SyS_set_thread_area+0x24/0x30
  do_int80_syscall_32+0x7c/0x220
  entry_INT80_compat+0x38/0x50

This can be reproduced by running the GS testcase of the ldt_gdt test unit
in selftests.

do_int80_syscall_32() calls enter_from_user_mode() to convert the context
tracking state from user state to kernel state. load_gs_index can fail
with user gsbase; if this happens, gsbase is fixed up and execution
proceeds. However, enter_from_user_mode() is then called again in the
fixup path even though context tracking is already in kernel state.

This patch fixes it by just fixing up gsbase and telling lockdep that IRQs
are off once load_gs_index has failed with user gsbase.

Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: "H. Peter Anvin" 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Signed-off-by: Wanpeng Li 
---
v1 -> v2:
 * more readable 

 arch/x86/entry/entry_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d172c61..02fff3e 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1002,7 +1002,6 @@ ENTRY(error_entry)
testb   $3, CS+8(%rsp)
jz  .Lerror_kernelspace
 
-.Lerror_entry_from_usermode_swapgs:
/*
 * We entered from user mode or we're pretending to have entered
 * from user mode due to an IRET fault.
@@ -1045,7 +1044,8 @@ ENTRY(error_entry)
 * gsbase and proceed.  We'll fix up the exception and land in
 * .Lgs_change's error handler with kernel gsbase.
 */
-   jmp .Lerror_entry_from_usermode_swapgs
+   SWAPGS
+   jmp .Lerror_entry_done
 
 .Lbstep_iret:
/* Fix truncated RIP */
-- 
1.9.1



[RFC PATCH v1 1/2] printk: collect offsets into replaceable structure

2016-09-29 Thread Sean Hudson
Currently, printk relies on several indices that are declared as static
global variables in printk.c.  This patch collects those into a single
structure referenced by a pointer.  This allows easier replacement of
these indices and pinning to a specific location.

Signed-off-by: Sean Hudson 
---
 kernel/printk/printk.c | 335 +
 1 file changed, 172 insertions(+), 163 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index eea6dbc..7a441f5 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -353,28 +353,6 @@ DEFINE_RAW_SPINLOCK(logbuf_lock);
 
 #ifdef CONFIG_PRINTK
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
-/* the next printk record to read by syslog(READ) or /proc/kmsg */
-static u64 syslog_seq;
-static u32 syslog_idx;
-static enum log_flags syslog_prev;
-static size_t syslog_partial;
-
-/* index and sequence number of the first record stored in the buffer */
-static u64 log_first_seq;
-static u32 log_first_idx;
-
-/* index and sequence number of the next record to store in the buffer */
-static u64 log_next_seq;
-static u32 log_next_idx;
-
-/* the next printk record to write to the console */
-static u64 console_seq;
-static u32 console_idx;
-static enum log_flags console_prev;
-
-/* the next printk record to read after the last 'clear' command */
-static u64 clear_seq;
-static u32 clear_idx;
 
 #define PREFIX_MAX 32
 #define LOG_LINE_MAX   (1024 - PREFIX_MAX)
@@ -386,19 +364,54 @@ static u32 clear_idx;
 #define LOG_ALIGN __alignof__(struct printk_log)
 #define __LOG_BUF_LEN (1 << CONFIG_LOG_BUF_SHIFT)
 static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
-static char *log_buf = __log_buf;
-static u32 log_buf_len = __LOG_BUF_LEN;
+
+/*
+ * This control block collects tracking offsets for the log.
+ */
+struct lcb_t {
+   /* Pointer to log buffer space and length of space */
+   char *log_buf;
+   u32 log_buf_len;
+
+   /* index and sequence of the first record stored in the buffer */
+   u64 log_first_seq;
+   u32 log_first_idx;
+
+   /* index and sequence of the next record to store in the buffer */
+   u64 log_next_seq;
+   u32 log_next_idx;
+
+   /* the next printk record to read by syslog(READ) or /proc/kmsg */
+   u64 syslog_seq;
+   u32 syslog_idx;
+   enum log_flags syslog_prev;
+   size_t syslog_partial;
+
+   /* the next printk record to write to the console */
+   u64 console_seq;
+   u32 console_idx;
+   enum log_flags console_prev;
+
+   /* the next printk record to read after the last 'clear' command */
+   u64 clear_seq;
+   u32 clear_idx;
+} lcb_t;
+
+static struct lcb_t __lcb = {
+   __log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+
+static struct lcb_t *lcb = &__lcb;
 
 /* Return log buffer address */
 char *log_buf_addr_get(void)
 {
-   return log_buf;
+   return lcb->log_buf;
 }
 
 /* Return log buffer size */
 u32 log_buf_len_get(void)
 {
-   return log_buf_len;
+   return lcb->log_buf_len;
 }
 
 /* human readable text of the record */
@@ -416,21 +429,21 @@ static char *log_dict(const struct printk_log *msg)
 /* get record by index; idx must point to valid msg */
 static struct printk_log *log_from_idx(u32 idx)
 {
-   struct printk_log *msg = (struct printk_log *)(log_buf + idx);
+   struct printk_log *msg = (struct printk_log *)(lcb->log_buf + idx);
 
/*
 * A length == 0 record is the end of buffer marker. Wrap around and
 * read the message at the start of the buffer.
 */
if (!msg->len)
-   return (struct printk_log *)log_buf;
+   return (struct printk_log *)lcb->log_buf;
return msg;
 }
 
 /* get next record; idx must point to valid msg */
 static u32 log_next(u32 idx)
 {
-   struct printk_log *msg = (struct printk_log *)(log_buf + idx);
+   struct printk_log *msg = (struct printk_log *)(lcb->log_buf + idx);
 
/* length == 0 indicates the end of the buffer; wrap */
/*
@@ -439,7 +452,7 @@ static u32 log_next(u32 idx)
 * return the one after that.
 */
if (!msg->len) {
-   msg = (struct printk_log *)log_buf;
+   msg = (struct printk_log *)lcb->log_buf;
return msg->len;
}
return idx + msg->len;
@@ -458,10 +471,11 @@ static int logbuf_has_space(u32 msg_size, bool empty)
 {
u32 free;
 
-   if (log_next_idx > log_first_idx || empty)
-   free = max(log_buf_len - log_next_idx, log_first_idx);
+   if (lcb->log_next_idx > lcb->log_first_idx || empty)
+   free = max(lcb->log_buf_len - lcb->log_next_idx,
+   lcb->log_first_idx);
else
-   free = log_first_idx - log_next_idx;
+   free = lcb->log_first_idx - lcb->log_next_idx;
 
/*
 * We need space also 
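
The pattern this patch applies, reduced to a user-space sketch: the tracking state lives in one structure, all accesses go through a pointer, and swapping the pointer redirects the log. The free-space arithmetic mirrors the logbuf_has_space() hunk above; the names log_ctl, ctl, log_free_space() and log_use_ctl() are hypothetical and only a subset of the fields is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced version of the control block. */
struct log_ctl {
	char *buf;
	uint32_t buf_len;
	uint32_t first_idx;	/* index of the oldest record */
	uint32_t next_idx;	/* index where the next record goes */
};

static char default_buf[64];
static struct log_ctl default_ctl = {
	.buf = default_buf,
	.buf_len = sizeof(default_buf),
};

/* Every access goes through this pointer, so it can be redirected. */
static struct log_ctl *ctl = &default_ctl;

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Largest contiguous free region, following logbuf_has_space(). */
static uint32_t log_free_space(int empty)
{
	if (ctl->next_idx > ctl->first_idx || empty)
		return max_u32(ctl->buf_len - ctl->next_idx, ctl->first_idx);
	return ctl->first_idx - ctl->next_idx;
}

/* Point the log at a different control block. */
static void log_use_ctl(struct log_ctl *new_ctl)
{
	ctl = new_ctl;
}
```

Because callers never touch the globals directly, replacing the default block with one at another address needs no change in the code that reads or writes the indices.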

[RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)

2016-09-29 Thread Sean Hudson
This debug feature provides a convenient way to collect log entries across
multiple, warmboot cycles and to share those entries with a boot loader.
It allows the kernel to use an external buffer for kernel log messages and
is controlled by an optional command line parameter. The buffer can contain
existing log messages from previous boot cycles and/or the bootloader. The
command line parameter was chosen for flexibility, cross arch portability,
and the ability to dynamically enable/disable this feature. The parameter
specifies the address of a control block used to replace the default log
buffer.  Existing bootloader and kernel log messages are kept, in order,
inside the new buffer.  After a boot that preserves the buffer contents, a
bootloader can display both kernel and bootloader log entries from
multiple, previous boots. It also allows the kernel to display bootloader
log entries along with its own messages.

This feature is intended for debug purposes and has no effect unless the
command line parameter is specified.  Further, it validates the passed
control block carefully and if any checks fail, it falls back to the
default behaviour.  As such, it can be left enabled by default.

Memory Reservation
--
This feature expects the bootloader to reserve/preserve the shared buffer
memory. This reservation needs to prevent the kernel from overwriting the
external log control block and log entries. In my testing, I've used the
'fdt' commands in uboot to dynamically inject reserved memory regions via
the DT to the kernel.

Based on the initial work of Wolfgang Denk and Igor Lisitsin [1].
Also based on work by: Alexander Streit 


[1] http://git.denx.de/?p=linux-2.6-denx.git;a=commitdiff;h=212f61c7fd3b952a81d1459dd32a86a32ddfd4ce

Signed-off-by: Sean Hudson 
---
 init/Kconfig   |  12 +++
 init/main.c|   2 +
 kernel/printk/printk.c | 267 +++--
 3 files changed, 275 insertions(+), 6 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index cac3f09..746183b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1493,6 +1493,18 @@ config PRINTK
  very difficult to diagnose system problems, saying N here is
  strongly discouraged.
 
+config LOGBUFFER
+   bool "External logbuffer" if PRINTK
+   default n
+   depends on PRINTK
+   help
+ This option enables support for an alternative, "external" printk log
+ buffer. If memory contents are preserved, e.g. after a warmboot, this
+ provides a known location for the boot loader to read and display 
printk
+ entries from the kernel.  If desired, the bootloader can write its own
+ log entries which the kernel will display with its own log entries.
+ Further, this capability can be used across multiple warmboot cycles.
+
 config PRINTK_NMI
def_bool y
depends on PRINTK
diff --git a/init/main.c b/init/main.c
index a8a58e2..4a5913c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -93,6 +93,7 @@ static int kernel_init(void *);
 extern void init_IRQ(void);
 extern void fork_init(void);
 extern void radix_tree_init(void);
+extern void setup_ext_logbuff(void);
 
 /*
  * Debug helper: via this flag we know that we are in 'early bootup code'
@@ -535,6 +536,7 @@ asmlinkage __visible void __init start_kernel(void)
sort_main_extable();
trap_init();
mm_init();
+   setup_ext_logbuff();
 
/*
 * Set up the scheduler prior starting any interrupts (such as the
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 7a441f5..017b4d4 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -331,17 +331,17 @@ enum log_flags {
 };
 
 struct printk_log {
-   u64 ts_nsec;/* timestamp in nanoseconds */
+#ifdef CONFIG_LOGBUFFER
+   u32 log_magic;  /* sanity check number */
+#endif
u16 len;/* length of entire record */
u16 text_len;   /* length of text buffer */
u16 dict_len;   /* length of dictionary buffer */
u8 facility;/* syslog facility */
u8 flags:5; /* internal record flags */
u8 level:3; /* syslog level */
+   u64 ts_nsec;/* timestamp in nanoseconds */
 }
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-__packed __aligned(4)
-#endif
 ;
 
 /*
@@ -366,7 +366,14 @@ DECLARE_WAIT_QUEUE_HEAD(log_wait);
 static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
 
 /*
- * This control block collects tracking offsets for the log.
+ * This control block collects tracking offsets for the log into a single
+ * place.  It also facilitates pointing the log to another location, and,
+ * when combined with the CONFIG_LOGBUFFER feature, it allows log sharing
+ * between the bootloader and the kernel.
+ *
+ * NOTE:
+ *   By convention, the control 

[RFC PATCH v1 0/2] printk: Shared kernel logging

2016-09-29 Thread Sean Hudson
This patch set is based on Linus' v4.8-rc8 tag.

This debug feature allows the kernel to use an external buffer and control
block for kernel log messages. The feature is controlled by an optional
command line parameter. The existing buffer and control block can contain
existing log messages from previous boot cycles and/or the bootloader. The
command line parameter was chosen for flexibility, cross arch portability,
and the ability to dynamically enable/disable this feature. The parameter
specifies the address of a control block used to replace the default log
buffer. Existing bootloader and kernel log messages are kept, in order,
inside the new buffer. After a boot that preserves the buffer contents, a
bootloader can display both kernel and bootloader log entries from multiple,
previous boots. It also allows the kernel to display bootloader log entries
along with its own messages.

This feature is intended for debug purposes and has no effect unless the
command line parameter is specified. Further, it validates the passed
control block carefully and if any checks fail, it falls back to the default
behaviour. As such, it can be left enabled by default.
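
A minimal userspace sketch of the "validate, else fall back" behaviour
described above. The control block layout, the magic value, and the
power-of-two length rule are all assumptions made for illustration; the
structure actually used by the patch may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic value, invented for this sketch. */
#define LCB_MAGIC 0x4c434231u

/* Hypothetical control block layout; not taken from the patch. */
struct log_cb {
	uint32_t magic;     /* identifies a valid control block */
	uint32_t buf_len;   /* size of the log buffer in bytes */
	uint32_t first_idx; /* offset of the oldest record */
	uint32_t next_idx;  /* offset where the next record goes */
};

/* Return 1 if the external control block passes every check, else 0. */
static int log_cb_valid(const struct log_cb *cb)
{
	if (!cb || cb->magic != LCB_MAGIC)
		return 0;
	/* Assumed rule: length is a non-zero power of two. */
	if (cb->buf_len == 0 || (cb->buf_len & (cb->buf_len - 1)))
		return 0;
	/* Both tracking offsets must point inside the buffer. */
	if (cb->first_idx >= cb->buf_len || cb->next_idx >= cb->buf_len)
		return 0;
	return 1;
}

/* Use the external block when it is valid; otherwise keep the default. */
static const struct log_cb *log_cb_select(const struct log_cb *ext,
					  const struct log_cb *dflt)
{
	assert(dflt != NULL);
	return log_cb_valid(ext) ? ext : dflt;
}
```

Because a failed check only selects the built-in block, a bad or missing
address on the command line leaves logging in its default state.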

Memory Reservation

This feature expects the bootloader to reserve/preserve the shared buffer
memory. This reservation needs to prevent the kernel from overwriting the
external log control block and log entries. In my testing, I've used the
'fdt' commands in uboot to dynamically inject reserved memory regions via
the DT to the kernel.

Sean Hudson (2):
  printk: collect offsets into replaceable structure
  printk: external log buffer (CONFIG_LOGBUFFER)

 init/Kconfig   |  12 +
 init/main.c|   2 +
 kernel/printk/printk.c | 598 +++--
 3 files changed, 445 insertions(+), 167 deletions(-)

-- 
1.9.1




[PATCH v4 RESEND] f2fs: introduce get_checkpoint_version for cleanup

2016-09-29 Thread Tiezhu Yang
The code that gets the values of pre_version and cur_version in
validate_checkpoint() is almost identical; this patch adds
get_checkpoint_version() to remove the duplication.

Signed-off-by: Tiezhu Yang 
---
 fs/f2fs/checkpoint.c | 66 ++--
 1 file changed, 38 insertions(+), 28 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index de8693c..764ed0b 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -663,45 +663,55 @@ static void write_orphan_inodes(struct f2fs_sb_info *sbi, block_t start_blk)
}
 }
 
-static struct page *validate_checkpoint(struct f2fs_sb_info *sbi,
-   block_t cp_addr, unsigned long long *version)
+static int get_checkpoint_version(struct f2fs_sb_info *sbi, block_t cp_addr,
+   struct f2fs_checkpoint **cp_block, struct page **cp_page,
+   unsigned long long *version)
 {
-   struct page *cp_page_1, *cp_page_2 = NULL;
unsigned long blk_size = sbi->blocksize;
-   struct f2fs_checkpoint *cp_block;
-   unsigned long long cur_version = 0, pre_version = 0;
-   size_t crc_offset;
+   size_t crc_offset = 0;
__u32 crc = 0;
 
-   /* Read the 1st cp block in this CP pack */
-   cp_page_1 = get_meta_page(sbi, cp_addr);
+   *cp_page = get_meta_page(sbi, cp_addr);
+   *cp_block = (struct f2fs_checkpoint *)page_address(*cp_page);
 
-   /* get the version number */
-   cp_block = (struct f2fs_checkpoint *)page_address(cp_page_1);
-   crc_offset = le32_to_cpu(cp_block->checksum_offset);
-   if (crc_offset >= blk_size)
-   goto invalid_cp1;
+   crc_offset = le32_to_cpu((*cp_block)->checksum_offset);
+   if (crc_offset >= blk_size) {
+   f2fs_msg(sbi->sb, KERN_WARNING,
+   "invalid crc_offset: %zu", crc_offset);
+   return -EINVAL;
+   }
 
-   crc = le32_to_cpu(*((__le32 *)((unsigned char *)cp_block + crc_offset)));
-   if (!f2fs_crc_valid(sbi, crc, cp_block, crc_offset))
-   goto invalid_cp1;
+   crc = le32_to_cpu(*((__le32 *)((unsigned char *)*cp_block
+   + crc_offset)));
+   if (!f2fs_crc_valid(sbi, crc, *cp_block, crc_offset)) {
+   f2fs_msg(sbi->sb, KERN_WARNING, "invalid crc value");
+   return -EINVAL;
+   }
 
-   pre_version = cur_cp_version(cp_block);
+   *version = cur_cp_version(*cp_block);
+   return 0;
+}
 
-   /* Read the 2nd cp block in this CP pack */
-   cp_addr += le32_to_cpu(cp_block->cp_pack_total_block_count) - 1;
-   cp_page_2 = get_meta_page(sbi, cp_addr);
+static struct page *validate_checkpoint(struct f2fs_sb_info *sbi,
+   block_t cp_addr, unsigned long long *version)
+{
+   struct page *cp_page_1 = NULL, *cp_page_2 = NULL;
+   struct f2fs_checkpoint *cp_block = NULL;
+   unsigned long long cur_version = 0, pre_version = 0;
+   int err;
 
-   cp_block = (struct f2fs_checkpoint *)page_address(cp_page_2);
-   crc_offset = le32_to_cpu(cp_block->checksum_offset);
-   if (crc_offset >= blk_size)
-   goto invalid_cp2;
+   err = get_checkpoint_version(sbi, cp_addr, &cp_block,
+   &cp_page_1, version);
+   if (err)
+   goto invalid_cp1;
+   pre_version = *version;
 
-   crc = le32_to_cpu(*((__le32 *)((unsigned char *)cp_block + crc_offset)));
-   if (!f2fs_crc_valid(sbi, crc, cp_block, crc_offset))
+   cp_addr += le32_to_cpu(cp_block->cp_pack_total_block_count) - 1;
+   err = get_checkpoint_version(sbi, cp_addr, &cp_block,
+   &cp_page_2, version);
+   if (err)
goto invalid_cp2;
-
-   cur_version = cur_cp_version(cp_block);
+   cur_version = *version;
 
if (cur_version == pre_version) {
*version = cur_version;
-- 
1.8.3.1

Re:[PATCH v4] f2fs: introduce get_checkpoint_version for cleanup

2016-09-29 Thread Tiezhu Yang
At 2016-09-30 07:47:58, "Tiezhu Yang"  wrote:
>The code that gets the values of pre_version and cur_version in
>validate_checkpoint() is almost identical; this patch adds
>get_checkpoint_version() to remove the duplication.
>
>Signed-off-by: Tiezhu Yang 
>---
> fs/f2fs/checkpoint.c | 66 ++--
> 1 file changed, 38 insertions(+), 28 deletions(-)
>
>diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>index de8693c..764ed0b 100644
>--- a/fs/f2fs/checkpoint.c
>+++ b/fs/f2fs/checkpoint.c
>@@ -663,45 +663,55 @@ static void write_orphan_inodes(struct f2fs_sb_info *sbi, block_t start_blk)
>   }
> }
> 
>-static struct page *validate_checkpoint(struct f2fs_sb_info *sbi,
>-  block_t cp_addr, unsigned long long *version)
>+static int get_checkpoint_version(struct f2fs_sb_info *sbi, block_t cp_addr,
>+  struct f2fs_checkpoint **cp_block, struct page **cp_page,
>+  unsigned long long *version)
> {
>-  struct page *cp_page_1, *cp_page_2 = NULL;
>   unsigned long blk_size = sbi->blocksize;
>-  struct f2fs_checkpoint *cp_block;
>-  unsigned long long cur_version = 0, pre_version = 0;
>-  size_t crc_offset;
>+  size_t crc_offset = 0;
>   __u32 crc = 0;
> 
>-  /* Read the 1st cp block in this CP pack */
>-  cp_page_1 = get_meta_page(sbi, cp_addr);
>+  *cp_page = get_meta_page(sbi, cp_addr);
>+  *cp_block = (struct f2fs_checkpoint *)page_address(*cp_page);
> 
>-  /* get the version number */
>-  cp_block = (struct f2fs_checkpoint *)page_address(cp_page_1);
>-  crc_offset = le32_to_cpu(cp_block->checksum_offset);
>-  if (crc_offset >= blk_size)
>-  goto invalid_cp1;
>+  crc_offset = le32_to_cpu((*cp_block)->checksum_offset);
>+  if (crc_offset >= blk_size) {
>+  f2fs_msg(sbi->sb, KERN_WARNING,
>+  "invalid crc_offset: %zu.", crc_offset);

Sorry, the full stop '.' is unnecessary; I will resend the patch.

Thanks,
>+  return -EINVAL;
>+  }
> 
>-  crc = le32_to_cpu(*((__le32 *)((unsigned char *)cp_block + crc_offset)));
>-  if (!f2fs_crc_valid(sbi, crc, cp_block, crc_offset))
>-  goto invalid_cp1;
>+  crc = le32_to_cpu(*((__le32 *)((unsigned char *)*cp_block
>+  + crc_offset)));
>+  if (!f2fs_crc_valid(sbi, crc, *cp_block, crc_offset)) {
>+  f2fs_msg(sbi->sb, KERN_WARNING, "invalid crc value.");

ditto

>+  return -EINVAL;
>+  }
> 
>-  pre_version = cur_cp_version(cp_block);
>+  *version = cur_cp_version(*cp_block);
>+  return 0;
>+}
> 
>-  /* Read the 2nd cp block in this CP pack */
>-  cp_addr += le32_to_cpu(cp_block->cp_pack_total_block_count) - 1;
>-  cp_page_2 = get_meta_page(sbi, cp_addr);
>+static struct page *validate_checkpoint(struct f2fs_sb_info *sbi,
>+  block_t cp_addr, unsigned long long *version)
>+{
>+  struct page *cp_page_1 = NULL, *cp_page_2 = NULL;
>+  struct f2fs_checkpoint *cp_block = NULL;
>+  unsigned long long cur_version = 0, pre_version = 0;
>+  int err;
> 
>-  cp_block = (struct f2fs_checkpoint *)page_address(cp_page_2);
>-  crc_offset = le32_to_cpu(cp_block->checksum_offset);
>-  if (crc_offset >= blk_size)
>-  goto invalid_cp2;
>+  err = get_checkpoint_version(sbi, cp_addr, &cp_block,
>+  &cp_page_1, version);
>+  if (err)
>+  goto invalid_cp1;
>+  pre_version = *version;
> 
>-  crc = le32_to_cpu(*((__le32 *)((unsigned char *)cp_block + crc_offset)));
>-  if (!f2fs_crc_valid(sbi, crc, cp_block, crc_offset))
>+  cp_addr += le32_to_cpu(cp_block->cp_pack_total_block_count) - 1;
>+  err = get_checkpoint_version(sbi, cp_addr, &cp_block,
>+  &cp_page_2, version);
>+  if (err)
>   goto invalid_cp2;
>-
>-  cur_version = cur_cp_version(cp_block);
>+  cur_version = *version;
> 
>   if (cur_version == pre_version) {
>   *version = cur_version;
>-- 
>1.8.3.1


Re: [PATCH 1/2] f2fs: use crc and cp version to determine roll-forward recovery

2016-09-29 Thread Jaegeuk Kim
On Thu, Sep 29, 2016 at 08:01:32PM +0800, Chao Yu wrote:
> On 2016/9/20 10:55, Jaegeuk Kim wrote:
> > Previously, we used cp_version only to detect recoverable dnodes.
> > In order to avoid same garbage cp_version, we needed to truncate the next
> > dnode during checkpoint, resulting in additional discard or data write.
> > If we can distinguish this by using crc in addition to cp_version, we can
> > remove this overhead.
> > 
> > There is backward compatibility concern where it changes node_footer layout.
> > But, it only affects the direct nodes written after the last checkpoint.
> > We simply expect that user would change kernel versions back and forth after
> > stable checkpoint.
> 
> It seems that with the newly released v4.8 f2fs, an old image with
> recoverable data can be mounted successfully, but all fsynced data that
> needs to be recovered will be lost without any hint?
> 
> Could we release a new mkfs version paired with the new kernel module, so
> that we can tag an image as using the new layout? The new kernel module
> could then recognize the image layout and pick the version comparison
> method that suits an old or a new image.

Hmm, how about adding a checkpoint flag like CP_CRC_RECOVERY_FLAG?
Then we can proceed with crc|cp_ver if the last checkpoint has this flag.

Any thought?
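
A rough sketch of how such a flag could gate the comparison. The flag
value, the helper names, and the choice to put the crc in the upper 32
bits are assumptions for illustration only; the thread proposes just the
flag name and the crc|cp_ver idea.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit value; the thread only proposes the flag name. */
#define CP_CRC_RECOVERY_FLAG 0x40u

/*
 * Build the value a node footer would be compared against during
 * roll-forward recovery. Assumption: the crc occupies the upper 32 bits
 * and the checkpoint version the lower 32 bits.
 */
static uint64_t recovery_tag(uint64_t cp_ver, uint32_t crc,
			     uint32_t ckpt_flags)
{
	/* Old layout: the last checkpoint lacks the flag, use cp_ver only. */
	if (!(ckpt_flags & CP_CRC_RECOVERY_FLAG))
		return cp_ver;

	return ((uint64_t)crc << 32) | (cp_ver & 0xffffffffu);
}

/* A dnode counts as recoverable when its stored tag matches. */
static int is_recoverable(uint64_t footer_tag, uint64_t cp_ver,
			  uint32_t crc, uint32_t ckpt_flags)
{
	return footer_tag == recovery_tag(cp_ver, crc, ckpt_flags);
}
```

With this shape, an image whose last checkpoint was written by an older
kernel keeps the plain cp_ver comparison, which is the compatibility
concern raised above.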

> 
> Thanks,
> 
> 


Re: [PATCH PoC 0/7] mmc: switch to blk-mq

2016-09-29 Thread Linus Walleij
On Thu, Sep 22, 2016 at 6:57 AM, Bartlomiej Zolnierkiewicz
 wrote:

> Since Linus Walleij is also working on that and I won't
> probably have time to touch this code till the end of
> upcoming month, here it is (basically a code dump of my
> proof-of-concept work).  I hope that it would be useful
> to somebody.
>
> It is extremely ugly & full of bogus debug code but boots
> fine on my Odroid-XU3 and benchmarks can be run.

Haha, it is still good discussion material.

FWIW your patchset is way more advanced than whatever I
cooked up, and the approach you took is way better: first
rip out async requests, then add an mq callback block, and
finally add async requests back after adding a function to
monitor whether the queue is busy.

I sat down with Ulf Hansson and Arnd Bergmann to discuss the
material and issues we face if/when migrating the MMC/SD code
to blk-mq.

Just for context to everyone: MMC/SD has asynchronous
request handling that calls all the way into the driver
to do some DMA mapping (flush) of SG lists with dma_map_sg()
before the hardware starts processing the actual request. There
is a post_req() callback as well, performing dma_unmap_sg().

This is mostly a non-issue on coherent memory architectures
like x86, but gives a nice performance boost on ARM (etc)
systems. In theory the callback could be used for other stuff
but all current drivers ultimately call
dma_map_sg()/dma_unmap_sg().

The interesting solution to achieve asynchronous requests,
a.k.a. double-buffering a.k.a. request pipelining is basically this
from the last patch:

-   mq->qdepth = 1;
+   mq->qdepth = 2;

So we claim that the hardware queue has a depth of two
requests but well... that is not really true. If we start confusing
concepts like this to get parallelism, what shall we set this
to when we exploit command queueing and actually have a
queue depth of say 64? That will result in a pile of hacks.

The proper solution would be to augment struct blk_mq_ops
vtable with a .pre_queue_rq() and .post_complete_rq() or
something.
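
A userspace sketch of what such hooks could look like. The ops table,
the hook signatures, and the trace helper are hypothetical; struct
blk_mq_ops has no such members, and the real map/unmap work would be
dma_map_sg()/dma_unmap_sg() in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct req {
	int id;     /* single digit, used only for the trace */
	int mapped; /* 1 while the (pretend) DMA mapping exists */
};

/* Hypothetical ops table; these hooks are not in struct blk_mq_ops. */
struct mq_ops_sketch {
	void (*pre_queue_rq)(struct req *rq);     /* e.g. map the SG list */
	void (*queue_rq)(struct req *rq);         /* start the hardware */
	void (*post_complete_rq)(struct req *rq); /* e.g. unmap the SG list */
};

static char trace[64];
static size_t trace_len;

/* Append one event, such as "P0", to the trace string. */
static void trace_add(char event, int id)
{
	assert(id >= 0 && id <= 9);
	assert(trace_len + 2 < sizeof(trace));
	trace[trace_len++] = event;
	trace[trace_len++] = (char)('0' + id);
	trace[trace_len] = '\0';
}

static void sketch_pre(struct req *rq)
{
	rq->mapped = 1;
	trace_add('P', rq->id);
}

static void sketch_queue(struct req *rq)
{
	assert(rq->mapped); /* must be prepared before it is issued */
	trace_add('Q', rq->id);
}

static void sketch_post(struct req *rq)
{
	rq->mapped = 0;
	trace_add('U', rq->id);
}

static const struct mq_ops_sketch sketch_ops = {
	.pre_queue_rq = sketch_pre,
	.queue_rq = sketch_queue,
	.post_complete_rq = sketch_post,
};

/*
 * Issue one request at a time (queue depth 1), but prepare request
 * i + 1 while request i is still in flight.
 */
static void run_pipeline(const struct mq_ops_sketch *ops,
			 struct req *rqs, size_t n)
{
	size_t i;

	trace_len = 0;
	trace[0] = '\0';
	if (n == 0)
		return;

	ops->pre_queue_rq(&rqs[0]);
	for (i = 0; i < n; i++) {
		ops->queue_rq(&rqs[i]);
		if (i + 1 < n)
			ops->pre_queue_rq(&rqs[i + 1]);
		ops->post_complete_rq(&rqs[i]);
	}
}
```

The point of the sketch is that the overlap comes from the extra hooks,
so the advertised queue depth can stay at the real hardware depth.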

The way I read the code, the init_request() and exit_request()
callbacks cannot be used, as they only deal with allocating the
struct, and this seems to happen before the request is actually
filled in with the data (correct me if I don't understand this right!).
This seems to be confirmed by the presence of a .reinit_request()
callback. So we can't map/unmap the requests in these
callbacks.

We noted that this DMA map/unmap optimization can also be
applicable to USB mass storage, so we get an optimization
from the MQ block layer that we can reuse in more than
MMC/SD.

After this we will still run into the same issue that you find after
this patchset: regressions in performance because of the
absence of an elevator/scheduler algorithm in blk-mq. So we
cannot really apply the patch set before or at the same time
as we're fixing that.

Apart from that we saw some really arcane things in the
MMC/SD core, mmc_claim_host() being the most obvious
example: as far as we can tell it is some kind of reimplementation
of mutex_trylock(). Some serious cleanup may be needed here.
It's nice that your first patch rips out the quirky kthread that
polls the block queue for new requests and sends them down
to the mmc core, including picking out a few NULL requests
and flushing its async work queue with that.

Yours,
Linus Walleij


Re: [PATCH 1/3] arm64: dump: Make ptdump debugfs a separate option

2016-09-29 Thread Mark Rutland
On Thu, Sep 29, 2016 at 05:31:09PM -0700, Laura Abbott wrote:
> On 09/29/2016 05:13 PM, Mark Rutland wrote:
> >On Thu, Sep 29, 2016 at 02:32:55PM -0700, Laura Abbott wrote:
> >>+int ptdump_register(struct ptdump_info *info, const char *name)
> >>+{
> >>+   ptdump_initialize(info);
> >>+   return ptdump_debugfs_create(info, name);
> >> }
> >
> >It feels like a layering violation to have the core ptdump code call the
> >debugfs ptdump code. Is there some reason this has to live here?
> 
> Which 'this' are you referring to here? Are you suggesting moving
> the ptdump_register elsewhere or moving the debugfs create elsewhere?

Sorry, I should have worded that better.

I meant moving ptdump_register into ptdump_debugfs.c, perhaps renamed to make it
clear it's debugfs-specific.

We could instead update existing users to call ptdump_debugfs_create()
directly, and have that call ptdump_initialize(), which could itself become a
static inline in a header.

Thanks,
Mark.


Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()

2016-09-29 Thread Sergey Senozhatsky
On (09/29/16 13:28), Petr Mladek wrote:
> On Wed 2016-09-28 10:18:45, Sergey Senozhatsky wrote:
> > On (09/27/16 18:02), Petr Mladek wrote:
> > > The main trick is that we replace the per-CPU function pointer
> > > by a preempt_count-like variable that could track the printk context.
> > > 
> > > I know that Sergey has other ideas in this area. But I wanted to see
> > > what this approach would look like.
> > 
> > well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I
> > think, the maintenance cost of that solution is just too high:
> > 
> > a) every existing WARN_* in sched/timekeeping/who knows where else
> >    must be evaluated to ensure that it can't be called from printk()
> >    path. if `false' - then the corresponding macro must be replaced
> >    with _DEFERRED flavor.
> > 
> > b) any patch that adds new WARN_* usages must be additionally checked
> >    to ensure that each of new WARN_* macros cannot be called from printk
> >    path. if `false' -- the corresponding macro must be replaced with
> >    _DEFERRED flavor.
> > 
> > c) any patch that refactors the code or moves some function calls around
> >    etc. must be additionally checked for any accidental WARN_* from printk
> >    path. even if none of the patches added any new WARN_* to the code.
> > 
> > d) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not
> >    necessarily newly added (see 'c').
> > 
> > 
> > that's too much.
> > 
> > it takes a lot of additional effort, because both reviewer and contributor
> > must consider printk() internals. and, what's worse, if something goes
> > unnoticed we end up having a printk() deadlock again.
> > 
> > so I decided to address some of printk() issues in printk.c, not in
> > kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else.
> 
> I see the point.

well, just my 5 cents.

> Your approach (alt buffer) adds some complexity to the printk code

it does.
the other thing is that there are several ways to deadlock printk().
alt_printk is addressing deadlocks that were caused by printk()
recursion only.

   printk()
 acquire_lock()
   printk()
 acquire_lock()

which is a sub-set of all of the printk() deadlock scenarios. all of
the locks that printk() acquires can be taken outside of printk() path.

for example, cat /proc/console locks the console_lock() for seq output.
thus we can have something like

console_unlock()// lock  >lock
  up()
activate_task()
  WARN_ON()
printk()
  console_trylock() // lock >lock


DEFERRED_WARN is a good thing; it's just quite hard to keep everything
working, given that any of those "9 patches per hour" can break something
with just one WARN_ON().


I assume that doing something like this

#define WARN_ON(condition, format...) ({\
printk_deferred_enter();\
WARN(condition, ##format);  \
printk_deferred_exit(); \
})

is less than exciting because WARN_ON from irq won't immediately print
the backtrace anymore.
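
A userspace model of that trade-off, as a sketch only. The sketch_*
names, the two buffers, and the explicit flush step are invented for
illustration and are not the kernel's printk implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int deferred_depth;     /* > 0 means "stay away from the console" */
static char console_out[128];  /* stands in for the console */
static char deferred_buf[128]; /* stands in for the deferred log buffer */

static void sketch_deferred_enter(void)
{
	deferred_depth++;
}

static void sketch_deferred_exit(void)
{
	assert(deferred_depth > 0);
	deferred_depth--;
}

/* Append msg to dst if it fits; silently drop it otherwise. */
static void append(char *dst, size_t cap, const char *msg)
{
	if (strlen(dst) + strlen(msg) < cap)
		strcat(dst, msg);
}

/* Route the message by context instead of by call site. */
static void sketch_printk(const char *msg)
{
	if (deferred_depth > 0)
		append(deferred_buf, sizeof(deferred_buf), msg);
	else
		append(console_out, sizeof(console_out), msg);
}

/* Called later from a context where console locks are safe to take. */
static void sketch_flush_deferred(void)
{
	assert(deferred_depth == 0);
	append(console_out, sizeof(console_out), deferred_buf);
	deferred_buf[0] = '\0';
}

/* Every warning goes through the deferred section. */
#define SKETCH_WARN_ON(cond) do {			\
	if (cond) {					\
		sketch_deferred_enter();		\
		sketch_printk("WARNING\n");		\
		sketch_deferred_exit();			\
	}						\
} while (0)
```

In this model the warning text reaches console_out only after
sketch_flush_deferred() runs, which mirrors the delayed backtrace
mentioned above.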

thoughts?

> but it allows to remove printk_deferred()/WARN_DEFERRED() and all
> the risk of it.

at some point we even can drop the entire deferred_printk() thing.
but alt_printk needs some love and care first.

> I am going to look closely on it.

thanks.

-ss


Re: [PATCH] x86/entry/64: Fix context tracking state warning when load_gs_index fails

2016-09-29 Thread Wanpeng Li
2016-09-30 5:01 GMT+08:00 Andy Lutomirski :
> On Mon, Sep 26, 2016 at 4:49 AM, Wanpeng Li  wrote:
>> From: Wanpeng Li 
>>
>>  WARNING: CPU: 0 PID: 3331 at arch/x86/entry/common.c:45 enter_from_user_mode+0x32/0x50
>>  CPU: 0 PID: 3331 Comm: ldt_gdt_64 Not tainted 4.8.0-rc7+ #13
>>  Call Trace:
>>   dump_stack+0x99/0xd0
>>   __warn+0xd1/0xf0
>>   warn_slowpath_null+0x1d/0x20
>>   enter_from_user_mode+0x32/0x50
>>   error_entry+0x6d/0xc0
>>   ? general_protection+0x12/0x30
>>   ? native_load_gs_index+0xd/0x20
>>   ? do_set_thread_area+0x19c/0x1f0
>>   SyS_set_thread_area+0x24/0x30
>>   do_int80_syscall_32+0x7c/0x220
>>   entry_INT80_compat+0x38/0x50
>>
>> This can be reproduced by running the GS testcase of ldt_gdt test unit in
>> selftests.
>>
>> do_int80_syscall_32() will call enter_from_user_mode() to convert context
>> tracking state from user state to kernel state. The load_gs_index can fail
>> with user gsbase; gsbase will be fixed up and we proceed if this happens.
>> However, enter_from_user_mode() will be called again in the fixed up path
>> though it is context tracking kernel state currently.
>>
>> This patch fixes it by just fixing up gsbase and telling lockdep that IRQs
>> are off once load_gs_index failed with user gsbase.
>>
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: "H. Peter Anvin" 
>> Cc: Andy Lutomirski 
>> Cc: Borislav Petkov 
>> Signed-off-by: Wanpeng Li 
>> ---
>>  arch/x86/entry/entry_64.S | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>> index d172c61..dc1ec23 100644
>> --- a/arch/x86/entry/entry_64.S
>> +++ b/arch/x86/entry/entry_64.S
>> @@ -1045,7 +1045,9 @@ ENTRY(error_entry)
>>  * gsbase and proceed.  We'll fix up the exception and land in
>>  * .Lgs_change's error handler with kernel gsbase.
>>  */
>> -   jmp .Lerror_entry_from_usermode_swapgs
>> +   SWAPGS
>> +   TRACE_IRQS_OFF
>
> Let's make this more readable: can you change this to:
>
> SWAPGS
> jmp .Lerror_entry_done
>
> and remove the .Lerror_entry_from_usermode_swapgs label as well?

I will do this in v2. Btw, could you point out why the GS testcase in
tools/testing/selftests/x86/ldt_gdt.c will #GP? The SDM says that swapgs
will #GP if CPL != 0; however, native_load_gs_index() runs at CPL == 0.

Regards,
Wanpeng Li


Re: [PATCH v14 0/9] acpi, clocksource: add GTDT driver and GTDT support in arm_arch_timer

2016-09-29 Thread Xiongfeng Wang
For the SBSA watchdog part: Tested-by: wangxiongfe...@huawei.com on a D05 board.

On 2016/9/29 2:17, fu@linaro.org wrote:
> From: Fu Wei 
> 
> This patchset:
> (1)Preparation for adding GTDT support in arm_arch_timer:
> 1. Move some enums and macros to header file;
> 2. Add a new enum for spi type;
> 3. Improve printk relevant code.
> 
> (2)Introduce ACPI GTDT parser: drivers/acpi/arm64/acpi_gtdt.c
> Parse all kinds of timer in GTDT table of ACPI:arch timer,
> memory-mapped timer and SBSA Generic Watchdog timer.
> This driver can help to simplify all the relevant timer drivers,
> and separate all the ACPI GTDT knowledge from them.
> 
> (3)Simplify ACPI code for arm_arch_timer
> 
> (4)Add GTDT support for ARM memory-mapped timer, also refactor
> original memory-mapped timer dt support for reusing some common
> code.
> 
> This patchset depends on the following patchset:
> [UPDATE PATCH V11 1/8] ACPI: I/O Remapping Table (IORT) initial support
> https://lkml.org/lkml/2016/9/12/949
> 
> This patchset has been tested on the following platforms:
> (1)ARM Foundation v8 model
> 
> Changelog:
> v14: https://lkml.org/lkml/2016/9/28/
>  Separate memory-mapped timer GTDT support into two patches
>  1. Refactor the timer init code to prepare for GTDT
>  2. Add GTDT support for memory-mapped timer
> 
> v13: http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1231717.html
>  Improve arm_arch_timer code for memory-mapped
>  timer GTDT support, refactor original memory-mapped timer
>  dt support for reusing some common code.
> 
> v12: https://lkml.org/lkml/2016/9/13/250
>  Rebase to latest Linux 4.8-rc6
>  Delete the confusing "skipping" in the error message.
> 
> V11: https://lkml.org/lkml/2016/9/6/354
>  Rebase to latest Linux 4.8-rc5
>  Delete typedef (suggested by checkpatch.pl)
> 
> V10: https://lkml.org/lkml/2016/7/26/215
>  Drop the "readq" patch.
>  Rebase to latest Linux 4.7.
> 
> V9: https://lkml.org/lkml/2016/7/25/345
> Improve pr_err message in acpi gtdt driver.
> Update Commit message for 7/9
> shorten the irq mapping function name
> Improve GTDT driver for memory-mapped timer
> 
> v8: https://lkml.org/lkml/2016/7/19/660
> Improve "pr_fmt(fmt)" definition: add "ACPI" in front of "GTDT",
> and also improve printk message.
> Simplify is_timer_block and is_watchdog.
> Merge acpi_gtdt_desc_init and gtdt_arch_timer_init into acpi_gtdt_init();
> Delete __init in include/linux/acpi.h for GTDT API
> Make ARM64 select GTDT.
> Delete "#include " from acpi_gtdt.c
> Simplify GT block parse code.
> 
> v7: https://lkml.org/lkml/2016/7/13/769
> Move the GTDT driver to drivers/acpi/arm64
> Add the ARM64-specific ACPI Support maintainers in MAINTAINERS
> Merge 3 patches of GTDT parser driver.
> Fix the for_each_platform_timer bug.
> 
> v6: https://lkml.org/lkml/2016/6/29/580
> split the GTDT driver to 4 parts: basic, arch_timer, memory-mapped timer,
> and SBSA Generic Watchdog timer
> Improve driver by suggestions and example code from Daniel Lezcano
> 
> v5: https://lkml.org/lkml/2016/5/24/356
> Sorting out all patches, simplify the API of GTDT driver:
> GTDT driver just fills the data struct for arm_arch_timer driver.
> 
> v4: https://lists.linaro.org/pipermail/linaro-acpi/2016-March/006667.html
> Delete the kvm relevant patches
> Separate two patches for sorting out the code for arm_arch_timer.
> Improve irq info export code to allow missing irq info in GTDT table.
> 
> v3: https://lkml.org/lkml/2016/2/1/658
> Improve GTDT driver code:
>   (1)improve pr_* by defining pr_fmt(fmt)
>   (2)simplify gtdt_sbsa_gwdt_init
>   (3)improve gtdt_arch_timer_data_init, if table is NULL, it will try
>   to get GTDT table.
> Move enum ppi_nr to arm_arch_timer.h, and add enum spi_nr.
> Add arm_arch_timer get ppi from DT and GTDT support for kvm.
> 
> v2: https://lkml.org/lkml/2015/12/2/10
> Rebase to latest kernel version(4.4-rc3).
> Fix the bug about the config problem,
> use CONFIG_ACPI_GTDT instead of CONFIG_ACPI in arm_arch_timer.c
> 
> v1: The first upstreaming version: https://lkml.org/lkml/2015/10/28/553
> 
> Fu Wei (9):
>   clocksource/drivers/arm_arch_timer: Move enums and defines to header
> file
>   clocksource/drivers/arm_arch_timer: Add a new enum for spi type
>   clocksource/drivers/arm_arch_timer: Improve printk relevant code
>   acpi/arm64: Add GTDT table parse driver
>   clocksource/drivers/arm_arch_timer: Simplify ACPI support code.
>   acpi/arm64: Add memory-mapped timer support in GTDT driver
>   clocksource/drivers/arm_arch_timer: Refactor the timer init code to
> prepare for GTDT
>   clocksource/drivers/arm_arch_timer: Add GTDT support for memory-mapped
> timer
>   acpi/arm64: Add SBSA Generic Watchdog support in GTDT 
