Re: [PATCH V6 0/6] Intel memory b/w monitoring support

2016-03-11 Thread Peter Zijlstra
On Sat, Mar 12, 2016 at 01:56:13AM +, Luck, Tony wrote:
> Some tracing printk() show that we are calling update_sample() with totally 
> bogus arguments.
> 
> There are a few good calls, then I see rmid=-380863112 evt_type=-30689 first=0
> 
> That turns into a wild vrmid, and we fault accessing mbm_current->prev_msr

It's because I'm a right idiot.. The below should sort that methinks.

Will push a new branch

--- a/arch/x86/events/intel/cqm.c
+++ b/arch/x86/events/intel/cqm.c
@@ -466,9 +466,9 @@ static bool is_mbm_event(int e)
 static void cqm_mask_call(struct rmid_read *rr)
 {
 	if (is_mbm_event(rr->evt_type))
-		on_each_cpu_mask(&cqm_cpumask, __intel_mbm_event_count, &rr, 1);
+		on_each_cpu_mask(&cqm_cpumask, __intel_mbm_event_count, rr, 1);
 	else
-		on_each_cpu_mask(&cqm_cpumask, __intel_cqm_event_count, &rr, 1);
+		on_each_cpu_mask(&cqm_cpumask, __intel_cqm_event_count, rr, 1);
 }
 
 /*
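
A minimal userspace sketch of why the one-token change matters, assuming
the removed lines handed the callback the address of the local pointer
rather than the pointer itself. struct rmid_read here is a cut-down
stand-in, and call_once() and record_info() are hypothetical helpers
standing in for on_each_cpu_mask() and the __intel_*_event_count()
callbacks; none of this is the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for the kernel's struct rmid_read. */
struct rmid_read {
	unsigned int rmid;
	int evt_type;
};

/* Whatever pointer the callback was last handed. */
static void *last_info;

/* Hypothetical callback: only records the pointer it receives. */
static void record_info(void *info)
{
	assert(info != NULL);
	last_info = info;
}

/* Hypothetical stand-in for on_each_cpu_mask(): calls fn(info) once. */
static void call_once(void (*fn)(void *), void *info)
{
	fn(info);
}

/* Fixed form: the callback receives the struct's own address. */
static int fixed_call_passes_struct(struct rmid_read *rr)
{
	int ok;

	call_once(record_info, rr);
	ok = (last_info == (void *)rr);
	last_info = NULL;
	return ok;
}

/* Assumed buggy form: the callback receives the address of the local
 * pointer variable, so treating it as a struct reads unrelated memory. */
static int buggy_call_passes_pointer_slot(struct rmid_read *rr)
{
	int differs;

	call_once(record_info, &rr);
	differs = (last_info != (void *)rr);
	last_info = NULL;	/* do not keep the address of a local */
	return differs;
}
```

Both helpers return 1 for any valid struct rmid_read: in the fixed form
the callback sees the struct, in the assumed buggy form it sees a stack
slot, which would be consistent with the wild rmid and evt_type values
reported above.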


Re: [linux-sunxi] Re: [PATCH v8 2/2] ASoc: sun4i-codec: Add FM, Line and Mic inputs

2016-03-11 Thread Danny Milosavljevic
Hi,

does anyone know the answer to the questions below?

> The other direction (making two controls which both do the same and update 
> each other's value) doesn't seem to be easily available. 

> Should I write a _put handler that does it manually?

> (Or should that be handled by alsa-lib mixer modules instead?)

> Is it possible to merge the "Left Mixer" control and respective 
> "Right Mixer" control into one selem in alsamixer?
> 
> Because alsamixer actually has support for one-sided muting.

If that's unknown, I can post v9 without doing the grouping-together of mixer 
controls and we can use it like that - I don't want the patch to stall 
indefinitely on mere usability issues...

Regards,
Danny


Re: [PATCH v10 06/59] PCI: Kill wrong quirk about M7101

2016-03-11 Thread Meelis Roos
> On Thu, Mar 10, 2016 at 9:40 AM, Bjorn Helgaas  wrote:
> > On Wed, Feb 24, 2016 at 06:11:57PM -0800, Yinghai Lu wrote:
> >> Meelis reported that qla2000 driver does not get loaded on one sparc 
> >> system.
> >>
> >> schizo f00732d0: PCI host bridge to bus 0001:00
> >> pci_bus 0001:00: root bus resource [io  0x7fe0100-0x7fe01ff] (bus 
> >> address [0x-0xff])
> >> pci 0001:00:06.0: quirk: [io  0x7fe01000800-0x7fe0100083f] claimed by 
> >> ali7101 ACPI
> >> pci 0001:00:06.0: quirk: [io  0x7fe01000600-0x7fe0100061f] claimed by 
> >> ali7101 SMB
> >> pci 0001:00:07.0: can't claim BAR 0 [io  0x7fe0100-0x7fe0100]: 
> >> address conflict with 0001:00:06.0 [io  0x7fe01000600-0x7fe0100061f]
> >>
> >> So the quirk for M7101 claim the io range early.

But why did it work up to 4.2, with the allocations breaking only in 4.3?


> >>
> >> According to spec with M7101 in M1543 page 103/104,
> >>   http://www.versalogic.com/Support/Downloads/pdf/ali1543.pdf
> >> 0xe0, and 0xe2 do not include address info for acpi/smb.
> >>
> >> Kill wrong quirk about them.
> >
> > This needs an explanation for why the quirk was added in the first
> > place, and why it is now safe to remove it.
> 
> The related commit does not tell much about why it is there exactly.
> But it is added the same time with intel piix4.
> 
> Maybe Linus could have some hint about that quirk?
> 
> commit 34f550135e349102bd065488ea217ab27f0d
> Author: Linus Torvalds 
> Date:   Fri Nov 23 15:32:20 2007 -0500
> 
> Import 2.3.49pre2
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index cda39a5..8029b19 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -113,6 +113,55 @@ static void __init quirk_s3_64M(struct pci_dev *dev)
> }
>  }
> 
> +static void __init quirk_io_region(struct pci_dev *dev, unsigned region, unsigned size, int nr)
> +{
> +   region &= ~(size-1);
> +   if (region) {
> +   struct resource *res = dev->resource + nr;
> +
> +   res->name = dev->name;
> +   res->start = region;
> +   res->end = region + size - 1;
> +   res->flags = IORESOURCE_IO;
> +   pci_claim_resource(dev, nr);
> +   }
> +}
> +
> +/*
> + * Let's make the southbridge information explicit instead
> + * of having to worry about people probing the ACPI areas,
> + * for example.. (Yes, it happens, and if you read the wrong
> + * ACPI register it will put the machine to sleep with no
> + * way of waking it up again. Bummer).
> + *
> + * ALI M7101: Two IO regions pointed to by words at
> + * 0xE0 (64 bytes of ACPI registers)
> + * 0xE2 (32 bytes of SMB registers)
> + */
> +static void __init quirk_ali7101(struct pci_dev *dev)
> +{
> +   u16 region;
> +
> +   pci_read_config_word(dev, 0xE0, &region);
> +   quirk_io_region(dev, region, 64, PCI_BRIDGE_RESOURCES);
> +   pci_read_config_word(dev, 0xE2, &region);
> +   quirk_io_region(dev, region, 32, PCI_BRIDGE_RESOURCES+1);
> +}
> +
> +/*
> + * PIIX4 ACPI: Two IO regions pointed to by longwords at
> + * 0x40 (64 bytes of ACPI registers)
> + * 0x90 (32 bytes of SMB registers)
> + */
> +static void __init quirk_piix4acpi(struct pci_dev *dev)
> +{
> +   u32 region;
> +
> +   pci_read_config_dword(dev, 0x40, &region);
> +   quirk_io_region(dev, region, 64, PCI_BRIDGE_RESOURCES);
> +   pci_read_config_dword(dev, 0x90, &region);
> +   quirk_io_region(dev, region, 32, PCI_BRIDGE_RESOURCES+1);
> +}
> 
>  /*
>   *  The main table of quirks.
> @@ -143,6 +192,8 @@ static struct pci_fixup pci_fixups[] __initdata = {
> { PCI_FIXUP_FINAL,  PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82443BX_2,  quirk_natoma },
> { PCI_FIXUP_FINAL,  PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_5597,  quirk_nopcipci },
> { PCI_FIXUP_FINAL,  PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_496,   quirk_nopcipci },
> +   { PCI_FIXUP_FINAL,  PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371AB_3,  quirk_piix4acpi },
> +   { PCI_FIXUP_FINAL,  PCI_VENDOR_ID_AL, PCI_DEVICE_ID_AL_M7101, quirk_ali7101 },
> { 0 }
>  };
> 

-- 
Meelis Roos (mr...@linux.ee)
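
The range arithmetic in the quoted quirk_io_region() can be sketched in
userspace as follows. This is only an illustration under stated
assumptions: struct io_range and quirk_range() are hypothetical names,
the real code fills a struct resource and calls pci_claim_resource(),
and the register values used in the note below are made up so that the
low bits line up with the ...0800-...083f and ...0600-...061f ranges in
the log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; the kernel fills a struct resource instead. */
struct io_range {
	uint32_t start;
	uint32_t end;
	int claimed;	/* 0 when the register held no usable base */
};

/* Mirror of the masking and bounds computation in the quoted quirk. */
static struct io_range quirk_range(uint32_t region, uint32_t size)
{
	struct io_range r = { 0, 0, 0 };

	/* The mask trick only works for power-of-two sizes. */
	assert(size != 0 && (size & (size - 1)) == 0);

	region &= ~(size - 1);	/* drop the bits below the region size */
	if (region) {
		r.start = region;
		r.end = region + size - 1;
		r.claimed = 1;
	}
	return r;
}
```

With an assumed register value of 0x0801 and size 64 this yields
0x0800-0x083f, and 0x0601 with size 32 yields 0x0600-0x061f. Any
non-zero value left after masking is claimed, whether or not the
register really holds an I/O base, which is the concern raised about
0xe0 and 0xe2 in the thread.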


Re: [PATCH] staging: xgifb: Fix comments style in vb_init.c

2016-03-11 Thread YU Bo

On Fri, Mar 11, 2016 at 01:41:40PM -0800, Kroah-Hartman wrote:
> On Tue, Feb 23, 2016 at 11:45:18PM -0500, YU Bo wrote:
> > Fix comments to use trailing */ on separate lines.
> >
> > Signed-off-by: YU BO
>
> You sent me 2 patches that did different things yet had the same
> subject line :(
>
> Please fix up and resend.

Ok.

Thank you very much for your time and patience.

YU Bo

> thanks,
>
> greg k-h


e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

2016-03-11 Thread Kent Overstreet
I don't know exactly how it's broken, but with that patch segment counting is
broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.

I suggest reverting it for 4.5...


Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Theodore Ts'o
On Fri, Mar 11, 2016 at 04:44:16PM -0800, Linus Torvalds wrote:
> On Fri, Mar 11, 2016 at 4:35 PM, Theodore Ts'o  wrote:
> >
> > At the end of the day it's about whether you trust the userspace
> > program or not.
> 
> There's a big difference between "give the user rope", and "tie the
> rope in a noose and put a banana peel so that the user might stumble
> into the rope and hang himself", though.

So let's see.  The user application has to explicitly request
NO_HIDE_STALE via an fallocate flag --- so it requires changing the
source code and recompiling the application.  And then, the system
administrator has to pass in a mount option specifying a group that
the application has to run under.  And then the application has to run
setgid with that group's privileges.

I hardly think that can be considered handing the user a pre-tied
noose.

Sure, the application can do something stupid --- but I'd argue that
giving root to some junior sysadmin is far more likely to cause
problems.

Cheers,

- Ted


Re: [PATCHSET RFC cgroup/for-4.6] cgroup, sched: implement resource group and PRIO_RGRP

2016-03-11 Thread Mike Galbraith
On Fri, 2016-03-11 at 10:41 -0500, Tejun Heo wrote:
> Hello,
> 
> This patchset extends cgroup v2 to support rgroup (resource group) for
> in-process hierarchical resource control and implements PRIO_RGRP for
> setpriority(2) on top to allow in-process hierarchical CPU cycle
> control in a seamless way.
> 
> cgroup v1 allowed putting threads of a process in different cgroups
> which enabled ad-hoc in-process resource control of some resources.
> Unfortunately, this approach was fraught with problems such as
> membership ambiguity with per-process resources and lack of isolation
> between system management and in-process properties.  For a more
> detailed discussion on the subject, please refer to the following
> message.
> 
>  [1] [RFD] cgroup: thread granularity support for cpu controller
> 
> This patchset implements the mechanism outlined in the above message.
> The new mechanism is named rgroup (resource group).  When explicitly
> designating a non-rgroup cgroup, the term sgroup (system group) is
> used.  rgroup has the following properties.
> 
> * A rgroup is a cgroup which is invisible on and transparent to the
>   system-level cgroupfs interface.
> 
> * A rgroup can be created by specifying CLONE_NEWRGRP flag, along with
>   CLONE_THREAD, during clone(2).  A new rgroup is created under the
>   parent thread's cgroup and the new thread is created in it.
> 
> * A rgroup is automatically destroyed when empty.
> 
> * A top-level rgroup of a process is a rgroup whose parent cgroup is a
>   sgroup.  A process may have multiple top-level rgroups and thus
>   multiple rgroup subtrees under the same parent sgroup.
> 
> * Unlike sgroups, rgroups are allowed to compete against peer threads.
>   Each rgroup behaves equivalent to a sibling task.
> 
> * rgroup subtrees are local to the process.  When the process forks or
>   execs, its rgroup subtrees are collapsed.
> 
> * When a process is migrated to a different cgroup, its rgroup
>   subtrees are preserved.
> 
> * Subset of controllers available on the parent sgroup are available
>   to rgroup subtrees.  Controller management on rgroups is automatic
>   and implicit and doesn't interfere with system-level cgroup
>   controller management.  If a controller is made unavailable on the
>   parent sgroup, it's automatically disabled from child rgroup
>   subtrees.
> 
> rgroup lays the foundation for other kernel mechanisms to make use of
> resource controllers while providing proper isolation between system
> management and in-process operations removing the awkward and
> layer-violating requirement for coordination between individual
> applications and system management.  On top of the rgroup mechanism,
> PRIO_RGRP is implemented for {set|get}priority(2).
> 
> * PRIO_RGRP can only be used if the target task is already in a
>   rgroup.  If setpriority(2) is used and cpu controller is available,
>   cpu controller is enabled until the target rgroup is covered and the
>   specified nice value is set as the weight of the rgroup.
> 
> * The specified nice value has the same meaning as for tasks.  For
>   example, a rgroup and a task competing under the same parent would
>   behave exactly the same as two tasks.
> 
> * For top-level rgroups, PRIO_RGRP follows the same rlimit
>   restrictions as PRIO_PROCESS; however, as nested rgroups only
>   distribute CPU cycles which are allocated to the process, no
>   restriction is applied.
> 
> PRIO_RGRP allows in-process hierarchical control of CPU cycles in a
> manner which is a straight-forward and minimal extension of existing
> task and priority management.

Hrm.  You're showing that per-thread groups can coexist just fine,
which is good given that need and usage exist today out in the wild.  Why
do such groups have to be invisible with a unique interface though?

Given the core has to deal with them whether they're visible or not,
and given they exist to fulfill a need, it seems they should be
first-class citizens, not some Quasimodo-like creature sneaking into the
cathedral via a back door and slinking about in the shadows.

-Mike
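
For readers unfamiliar with the interface the proposal builds on, here
is a small sketch of the existing {set|get}priority(2) calls with
PRIO_PROCESS. PRIO_RGRP and CLONE_NEWRGRP are only proposed in the
quoted RFC and are deliberately not used here; renice_self() and its
-100 error value are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>

/*
 * Raise the caller's nice value by delta (delta >= 0, so no privilege
 * is needed) and return the resulting nice value, or -100 on error.
 */
static int renice_self(int delta)
{
	int cur, target;

	assert(delta >= 0);

	/* getpriority() can legitimately return -1, so check errno. */
	errno = 0;
	cur = getpriority(PRIO_PROCESS, 0);	/* who == 0: the caller */
	if (cur == -1 && errno != 0)
		return -100;

	target = cur + delta;
	if (target > 19)
		target = 19;	/* highest nice value on Linux */

	if (setpriority(PRIO_PROCESS, 0, target) != 0)
		return -100;

	errno = 0;
	cur = getpriority(PRIO_PROCESS, 0);
	if (cur == -1 && errno != 0)
		return -100;
	return cur;
}
```

Under the RFC as quoted, the same nice value would be handed to
setpriority() with the proposed PRIO_RGRP selector and become the weight
of the target rgroup; that part is not sketched because the constant
does not exist outside the patchset.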


Re: [lustre-devel] [PATCH 07/10] staging: lustre: cleanup comment style for lnet selftest

2016-03-11 Thread Greg Kroah-Hartman
On Fri, Mar 11, 2016 at 10:25:32PM -0800, Greg Kroah-Hartman wrote:
> On Fri, Mar 11, 2016 at 10:24:00PM -0800, Greg Kroah-Hartman wrote:
> > On Sat, Mar 12, 2016 at 01:39:01AM +, Dilger, Andreas wrote:
> > > On 2016/03/11, 18:29, "lustre-devel on behalf of James Simmons"
> > >  > > jsimm...@infradead.org> wrote:
> > > 
> > > >Apply a consistent style for comments in the lnet selftest
> > > >code.
> > > >
> > > >Signed-off-by: James Simmons 
> > > >---
> > > > drivers/staging/lustre/lnet/selftest/brw_test.c  |8 ++--
> > > > drivers/staging/lustre/lnet/selftest/conctl.c|   50
> > > >+++---
> > > > drivers/staging/lustre/lnet/selftest/conrpc.c|   23 +-
> > > > drivers/staging/lustre/lnet/selftest/console.c   |   11 +++--
> > > > drivers/staging/lustre/lnet/selftest/framework.c |   20 
> > > > drivers/staging/lustre/lnet/selftest/ping_test.c |2 +-
> > > > drivers/staging/lustre/lnet/selftest/rpc.c   |   46
> > > >++--
> > > > drivers/staging/lustre/lnet/selftest/rpc.h   |2 +-
> > > > drivers/staging/lustre/lnet/selftest/selftest.h  |3 +-
> > > > drivers/staging/lustre/lnet/selftest/timer.c |6 +-
> > > > 10 files changed, 87 insertions(+), 84 deletions(-)
> > > >
> > > >diff --git a/drivers/staging/lustre/lnet/selftest/brw_test.c
> > > >b/drivers/staging/lustre/lnet/selftest/brw_test.c
> > > >index eebc924..6ac4d02 100644
> > > >--- a/drivers/staging/lustre/lnet/selftest/brw_test.c
> > > >+++ b/drivers/staging/lustre/lnet/selftest/brw_test.c
> > > >@@ -86,7 +86,7 @@ brw_client_init(sfw_test_instance_t *tsi)
> > > > opc = breq->blk_opc;
> > > > flags = breq->blk_flags;
> > > > npg = breq->blk_npg;
> > > >-/*
> > > >+/**
> > > >  * NB: this is not going to work for variable page size,
> > > >  * but we have to keep it for compatibility
> > > >  */
> > > 
> > > The "/**" comment opener is only for header comment blocks that
> > > have markup in them.  I don't think that is kernel style for
> > > normal multi-line comments in the code.
> > 
> > Yes, that is correct.  James, can you fix this up and resend this
> > series?
> 
> Sorry, I meant the series from this patch onward.  I've applied the
> first 6.

Make that just this patch, the ones after this applied just fine.
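
The distinction Andreas draws can be sketched like this, assuming the
usual kernel-doc layout for function headers. pages_for_len() is a
hypothetical helper invented for the illustration and is not lustre
code.

```c
#include <assert.h>
#include <stddef.h>

/**
 * pages_for_len() - Number of pages needed to hold a buffer.
 * @len: buffer length in bytes
 * @page_size: page size in bytes, must be non-zero
 *
 * The double-asterisk opener marks this block as kernel-doc, which is
 * why it carries the markup lines above.
 *
 * Return: @len divided by @page_size, rounded up.
 */
static size_t pages_for_len(size_t len, size_t page_size)
{
	assert(page_size != 0);

	/*
	 * NB: an ordinary multi-line comment inside a function body
	 * opens with a single asterisk and carries no markup.
	 */
	return (len + page_size - 1) / page_size;
}
```

The hunk under review turns a comment of the second kind into one that
opens like the first, which is what the review asks to have undone.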


Re: [lustre-devel] [PATCH 07/10] staging: lustre: cleanup comment style for lnet selftest

2016-03-11 Thread Greg Kroah-Hartman
On Fri, Mar 11, 2016 at 10:24:00PM -0800, Greg Kroah-Hartman wrote:
> On Sat, Mar 12, 2016 at 01:39:01AM +, Dilger, Andreas wrote:
> > On 2016/03/11, 18:29, "lustre-devel on behalf of James Simmons"
> >  > jsimm...@infradead.org> wrote:
> > 
> > >Apply a consistent style for comments in the lnet selftest
> > >code.
> > >
> > >Signed-off-by: James Simmons 
> > >---
> > > drivers/staging/lustre/lnet/selftest/brw_test.c  |8 ++--
> > > drivers/staging/lustre/lnet/selftest/conctl.c|   50
> > >+++---
> > > drivers/staging/lustre/lnet/selftest/conrpc.c|   23 +-
> > > drivers/staging/lustre/lnet/selftest/console.c   |   11 +++--
> > > drivers/staging/lustre/lnet/selftest/framework.c |   20 
> > > drivers/staging/lustre/lnet/selftest/ping_test.c |2 +-
> > > drivers/staging/lustre/lnet/selftest/rpc.c   |   46
> > >++--
> > > drivers/staging/lustre/lnet/selftest/rpc.h   |2 +-
> > > drivers/staging/lustre/lnet/selftest/selftest.h  |3 +-
> > > drivers/staging/lustre/lnet/selftest/timer.c |6 +-
> > > 10 files changed, 87 insertions(+), 84 deletions(-)
> > >
> > >diff --git a/drivers/staging/lustre/lnet/selftest/brw_test.c
> > >b/drivers/staging/lustre/lnet/selftest/brw_test.c
> > >index eebc924..6ac4d02 100644
> > >--- a/drivers/staging/lustre/lnet/selftest/brw_test.c
> > >+++ b/drivers/staging/lustre/lnet/selftest/brw_test.c
> > >@@ -86,7 +86,7 @@ brw_client_init(sfw_test_instance_t *tsi)
> > >   opc = breq->blk_opc;
> > >   flags = breq->blk_flags;
> > >   npg = breq->blk_npg;
> > >-  /*
> > >+  /**
> > >* NB: this is not going to work for variable page size,
> > >* but we have to keep it for compatibility
> > >*/
> > 
> > The "/**" comment opener is only for header comment blocks that
> > have markup in them.  I don't think that is kernel style for
> > normal multi-line comments in the code.
> 
> Yes, that is correct.  James, can you fix this up and resend this
> series?

Sorry, I meant the series from this patch onward.  I've applied the
first 6.


Re: [lustre-devel] [PATCH 07/10] staging: lustre: cleanup comment style for lnet selftest

2016-03-11 Thread Greg Kroah-Hartman
On Sat, Mar 12, 2016 at 01:39:01AM +, Dilger, Andreas wrote:
> On 2016/03/11, 18:29, "lustre-devel on behalf of James Simmons"
>  jsimm...@infradead.org> wrote:
> 
> >Apply a consistent style for comments in the lnet selftest
> >code.
> >
> >Signed-off-by: James Simmons 
> >---
> > drivers/staging/lustre/lnet/selftest/brw_test.c  |8 ++--
> > drivers/staging/lustre/lnet/selftest/conctl.c|   50
> >+++---
> > drivers/staging/lustre/lnet/selftest/conrpc.c|   23 +-
> > drivers/staging/lustre/lnet/selftest/console.c   |   11 +++--
> > drivers/staging/lustre/lnet/selftest/framework.c |   20 
> > drivers/staging/lustre/lnet/selftest/ping_test.c |2 +-
> > drivers/staging/lustre/lnet/selftest/rpc.c   |   46
> >++--
> > drivers/staging/lustre/lnet/selftest/rpc.h   |2 +-
> > drivers/staging/lustre/lnet/selftest/selftest.h  |3 +-
> > drivers/staging/lustre/lnet/selftest/timer.c |6 +-
> > 10 files changed, 87 insertions(+), 84 deletions(-)
> >
> >diff --git a/drivers/staging/lustre/lnet/selftest/brw_test.c
> >b/drivers/staging/lustre/lnet/selftest/brw_test.c
> >index eebc924..6ac4d02 100644
> >--- a/drivers/staging/lustre/lnet/selftest/brw_test.c
> >+++ b/drivers/staging/lustre/lnet/selftest/brw_test.c
> >@@ -86,7 +86,7 @@ brw_client_init(sfw_test_instance_t *tsi)
> > opc = breq->blk_opc;
> > flags = breq->blk_flags;
> > npg = breq->blk_npg;
> >-/*
> >+/**
> >  * NB: this is not going to work for variable page size,
> >  * but we have to keep it for compatibility
> >  */
> 
> The "/**" comment opener is only for header comment blocks that
> have markup in them.  I don't think that is kernel style for
> normal multi-line comments in the code.

Yes, that is correct.  James, can you fix this up and resend this
series?

thanks,

greg k-h
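
The distinction discussed above can be shown in a small C fragment. This is only a sketch: pages_for_bytes() is a hypothetical helper invented for illustration and is not part of the lnet selftest code. The double-asterisk opener appears only on the kernel-doc block in front of the function; the comment inside the body uses the plain opener.

```c
#include <stddef.h>

/**
 * pages_for_bytes() - number of pages needed to hold a byte count
 * @len:       payload length in bytes
 * @page_size: page size in bytes, must be non-zero
 *
 * This is a kernel-doc block: the double-asterisk opener is reserved
 * for comments like this one, which carry markup that documentation
 * tools parse.
 *
 * Return: @len divided by @page_size, rounded up.
 */
size_t pages_for_bytes(size_t len, size_t page_size)
{
	/*
	 * Ordinary multi-line comment inside a function body: plain
	 * opener, a column of asterisks, and the closer on its own line.
	 */
	return (len + page_size - 1) / page_size;
}
```

For example, pages_for_bytes(4097, 4096) yields 2.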


Re: [PATCHv9 1/3] rdmacg: Added rdma cgroup controller

2016-03-11 Thread Parav Pandit
Hi Tejun,

On Sat, Mar 5, 2016 at 10:50 PM, Parav Pandit  wrote:
> Hi Tejun,
>
> On Sat, Mar 5, 2016 at 6:22 PM, Tejun Heo  wrote:
>> Hello, Parav.
>>
>> On Sat, Mar 05, 2016 at 04:45:09PM +0530, Parav Pandit wrote:
>>> Design that remains same from v6 to v10.
>>>   * spin lock is still fine grained at cgroup level instead of one
>>> global shared lock among all cgroups.
>>>  In future it can be optimized further to do per cpu or using
>>> single lock if required.
>>>   * file type enums are still present for max and current, as
>>> read/write call to those files is already taken care by common
>>> functions with required if/else.
>>>   * Resource limit setting is as it is, because number of devices are
>>> in range of 1 to 4 count in most use cases (as explained in
>>> documentation), and its not hot path.
>>
>> 1 and 2 are not okay.
> For (1), shall I have one spin lock that is used across multiple
> hierarchies and multiple cgroups?
> Essentially one global lock among all cgroups. During hierarchical
> charging, continue to use the same lock at each level.
> Would that work in this first release?
>

I am waiting for your reply.
Is one lock for all cgroups OK with you?

> Can you please review the code for (2), I cannot think of any further
> helper functions that I can write.
> For both the file types, all the code is already common.
> file types are used only to find out whether to reference max variable
> or usage variable in structure.
> Which can also be made as array, but I do not want to lose the code
> readability for that little gain.
> What exactly is the issue in current implementation? You just
> mentioned that "its not good sign".
> Its readable, simple and serves the purpose, what am I missing?
>
If this is OK, I will keep the code as it is, because it uses common
helper functions for the max and current files.


>> 3 is fine but resource [un]charging is not hot path?
> charge/uncharge is a hot path from the cgroup perspective.
> Considering 1 to 4 devices in the system, the rpool list would grow
> up to 4 entries deep at each cgroup level.
> I believe this is good enough to start with. Complexity-wise it is
> O(N), where N is the number of devices in the system.
>
>
>>
>> Thanks.
>>
>> --
>> tejun
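
The single-lock variant asked about in this thread can be sketched in userspace C. Everything here is an assumption made for illustration: the names (rpool, cg_node, try_charge, uncharge), the pthread mutex standing in for a kernel spin lock, and the one-unit charge are not taken from the rdmacg patch. The sketch shows hierarchical charging under one global lock, with an O(N) scan of a per-cgroup pool list that holds at most four devices.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_DEVICES 4	/* the thread expects 1 to 4 devices per system */

/* Usage and limit for one device inside one cgroup (hypothetical layout). */
struct rpool {
	int device_id;
	long usage;
	long max;
};

struct cg_node {
	struct cg_node *parent;		/* NULL at the root */
	struct rpool pools[MAX_DEVICES];
	size_t npools;
};

/* One lock shared by every cgroup in every hierarchy. */
static pthread_mutex_t rdmacg_lock = PTHREAD_MUTEX_INITIALIZER;

/* Linear scan: O(N) in the number of devices, N <= MAX_DEVICES. */
static struct rpool *find_rpool(struct cg_node *cg, int device_id)
{
	for (size_t i = 0; i < cg->npools; i++)
		if (cg->pools[i].device_id == device_id)
			return &cg->pools[i];
	return NULL;
}

/*
 * Charge one unit at cg and at every ancestor. Returns 0 on success and
 * -1 if any level has no pool for the device or is at its limit; in that
 * case the levels already charged are rolled back.
 */
int try_charge(struct cg_node *cg, int device_id)
{
	struct cg_node *p, *q;
	int ret = 0;

	pthread_mutex_lock(&rdmacg_lock);
	for (p = cg; p; p = p->parent) {
		struct rpool *rp = find_rpool(p, device_id);

		if (!rp || rp->usage >= rp->max) {
			ret = -1;
			break;
		}
		rp->usage++;
	}
	if (ret) {
		/* Undo the charges made below the failing level. */
		for (q = cg; q != p; q = q->parent)
			find_rpool(q, device_id)->usage--;
	}
	pthread_mutex_unlock(&rdmacg_lock);
	return ret;
}

/* Release one unit at cg and at every ancestor. */
void uncharge(struct cg_node *cg, int device_id)
{
	pthread_mutex_lock(&rdmacg_lock);
	for (struct cg_node *p = cg; p; p = p->parent) {
		struct rpool *rp = find_rpool(p, device_id);

		if (rp && rp->usage > 0)
			rp->usage--;
	}
	pthread_mutex_unlock(&rdmacg_lock);
}
```

Because the same lock is held for the whole walk, the rollback needs no extra synchronization; the cost is that unrelated cgroups contend on it, which is the trade-off being debated above.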


Re: [PATCH 4.4 34/74] arm64: vmemmap: use virtual projection of linear region

2016-03-11 Thread Greg Kroah-Hartman
On Sat, Mar 12, 2016 at 01:55:44PM +0800, Ard Biesheuvel wrote:
> 
> 
> > On 12 Mar 2016, at 13:50, Greg Kroah-Hartman wrote:
> > 
> >> On Sat, Mar 12, 2016 at 08:51:26AM +0700, Ard Biesheuvel wrote:
> >>> On 8 March 2016 at 20:45, Ard Biesheuvel wrote:
> >>>> On 8 March 2016 at 20:44, Greg Kroah-Hartman wrote:
> >>>>> On Tue, Mar 08, 2016 at 05:40:14PM +0700, Ard Biesheuvel wrote:
> >>>>>> On 8 March 2016 at 07:02, Greg Kroah-Hartman wrote:
> >>>>>> 4.4-stable review patch.  If anyone has any objections, please let me
> >>>>>> know.
> >>>>> 
> >>>>> Please hold off on this one. We are seeing some breakage on 64k pages
> >>>>> systems
> >>>> 
> >>>> If this problem is also in Linus's tree, I'd like to keep it in to keep
> >>>> things "bug compatible".  Please let me know what fix that I should
> >>>> apply to resolve this.
> >>> 
> >>> I am about to send out the patch that should fix this, so I will put you 
> >>> on cc.
> >> 
> >> Not sure what happened here, but this patch is in 4.4-stable now, but
> >> the fix is not.
> > 
> > Because the fix came out _after_ I released that kernel?  I can't go
> > back in time...
> > 
> 
> I kind of got the whole chronology thing. I am just surprised you
> pulled only that patch (and not the fix) anyway, since you knew it
> would break things, and that a fix was on the way.

That way I knew you all would work quickly to get the fix in :)

We do this all the time, nothing new here, being "bug compatible" is
good...

thanks,

greg k-h


[PATCH] PM / runtime: Document steps for device removal

2016-03-11 Thread Krzysztof Kozlowski
Put a reminder that during device removal drivers should revert all PM
runtime changes from the probe. Also add a note that
pm_runtime_disable() won't wait for pending suspend requests if
autosuspend is not disabled before.

Signed-off-by: Krzysztof Kozlowski 
---
 Documentation/power/runtime_pm.txt | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index 7328cf85236c..c05e5a17a52d 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -410,7 +410,8 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
   field was previously zero, this prevents subsystem-level runtime PM
   callbacks from being run for the device), make sure that all of the
   pending runtime PM operations on the device are either completed or
-  canceled; returns 1 if there was a resume request pending and it was
+  canceled (although this depends on disabling autosuspend before
+  calling this); returns 1 if there was a resume request pending and it was
   necessary to execute the subsystem-level resume callback for the device
   to satisfy that request, otherwise 0 is returned
 
@@ -586,6 +587,10 @@ drivers to make their ->remove() callbacks avoid races with runtime PM directly,
 but also it allows of more flexibility in the handling of devices during the
 removal of their drivers.
 
+Drivers in ->remove() callback should undo the runtime PM changes done
+in ->probe(). Usually this means calling pm_runtime_disable(),
+pm_runtime_dont_use_autosuspend() etc.
+
 The user space can effectively disallow the driver of the device to power manage
 it at run time by changing the value of its /sys/devices/.../power/control
 attribute to "on", which causes pm_runtime_forbid() to be called.  In principle,
-- 
2.1.4



Re: [PATCH 4.4 34/74] arm64: vmemmap: use virtual projection of linear region

2016-03-11 Thread Ard Biesheuvel


> On 12 Mar 2016, at 13:50, Greg Kroah-Hartman wrote:
> 
>> On Sat, Mar 12, 2016 at 08:51:26AM +0700, Ard Biesheuvel wrote:
>>> On 8 March 2016 at 20:45, Ard Biesheuvel wrote:
>>>> On 8 March 2016 at 20:44, Greg Kroah-Hartman wrote:
>>>>> On Tue, Mar 08, 2016 at 05:40:14PM +0700, Ard Biesheuvel wrote:
>>>>>> On 8 March 2016 at 07:02, Greg Kroah-Hartman wrote:
>>>>>> 4.4-stable review patch.  If anyone has any objections, please let me
>>>>>> know.
>>>>> 
>>>>> Please hold off on this one. We are seeing some breakage on 64k pages
>>>>> systems
>>>> 
>>>> If this problem is also in Linus's tree, I'd like to keep it in to keep
>>>> things "bug compatible".  Please let me know what fix that I should
>>>> apply to resolve this.
>>> 
>>> I am about to send out the patch that should fix this, so I will put you on 
>>> cc.
>> 
>> Not sure what happened here, but this patch is in 4.4-stable now, but
>> the fix is not.
> 
> Because the fix came out _after_ I released that kernel?  I can't go
> back in time...
> 

I kind of got the whole chronology thing. I am just surprised you pulled only 
that patch (and not the fix) anyway, since you knew it would break things, and 
that a fix was on the way.


Re: [PATCH 4.4 34/74] arm64: vmemmap: use virtual projection of linear region

2016-03-11 Thread Greg Kroah-Hartman
On Sat, Mar 12, 2016 at 08:51:26AM +0700, Ard Biesheuvel wrote:
> On 8 March 2016 at 20:45, Ard Biesheuvel  wrote:
> > On 8 March 2016 at 20:44, Greg Kroah-Hartman  
> > wrote:
> >> On Tue, Mar 08, 2016 at 05:40:14PM +0700, Ard Biesheuvel wrote:
> >>> On 8 March 2016 at 07:02, Greg Kroah-Hartman  
> >>> wrote:
> >>> > 4.4-stable review patch.  If anyone has any objections, please let me 
> >>> > know.
> >>> >
> >>>
> >>> Please hold off on this one. We are seeing some breakage on 64k pages 
> >>> systems
> >>
> >> If this problem is also in Linus's tree, I'd like to keep it in to keep
> >> things "bug compatible".  Please let me know what fix that I should
> >> apply to resolve this.
> >>
> >
> > I am about to send out the patch that should fix this, so I will put you on 
> > cc.
> >
> 
> Not sure what happened here, but this patch is in 4.4-stable now, but
> the fix is not.

Because the fix came out _after_ I released that kernel?  I can't go
back in time...



Re: [PATCH 2/2] Staging: nvec: fix multiline comment style.

2016-03-11 Thread Greg KH
On Sat, Mar 12, 2016 at 10:32:49AM +0530, Neha Rani wrote:
> But it was warned by checkpatch.pl and after modifying, no warning. How can it
> be wrong?

I'll turn it around and ask you how do you know it is correct?  :)

hint, it isn't, read Documentation/CodingStyle please.

thanks,

greg k-h


Re: [PATCH 5/5] hwrng: exynos - Disable runtime PM on driver unbind

2016-03-11 Thread Krzysztof Kozlowski
On 11.03.2016 at 16:49, Krzysztof Kozlowski wrote:
> Driver enabled runtime PM but did not revert this on removal. Re-binding
> of a device triggered warning:
>   exynos-rng 10830400.rng: Unbalanced pm_runtime_enable!
> 
> Fixes: b329669ea0b5 ("hwrng: exynos - Add support for Exynos random number 
> generator")
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/char/hw_random/exynos-rng.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/char/hw_random/exynos-rng.c 
> b/drivers/char/hw_random/exynos-rng.c
> index 68c349bf66a0..cba1ff538c46 100644
> --- a/drivers/char/hw_random/exynos-rng.c
> +++ b/drivers/char/hw_random/exynos-rng.c
> @@ -154,6 +154,13 @@ static int exynos_rng_probe(struct platform_device *pdev)
>   return ret;
>  }
>  
> +static int exynos_rng_remove(struct platform_device *pdev)
> +{
> + pm_runtime_disable(&pdev->dev);
> +

This is not sufficient. pm_runtime_dont_use_autosuspend() is also
necessary here. I will send a v2.

BTW, no problem if it is too late for taking this for v4.6. If this
patchset misses merge window I'll resend it later.

Best regards,
Krzysztof



Re: [BUG] Device unbound in a resumed state

2016-03-11 Thread Krzysztof Kozlowski
2016-03-12 3:46 GMT+09:00 Alan Stern :
> On Fri, 11 Mar 2016, Krzysztof Kozlowski wrote:
>
>> Hi,
>>
>>
>> Could be related (the same?) with [0].
>>
>> I have a driver (hwrng/exynos-rng) which in probe does:
>> pm_runtime_set_autosuspend_delay(&pdev->dev, EXYNOS_AUTOSUSPEND_DELAY);
>> pm_runtime_use_autosuspend(&pdev->dev);
>> pm_runtime_enable(&pdev->dev);
>>
>> and in remove:
>> pm_runtime_disable(&pdev->dev)
>
> But not pm_runtime_dont_use_autosuspend()?

Ahh, that one is missing. Thanks!

>
> Why disable runtime PM if you want the runtime-PM methods to put the
> device into a low-power state?

I don't want to play with runtime PM anymore at this time. I am
removing the device so I am cleaning what was done in probe. Without
the pm_runtime_disable() the next probe of device will trigger error
of unbalanced enable:
https://lkml.org/lkml/2016/3/11/59

>
>> Just before unbinding in __device_release_driver() the device is resumed
>> but unfortunately not suspended later. I mean the
>> __device_release_driver()->pm_runtime_put_sync() does not trigger
>> runtime suspend.
>
> Because autosuspend is still in use at this point.
>
>> This leads to leaving the device in active state (e.g. clocks enabled).
>>
>> It does not happen after removal of autosuspend. Also runtime suspend
>> happens after very fast unbind-bind.
>
> Overall it sounds like the system is behaving the way it is supposed
> to.
>
> But maybe we should make pm_runtime_use_autosuspend() call
> pm_runtime_mark_last_busy(), to avoid the unbind - bind - immediate
> autosuspend behavior.

The need of disabling autosuspend was not quite obvious to me and I
did not find information about it in documentation. However now it
makes sense... I'll send a patch updating the docs.

Thank you for feedback!

Best regards,
Krzysztof
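
The "unbalanced enable" problem from this thread can be modelled in a few lines of userspace C. This is a toy model under stated assumptions, not kernel code: struct mock_dev and every mock_* function are hypothetical stand-ins, and only the enable/disable counting and the autosuspend flag are modelled. It assumes, as in the kernel, that the device object outlives the driver binding and starts with runtime PM disabled.

```c
#include <stdbool.h>

/* Toy stand-in for the runtime PM state kept in a device object. */
struct mock_dev {
	int disable_depth;		/* > 0 means runtime PM is disabled */
	bool use_autosuspend;
	int unbalanced_warnings;	/* counts "Unbalanced pm_runtime_enable!" */
};

/* A freshly registered device starts with runtime PM disabled. */
void mock_dev_init(struct mock_dev *dev)
{
	dev->disable_depth = 1;
	dev->use_autosuspend = false;
	dev->unbalanced_warnings = 0;
}

/* Models pm_runtime_enable(): warn if there is no disable to undo. */
void mock_pm_runtime_enable(struct mock_dev *dev)
{
	if (dev->disable_depth > 0)
		dev->disable_depth--;
	else
		dev->unbalanced_warnings++;
}

/* Models pm_runtime_disable(): each call nests one level deeper. */
void mock_pm_runtime_disable(struct mock_dev *dev)
{
	dev->disable_depth++;
}

/* probe(): turn on autosuspend, then enable runtime PM. */
void mock_probe(struct mock_dev *dev)
{
	dev->use_autosuspend = true;	/* pm_runtime_use_autosuspend() */
	mock_pm_runtime_enable(dev);
}

/* remove() that forgets to undo what probe() did. */
void mock_remove_leaky(struct mock_dev *dev)
{
	(void)dev;
}

/* remove() that mirrors probe(): autosuspend off first, then disable. */
void mock_remove_balanced(struct mock_dev *dev)
{
	dev->use_autosuspend = false;	/* pm_runtime_dont_use_autosuspend() */
	mock_pm_runtime_disable(dev);
}
```

Running probe, leaky remove, probe on one mock device records one warning; the same sequence with the balanced remove records none.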


Re: [PATCH V3] net: ezchip: adapt driver to little endian architecture

2016-03-11 Thread Vineet Gupta
On Friday 04 March 2016 03:50 AM, David Miller wrote:
> From: Lada Trimasova 
> Date: Thu,  3 Mar 2016 17:07:46 +0300
> 
>> Since ezchip network driver is written with big endian EZChip platform it
>> is necessary to add support for little endian architecture.
>>
>> The first issue is that the order of the bits in a bit field is
>> implementation specific. So all the bit fields are removed.
>> Named constants are used to access necessary fields.
>>
>> And the second one is that network byte order is big endian.
>> For example, data on ethernet is transmitted with most-significant
>> octet (byte) first. So in case of little endian architecture
>> it is important to swap data byte order when we read it from
>> register. In case of unaligned access we can use "get_unaligned_be32"
>> and in other case we can use function "ioread32_rep" which reads all
>> data from register and works either with little endian or big endian
>> architecture.
>>
>> And then when we are going to write data to register we need to restore
>> byte order using the function "put_unaligned_be32" in case of
>> unaligned access and in other case "iowrite32_rep".
>>
>> The last little fix is a space between type and pointer to observe
>> coding style.
>>
>> Signed-off-by: Lada Trimasova 
> 
> Applied to net-next, thanks.
> 

@Lada, could you provide the corresponding arch/arc/{boot/dts,configs}/ updates
so we can switch over to this device-model/driver for OSCI platform for 4.6.

Thx,
-Vineet
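
The two points in the quoted commit message (masks instead of bit fields, explicit big-endian accessors) can be illustrated in plain userspace C. This is only a sketch: load_be32() and store_be32() are hypothetical stand-ins for the kernel's get_unaligned_be32() and put_unaligned_be32(), and the RX_* register layout is invented for illustration, not taken from the ezchip driver.

```c
#include <stdint.h>

/*
 * Hypothetical register layout: an 11-bit length field at bit 0 and an
 * error flag at bit 31. Masks and shifts behave the same on every
 * compiler, unlike C bit fields, whose bit order is implementation
 * defined.
 */
#define RX_LEN_MASK	0x000007ffu
#define RX_ERR_SHIFT	31

/* Read a big-endian 32-bit value from any address, aligned or not. */
uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write a 32-bit value in big-endian byte order to any address. */
void store_be32(uint32_t v, uint8_t *p)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Extract the length field with a named mask. */
uint32_t rx_len(uint32_t reg)
{
	return reg & RX_LEN_MASK;
}

/* Extract the error flag with a named shift. */
int rx_err(uint32_t reg)
{
	return (int)((reg >> RX_ERR_SHIFT) & 1u);
}
```

Because the accessors assemble the value byte by byte, the result is the same on little-endian and big-endian hosts, and the source pointer does not need to be 4-byte aligned.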


Re: [RFC][PATCH v2 1/2] printk: Make printk() completely async

2016-03-11 Thread Sergey Senozhatsky
Hello Tejun,

On (03/11/16 12:22), Tejun Heo wrote:
> On Tue, Mar 08, 2016 at 07:21:52PM +0900, Sergey Senozhatsky wrote:
> > I'd personally prefer to go with the "less dependency" option -- a dedicated
> > kthread, I think. mostly for the sake of simplicity. I agree with the point
> > that console_unlock() has unpredictable execution time, and in general case
> > we would have a busy kworker (or sleeping in console_lock() or doing
> > cond_resched()) and an idle extra WQ_RESCUER kthread, with activation rules
> > that don't depend on printk. printk with dedicated printk-kthread seems
> > easier to control. how does it sound?
> 
> I don't think it makes sense to avoid workqueue for execution latency.
> The only case which can matter is the rescuer case and as I wrote
> before the system is already in an extremely high latency mode by the
> time rescuer is needed, so it's unlikely to make noticeable
> differences.
>
> However, I agree that using kthread is a good idea here just to reduce
> the amount of dependency as printk working even during complex failures
> is important.  workqueue itself is fairly complex and it also requires
> timer and task creation to work correctly for proper operation.
> That's a lot of extra dependency.

Thanks!

I agree that, in some cases (if not in most of them), the "value" of printk()
output is inversely proportional to the system health -- the worse the state,
the more attention people pay to printk() output; so a simpler solution here
gives more confidence.

-ss


Re: [PATCH 1/5] hwrng: exynos - Hide PM functions with __maybe_unused

2016-03-11 Thread Krzysztof Kozlowski
2016-03-11 16:49 GMT+09:00 Krzysztof Kozlowski :
> Replace ifdef with __maybe_unused to silence compiler warning when
> SUSPEND=n and PM=y:
>
> drivers/char/hw_random/exynos-rng.c:166:12: warning: ‘exynos_rng_suspend’ defined but not used [-Wunused-function]
>  static int exynos_rng_suspend(struct device *dev)
> ^
> drivers/char/hw_random/exynos-rng.c:171:12: warning: ‘exynos_rng_resume’ defined but not used [-Wunused-function]
>  static int exynos_rng_resume(struct device *dev)
>
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/char/hw_random/exynos-rng.c | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
>

Patch can be dropped, Arnd sent the same some days before.

Best regards,
Krzysztof
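
For context, the warning quoted above can be reproduced outside the kernel.
What follows is only a userspace sketch under stated assumptions, not the
exynos-rng code: DEMO_MAYBE_UNUSED, DEMO_PM_SLEEP, DEMO_PM_OPS and the
demo_rng_* names are all made up for illustration, and the attribute syntax
assumes GCC or Clang. The idea is that a helper macro references the callbacks
only when a config symbol is set, so with the symbol unset the static
functions would trigger -Wunused-function unless they are annotated.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's __maybe_unused annotation. */
#define DEMO_MAYBE_UNUSED __attribute__((__unused__))

struct demo_pm_ops {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
};

/* Hypothetical helper macro: the callbacks are only referenced when
 * DEMO_PM_SLEEP is defined, otherwise the arguments are dropped. */
#ifdef DEMO_PM_SLEEP
#define DEMO_PM_OPS(s, r) { .suspend = (s), .resume = (r) }
#else
#define DEMO_PM_OPS(s, r) { .suspend = NULL, .resume = NULL }
#endif

/* Without the annotation, building with -Wunused-function and
 * DEMO_PM_SLEEP undefined reports these as "defined but not used". */
static int DEMO_MAYBE_UNUSED demo_rng_suspend(void *dev)
{
	(void)dev;
	return 0;
}

static int DEMO_MAYBE_UNUSED demo_rng_resume(void *dev)
{
	(void)dev;
	return 0;
}

static const struct demo_pm_ops demo_rng_pm_ops =
	DEMO_PM_OPS(demo_rng_suspend, demo_rng_resume);

void demo_rng_selfcheck(void)
{
#ifdef DEMO_PM_SLEEP
	assert(demo_rng_pm_ops.suspend == demo_rng_suspend);
#else
	assert(demo_rng_pm_ops.suspend == NULL);	/* callbacks never referenced */
#endif
}
```

Compared with wrapping the functions in #ifdef, the annotation keeps both
callbacks compiled and type-checked in every configuration, and the compiler
is free to discard them when nothing references them.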


Re: [PATCH V2] proc-vmcore: wrong data type casting fix

2016-03-11 Thread Dave Young
Hi, Andrew

On 03/11/16 at 12:27pm, Andrew Morton wrote:
> On Fri, 11 Mar 2016 16:42:48 +0800 Dave Young  wrote:
> 
> > On i686 PAE enabled machine the contiguous physical area could be large
> > and it can cause trimming down variables in below calculation in
> > read_vmcore() and mmap_vmcore():
> > 
> > tsz = min_t(size_t, m->offset + m->size - *fpos, buflen);
> > 
> > Then the real size passed down is not correct any more.
> > Suppose m->offset + m->size - *fpos being truncated to 0, buflen >0 then
> > we will get tsz = 0. It is of course not an expected result.
> 
> I don't really understand this.
> 
> vmcore.offset is loff_t which is 64-bit
> vmcore.size is long long
> *fpos is loff_t
> 
> so the expression should all be done with 64-bit arithmetic anyway.

#define min_t(type, x, y) ({\
type __min1 = (x);  \
type __min2 = (y);  \
__min1 < __min2 ? __min1: __min2; })

Here x = m->offset + m->size - *fpos; the expression is evaluated with 64-bit
arithmetic, that is true. But min_t() then casts x to size_t before comparing
it with y, and with a 32-bit size_t that cast truncates the value. That
truncation is the problem.
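
To make the truncation concrete, here is a small userspace sketch (not the
kernel code). It assumes GCC or Clang for the statement-expression syntax and
uses uint32_t as a stand-in for size_t on a 32-bit kernel; demo_min_t,
demo_size_t, tsz_truncating and tsz_widened are names made up for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the min_t() macro quoted above (GNU C statement expression). */
#define demo_min_t(type, x, y) ({		\
	type min1_ = (x);			\
	type min2_ = (y);			\
	min1_ < min2_ ? min1_ : min2_; })

/* Stand-in for size_t on a 32-bit (i686 PAE) kernel. */
typedef uint32_t demo_size_t;

/* Old form: the 64-bit remainder is narrowed to 32 bits before comparing. */
static demo_size_t tsz_truncating(uint64_t remaining, demo_size_t buflen)
{
	return demo_min_t(demo_size_t, remaining, buflen);
}

/* Patched form: compare in 64 bits, then narrow the (small) result. */
static demo_size_t tsz_widened(uint64_t remaining, demo_size_t buflen)
{
	return (demo_size_t)demo_min_t(unsigned long long, remaining, buflen);
}

void demo_min_t_selfcheck(void)
{
	/* 4 GiB left in the chunk, 4 KiB buffer: low 32 bits are all zero. */
	assert(tsz_truncating(0x100000000ULL, 4096) == 0);
	assert(tsz_widened(0x100000000ULL, 4096) == 4096);

	/* Small values behave the same either way. */
	assert(tsz_truncating(100, 4096) == 100);
	assert(tsz_widened(100, 4096) == 100);
}
```

The final narrowing in tsz_widened() is safe because the result can never be
larger than buflen, which already fits in the narrow type.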

> 
> Maybe buflen (size_t) has the wrong type, but the result of the other
> expression should be in-range by the time we come to doing the
> comparison.
> 
> > During our tests there are two problems caused by it:
> > 1) read_vmcore will refuse to continue so makedumpfile fails.
> > 2) mmap_vmcore will trigger BUG_ON() in remap_pfn_range().
> > 
> > Use unsigned long long in min_t instead so that the variables are not
> > truncated.
> > 
> > Signed-off-by: Baoquan He 
> > Signed-off-by: Dave Young 
> 
> I think we'll need a cc:stable here.

Agreed. Do you think I need to repost for this?

> 
> > --- linux-x86.orig/fs/proc/vmcore.c
> > +++ linux-x86/fs/proc/vmcore.c
> > @@ -231,7 +231,9 @@ static ssize_t __read_vmcore(char *buffe
> >  
> > list_for_each_entry(m, &vmcore_list, list) {
> > if (*fpos < m->offset + m->size) {
> > -   tsz = min_t(size_t, m->offset + m->size - *fpos, buflen);
> > +   tsz = (size_t)min_t(unsigned long long,
> > +   m->offset + m->size - *fpos,
> > +   buflen);
> 
> This is rather a mess.  Can we please try to fix this bug by choosing
> appropriate types rather than all the typecasting?

file read/mmap buflen is size_t, so tsz can never exceed buflen; it is smaller
only when m->offset + m->size - *fpos < buflen. The only problem is that we need
to keep a large value of m->offset + m->size - *fpos from being truncated by the
cast, because after truncation it can mistakenly compare as less than buflen.

> 
> 
> > start = m->paddr + *fpos - m->offset;
> > tmp = read_from_oldmem(buffer, tsz, &start, userbuf);
> > if (tmp < 0)
> > @@ -461,7 +463,8 @@ static int mmap_vmcore(struct file *file
> > if (start < m->offset + m->size) {
> > u64 paddr = 0;
> >  
> > -   tsz = min_t(size_t, m->offset + m->size - start, size);
> > +   tsz = (size_t)min_t(unsigned long long,
> > +   m->offset + m->size - start, size);
> > paddr = m->paddr + start - m->offset;
> > if (vmcore_remap_oldmem_pfn(vma, vma->vm_start + len,
> > paddr >> PAGE_SHIFT, tsz,

Thanks
Dave


Re: [PATCH v3 01/13] pinctrl: sunxi: Add A83T R_PIO controller support

2016-03-11 Thread Vishnu Patekar
Hello Linus,


On Wed, Mar 9, 2016 at 10:55 AM, Linus Walleij  wrote:
> On Sat, Mar 5, 2016 at 10:42 PM, Vishnu Patekar
>  wrote:
>
>> The A83T has R_PIO pin controller, it's same as A23, except A83T
>> interrupt bit is 6th and A83T has one extra pin PL12.
>>
>> Signed-off-by: Vishnu Patekar 
>> Acked-by: Chen-Yu Tsai 
>> Acked-by: Rob Herring 
>
> As partly noted by others:
>
>> +config PINCTRL_SUN8I_A83T_R
>> +   def_bool MACH_SUN8I
>
> bool
>
>> +   depends on RESET_CONTROLLER
>
> Should it rather select RESET_CONTROLLER?
I used depends on and def_bool because that is what the other sunxi pinctrl
drivers use. Using bool and select would not harm anything.
Should I change it to bool and select, or keep it uniform with the
earlier options?
>
>> +static const struct of_device_id sun8i_a83t_r_pinctrl_match[] = {
>> +   { .compatible = "allwinner,sun8i-a83t-r-pinctrl", },
>> +   {}
>> +};
>> +MODULE_DEVICE_TABLE(of, sun8i_a83t_r_pinctrl_match);
>
> Module talk in bool driver.
I'll remove it.
>
>> +static struct platform_driver sun8i_a83t_r_pinctrl_driver = {
>> +   .probe  = sun8i_a83t_r_pinctrl_probe,
>> +   .driver = {
>> +   .name   = "sun8i-a83t-r-pinctrl",
>> +   .of_match_table = sun8i_a83t_r_pinctrl_match,
>> +   },
>> +};
>> +module_platform_driver(sun8i_a83t_r_pinctrl_driver);
>
> Should be builtin?
Yes, it should be. I missed Maxime's earlier comments.
>
> Yours,
> Linus Walleij


Re: [PATCH] arc: use little endian accesses

2016-03-11 Thread Vineet Gupta
On Friday 11 March 2016 06:14 PM, Vineet Gupta wrote:
> @Lada I will fix up the changelog to add some of the background behind this
> change, and mark this for stable backport as well.

This is what I'm planning to add.
Noam please give this a spin - you might have to revert those native-endian DT
bindings from UART DT.

->
>From f778cc65717687a3d3f26dd21bef62cd059f1b8b Mon Sep 17 00:00:00 2001
From: Lada Trimasova 
Date: Wed, 9 Mar 2016 20:21:04 +0300
Subject: [PATCH] ARC: [BE] readl()/writel() to work in Big Endian CPU
 configuration

read{l,w}() write{l,w}() primitives should use le{16,32}_to_cpu() and
cpu_to_le{16,32}() respectively to ensure device registers are read
correctly in Big Endian CPU configuration.

Per Arnd Bergmann
| Most drivers using readl() or readl_relaxed() expect those to perform byte
| swaps on big-endian architectures, as the registers tend to be fixed endian

This was needed for getting UART to work correctly on a Big Endian ARC.

The ARC accessors originally were fine, and the bug got introduced
inadvertently by commit b8a033023994 ("ARCv2: barriers")

Fixes: b8a033023994 ("ARCv2: barriers")
Link: http://lkml.kernel.org/r/201603100845.30602.a...@arndb.de
Cc: Alexey Brodkin 
Cc: sta...@vger.kernel.org  [4.2+]
Cc: Arnd Bergmann 
Signed-off-by: Lada Trimasova 
[vgupta: beefed up changelog, added Fixes/stable tags]
Signed-off-by: Vineet Gupta 
---
 arch/arc/include/asm/io.h | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/arc/include/asm/io.h b/arch/arc/include/asm/io.h
index 694ece8a0243..27b17adea50d 100644
--- a/arch/arc/include/asm/io.h
+++ b/arch/arc/include/asm/io.h
@@ -129,15 +129,23 @@ static inline void __raw_writel(u32 w, volatile void __iomem *addr)
 #define writel(v,c)({ __iowmb(); writel_relaxed(v,c); })

 /*
- * Relaxed API for drivers which can handle any ordering themselves
+ * Relaxed API for drivers which can handle barrier ordering themselves
+ *
+ * Also these are defined to perform little endian accesses.
+ * To provide the typical device register semantics of fixed endian,
+ * swap the byte order for Big Endian
+ *
+ * http://lkml.kernel.org/r/201603100845.30602.a...@arndb.de
  */
 #define readb_relaxed(c)   __raw_readb(c)
-#define readw_relaxed(c)   __raw_readw(c)
-#define readl_relaxed(c)   __raw_readl(c)
+#define readw_relaxed(c) ({ u16 __r = le16_to_cpu((__force __le16) \
+   __raw_readw(c)); __r; })
+#define readl_relaxed(c) ({ u32 __r = le32_to_cpu((__force __le32) \
+   __raw_readl(c)); __r; })

 #define writeb_relaxed(v,c)__raw_writeb(v,c)
-#define writew_relaxed(v,c)__raw_writew(v,c)
-#define writel_relaxed(v,c)__raw_writel(v,c)
+#define writew_relaxed(v,c)__raw_writew((__force u16) cpu_to_le16(v),c)
+#define writel_relaxed(v,c)__raw_writel((__force u32) cpu_to_le32(v),c)

 #include 

-- 
2.5.0
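
As a rough illustration of the semantics the patch above restores, here is a
userspace sketch, not the ARC implementation. The "register" is just a byte
array, and every demo_* name is hypothetical. The point it tries to show is
that interpreting the raw load as little endian gives the same value on a
little-endian and a big-endian host, which is what drivers expect from
readl()/writel() on a fixed-endian device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for le32_to_cpu(): interpret the four bytes of a
 * raw load as little endian, whatever the host byte order is. */
static uint32_t demo_le32_to_cpu(uint32_t raw)
{
	uint8_t b[4];

	memcpy(b, &raw, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical stand-in for cpu_to_le32(): lay the value out LSB first. */
static uint32_t demo_cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)v, (uint8_t)(v >> 8),
		(uint8_t)(v >> 16), (uint8_t)(v >> 24)
	};
	uint32_t raw;

	memcpy(&raw, b, sizeof(raw));
	return raw;
}

/* Simulated little-endian device register living in plain memory. */
static uint32_t demo_readl(const uint8_t *reg)
{
	uint32_t raw;

	memcpy(&raw, reg, sizeof(raw));	/* plays the role of __raw_readl() */
	return demo_le32_to_cpu(raw);
}

static void demo_writel(uint32_t v, uint8_t *reg)
{
	uint32_t raw = demo_cpu_to_le32(v);

	memcpy(reg, &raw, sizeof(raw));	/* plays the role of __raw_writel() */
}

void demo_io_selfcheck(void)
{
	uint8_t reg[4] = { 0x78, 0x56, 0x34, 0x12 };

	assert(demo_readl(reg) == 0x12345678u);
	demo_writel(0xdeadbeefu, reg);
	assert(reg[0] == 0xef && reg[3] == 0xde);
}
```

On a little-endian host both conversions compile down to nothing; on a
big-endian host they become a byte swap, which is the part the raw accessors
alone were missing.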



Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Tetsuo Handa
Michal Hocko wrote:
> OK, that would suggest that the oom rework patches are not really
> related. They just moved from the livelock to a sleep which is good in
> general IMHO. We even know that it is most probably the IO that is the
> problem because we know that more than half of the reclaimable memory is
> either dirty or under writeback. That is where you should be looking.
> Why the IO is not making progress or such a slow progress.
> 

A footnote. Regarding this reproducer, the problem was "anybody can declare
OOM and call out_of_memory(). But out_of_memory() does nothing because there
is a thread which has TIF_MEMDIE." before the OOM detection rework patches,
and the problem is "nobody can declare OOM and call out_of_memory(). Although
out_of_memory() will do nothing because there is a thread which has
TIF_MEMDIE." after the OOM detection rework patches.

Dave Chinner wrote at http://lkml.kernel.org/r/20160211225929.GU14668@dastard :
> > Although there are memory allocating tasks passing gfp flags with
> > __GFP_KSWAPD_RECLAIM, kswapd is unable to make forward progress because
> > it is blocked at down() called from memory reclaim path. And since it is
> > legal to block kswapd from memory reclaim path (am I correct?), I think
> > we must not assume that current_is_kswapd() check will break the infinite
> > loop condition.
> 
> Right, the threads that are blocked in writeback waiting on memory
> reclaim will be using GFP_NOFS to prevent recursion deadlocks, but
> that does not avoid the problem that kswapd can then get stuck
> on those locks, too. Hence there is no guarantee that kswapd can
> make reclaim progress if it does dirty page writeback...

Unless we address the issue Dave commented on, the OOM detection rework patches
add a new location of livelock (which is demonstrated by this reproducer) in
the memory allocator. It is unfortunate that we add a new location of livelock
while we are trying to solve the thrashing problem.


Re: [PATCH v1 00/12] PCI: Rework shadow ROM handling

2016-03-11 Thread Alex Deucher
On Fri, Mar 11, 2016 at 8:09 PM, Linus Torvalds
 wrote:
> On Fri, Mar 11, 2016 at 4:49 PM, Andy Lutomirski  wrote:
>>
>> FWIW, if I disable all the checks in pci_get_rom_size, I learn that my
>> video ROM consists entirely of 0xff bytes.  Maybe there just isn't a
>> ROM shadow on my laptop.
>
> I think most laptops end up having the graphics ROM be part of the
> regular system flash, and there is no actual rom associated with the
> PCI device that is the GPU itself.
>
> The actual GPU ROM tends to be associated with plug-in cards, not
> soldered-down chips in a laptop where they don't want extra flash
> chips.

Right; on (at least AMD) mobile dGPUs and systems with APUs, the vbios
"rom" is part of the sbios image and is set up by the sbios when it
runs.  The driver either gets it from the legacy vga location or some
platform specific method such as ACPI.

Alex
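
For readers who want to repeat the observation quoted above (a ROM window that
reads back as all 0xff), here is a minimal userspace sketch. It is not the
logic of pci_get_rom_size(); the demo_* helpers are hypothetical and only
classify a buffer that has already been read out. The one hard fact it relies
on is that a PCI expansion ROM image begins with the signature bytes 0x55 0xAA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if every byte reads back as 0xff, as in the report quoted above. */
static bool demo_rom_is_blank(const uint8_t *rom, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (rom[i] != 0xff)
			return false;
	return true;
}

/* A PCI expansion ROM image starts with the bytes 0x55 0xAA. */
static bool demo_rom_has_signature(const uint8_t *rom, size_t len)
{
	return len >= 2 && rom[0] == 0x55 && rom[1] == 0xAA;
}

void demo_rom_selfcheck(void)
{
	const uint8_t blank[4] = { 0xff, 0xff, 0xff, 0xff };
	const uint8_t image[4] = { 0x55, 0xAA, 0x00, 0x00 };

	assert(demo_rom_is_blank(blank, sizeof(blank)));
	assert(!demo_rom_has_signature(blank, sizeof(blank)));
	assert(demo_rom_has_signature(image, sizeof(image)));
	assert(!demo_rom_is_blank(image, sizeof(image)));
}
```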


Re: [PATCH v11 6/9] arm64: kprobes instruction simulation support

2016-03-11 Thread Marc Zyngier
On Wed,  9 Mar 2016 00:32:20 -0500
David Long  wrote:

David,

> From: Sandeepa Prabhu 
> 
> Kprobes needs simulation of instructions that cannot be stepped
> from a different memory location, e.g.: those instructions
> that use PC-relative addressing. In simulation, the behaviour
> of the instruction is implemented using a copy of pt_regs.
> 
> The following instruction categories are simulated:
>  - All branching instructions(conditional, register, and immediate)
>  - Literal access instructions(load-literal, adr/adrp)
> 
> Conditional execution is limited to branching instructions in
> ARM v8. If conditions at PSTATE do not match the condition fields
> of opcode, the instruction is effectively NOP.
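
To make the description above concrete, here is a rough userspace sketch of
what simulating a conditional branch involves. It is not the code from this
patch: the demo_* names are made up, and only the A64 B.cond encoding (imm19 in
bits [23:5], condition in bits [3:0]) and the NZCV flag positions (bits 31..28
of PSTATE) are taken as given. When the condition does not hold, the sketch
simply steps to the next instruction, which is the "effectively NOP" case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* NZCV flag positions in the saved PSTATE. */
#define DEMO_PSR_N (UINT64_C(1) << 31)
#define DEMO_PSR_Z (UINT64_C(1) << 30)
#define DEMO_PSR_C (UINT64_C(1) << 29)
#define DEMO_PSR_V (UINT64_C(1) << 28)

/* Evaluate a 4-bit A64 condition code against the flags. */
static bool demo_cond_holds(uint32_t cond, uint64_t pstate)
{
	bool n = (pstate & DEMO_PSR_N) != 0;
	bool z = (pstate & DEMO_PSR_Z) != 0;
	bool c = (pstate & DEMO_PSR_C) != 0;
	bool v = (pstate & DEMO_PSR_V) != 0;
	bool r;

	switch (cond >> 1) {
	case 0: r = z; break;			/* EQ / NE */
	case 1: r = c; break;			/* CS / CC */
	case 2: r = n; break;			/* MI / PL */
	case 3: r = v; break;			/* VS / VC */
	case 4: r = c && !z; break;		/* HI / LS */
	case 5: r = (n == v); break;		/* GE / LT */
	case 6: r = !z && (n == v); break;	/* GT / LE */
	default: return true;			/* AL; 0b1111 is also "always" */
	}
	/* The low bit of the condition code inverts the test. */
	return (cond & 1) ? !r : r;
}

/* Return the PC that follows a B.cond located at address pc. */
static uint64_t demo_simulate_b_cond(uint32_t opcode, uint64_t pc,
				     uint64_t pstate)
{
	uint32_t cond = opcode & 0xf;			/* bits [3:0] */
	int64_t offset = (opcode >> 5) & 0x7ffff;	/* imm19, bits [23:5] */

	if (offset & 0x40000)				/* sign-extend 19 bits */
		offset -= 0x80000;
	offset *= 4;					/* counted in instructions */

	if (!demo_cond_holds(cond, pstate))
		return pc + 4;				/* behaves like a NOP */
	return pc + (uint64_t)offset;
}

void demo_b_cond_selfcheck(void)
{
	/* 0x54000040 encodes "b.eq .+8"; 0x54ffffe1 encodes "b.ne .-4". */
	assert(demo_simulate_b_cond(0x54000040u, 0x1000, DEMO_PSR_Z) == 0x1008);
	assert(demo_simulate_b_cond(0x54000040u, 0x1000, 0) == 0x1004);
	assert(demo_simulate_b_cond(0x54ffffe1u, 0x1000, 0) == 0x0ffc);
}
```

The branch target is computed from the original probe address rather than the
address of the out-of-line copy, which is why such instructions cannot simply
be single-stepped from a different location.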
> 
> Thanks to Will Cohen for assorted suggested changes.
> 
> Signed-off-by: Sandeepa Prabhu 
> Signed-off-by: William Cohen 
> Signed-off-by: David A. Long 
> ---
>  arch/arm64/include/asm/insn.h|   1 +
>  arch/arm64/include/asm/probes.h  |   5 +-
>  arch/arm64/kernel/Makefile   |   3 +-
>  arch/arm64/kernel/insn.c |   1 +
>  arch/arm64/kernel/kprobes-arm64.c|  29 
>  arch/arm64/kernel/kprobes.c  |  32 -
>  arch/arm64/kernel/probes-simulate-insn.c | 218 +++
>  arch/arm64/kernel/probes-simulate-insn.h |  28 
>  8 files changed, 311 insertions(+), 6 deletions(-)
>  create mode 100644 arch/arm64/kernel/probes-simulate-insn.c
>  create mode 100644 arch/arm64/kernel/probes-simulate-insn.h
> 
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index b9567a1..26cee10 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -410,6 +410,7 @@ u32 aarch32_insn_mcr_extract_crm(u32 insn);
>  
>  typedef bool (pstate_check_t)(unsigned long);
>  extern pstate_check_t * const opcode_condition_checks[16];
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif   /* __ASM_INSN_H */
> diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h
> index c5fcbe6..d524f7d 100644
> --- a/arch/arm64/include/asm/probes.h
> +++ b/arch/arm64/include/asm/probes.h
> @@ -15,11 +15,12 @@
>  #ifndef _ARM_PROBES_H
>  #define _ARM_PROBES_H
>  
> +#include 
> +
>  struct kprobe;
>  struct arch_specific_insn;
>  
>  typedef u32 kprobe_opcode_t;
> -typedef unsigned long (kprobes_pstate_check_t)(unsigned long);
>  typedef void (kprobes_handler_t) (u32 opcode, long addr, struct pt_regs *);
>  
>  enum pc_restore_type {
> @@ -35,7 +36,7 @@ struct kprobe_pc_restore {
>  /* architecture specific copy of original instruction */
>  struct arch_specific_insn {
>   kprobe_opcode_t *insn;
> - kprobes_pstate_check_t *pstate_cc;
> + pstate_check_t *pstate_cc;
>   kprobes_handler_t *handler;
>   /* restore address after step xol */
>   struct kprobe_pc_restore restore;
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 4efb791..08325e5 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -36,7 +36,8 @@ arm64-obj-$(CONFIG_CPU_PM)  += sleep.o suspend.o
>  arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
>  arm64-obj-$(CONFIG_JUMP_LABEL)   += jump_label.o
>  arm64-obj-$(CONFIG_KGDB) += kgdb.o
> -arm64-obj-$(CONFIG_KPROBES)  += kprobes.o kprobes-arm64.o
> +arm64-obj-$(CONFIG_KPROBES)  += kprobes.o kprobes-arm64.o \
> +                                probes-simulate-insn.o
>  arm64-obj-$(CONFIG_EFI)  += efi.o efi-entry.stub.o
>  arm64-obj-$(CONFIG_PCI)  += pci.o
>  arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 9f15ceb..f9a3432 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #define AARCH64_INSN_SF_BIT  BIT(31)
> diff --git a/arch/arm64/kernel/kprobes-arm64.c b/arch/arm64/kernel/kprobes-arm64.c
> index e07727a..487238a 100644
> --- a/arch/arm64/kernel/kprobes-arm64.c
> +++ b/arch/arm64/kernel/kprobes-arm64.c
> @@ -21,6 +21,7 @@
>  #include 
>  
>  #include "kprobes-arm64.h"
> +#include "probes-simulate-insn.h"
>  
>  static bool __kprobes aarch64_insn_is_steppable(u32 insn)
>  {
> @@ -62,8 +63,36 @@ arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
>*/
>   if (aarch64_insn_is_steppable(insn))
>   return INSN_GOOD;
> +
> + if (aarch64_insn_is_bcond(insn)) {
> + asi->handler = simulate_b_cond;
> + } else if (aarch64_insn_is_cbz(insn) ||
> + aarch64_insn_is_cbnz(insn)) {
> + asi->handler = simulate_cbz_cbnz;
> + } else if (aarch64_insn_is_tbz(insn) ||
> + aarch64_insn_is_tbnz(insn)) {
> + asi->handler = simulate_tbz_tbnz;
> + } else if (aarch64_insn_is_adr_adrp(insn))
> +   

[GIT PULL] target fixes for v4.5

2016-03-11 Thread Nicholas A. Bellinger
Hi Linus,

Here is the outstanding target-core bug-fix for v4.5 code.

Please go ahead and pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git master

This patch addresses a recent Task Management (TMR) regression related
to the larger set of multi-port LUN_RESET bug-fixes in v4.5-rc5.

It drops a left-over target_put_sess_cmd() of se_cmd->cmd_kref within
the ABORT_TASK failure path. That path is taken once a se_cmd descriptor
has already completed posting its response to fabric driver logic, in
which case it must be skipped during the normal ABORT_TASK se_cmd->tag
lookup.

Thank you,

--nab

Nicholas Bellinger (1):
  target: Drop incorrect ABORT_TASK put for completed commands

 drivers/target/target_core_tmr.c | 1 -
 1 file changed, 1 deletion(-)



Re: [PATCH v4 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-03-11 Thread Guenter Roeck

On 03/10/2016 06:17 PM, Huang Rui wrote:

This patch adds a member to fam15h_power_data that holds the accumulated
power of each compute unit. It adds do_read_registers_on_cu(), which
performs all the MSR reads, and runs it on one online core of each
compute unit with smp_call_function_many(). This reduces the number of
IPIs.

Suggested-by: Borislav Petkov 
Signed-off-by: Huang Rui 
---
  drivers/hwmon/fam15h_power.c | 61 +++-
  1 file changed, 60 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4f695d8..c5e2297 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -25,6 +25,8 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  #include 
  #include 

@@ -44,7 +46,9 @@ MODULE_LICENSE("GPL");

  #define FAM15H_MIN_NUM_ATTRS  2
  #define FAM15H_NUM_GROUPS 2
+#define MAX_CUS   8

+#define MSR_F15H_CU_PWR_ACCUMULATOR   0xc001007a
  #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR   0xc001007b

  #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
@@ -59,6 +63,8 @@ struct fam15h_power_data {
struct attribute_group group;
/* maximum accumulated power of a compute unit */
u64 max_cu_acc_power;
+   /* accumulated power of the compute units */
+   u64 cu_acc_power[MAX_CUS];
  };

  static ssize_t show_power(struct device *dev,
@@ -125,6 +131,59 @@ static ssize_t show_power_crit(struct device *dev,
  }
  static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);

+static void do_read_registers_on_cu(void *_data)
+{
+   struct fam15h_power_data *data = _data;
+   int cpu, cu;
+
+   cpu = smp_processor_id();
+
+   cu = cpu / smp_num_siblings;
+


If SMP is not configured:

drivers/hwmon/fam15h_power.c: In function ‘do_read_registers_on_cu’:
drivers/hwmon/fam15h_power.c:144:13: error: ‘smp_num_siblings’ undeclared (first use in this function)

Guenter


+   rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, &data->cu_acc_power[cu]);
+}
+
+/*
+ * This function is only able to be called when CPUID
+ * Fn8000_0007:EDX[12] is set.
+ */
+static int read_registers(struct fam15h_power_data *data)
+{
+   int this_cpu, ret, cpu;
+   int target;
+   cpumask_var_t mask;
+
+   ret = zalloc_cpumask_var(&mask, GFP_KERNEL);
+   if (!ret)
+   return -ENOMEM;
+
+   get_online_cpus();
+   this_cpu = get_cpu();
+
+   /*
+* Choose the first online core of each compute unit, and then
+* read their MSR value of power and ptsc in a single IPI,
+* because the MSR value of CPU core represent the compute
+* unit's.
+*/
+   for_each_online_cpu(cpu) {
+   target = cpumask_first(topology_sibling_cpumask(cpu));
+   if (!cpumask_test_cpu(target, mask))
+   cpumask_set_cpu(target, mask);
+   }
+
+   if (cpumask_test_cpu(this_cpu, mask))
+   do_read_registers_on_cu(data);
+
+   smp_call_function_many(mask, do_read_registers_on_cu, data, true);
+   put_cpu();
+   put_online_cpus();
+
+   free_cpumask_var(mask);
+
+   return 0;
+}
+
  static int fam15h_power_init_attrs(struct pci_dev *pdev,
   struct fam15h_power_data *data)
  {
@@ -263,7 +322,7 @@ static int fam15h_power_init_data(struct pci_dev *f4,

data->max_cu_acc_power = tmp;

-   return 0;
+   return read_registers(data);
  }

  static int fam15h_power_probe(struct pci_dev *pdev,
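
The selection loop in read_registers() above picks one online core per
compute unit so that each unit is read by a single IPI target. A rough
user-space model of that idea follows. It uses plain bitmasks instead of
cpumasks, and first_cpu_per_cu(), NR_CPUS_MODEL and the example topology
table are assumptions made for illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL 16	/* illustrative limit, not the kernel's NR_CPUS */

/* Example topology: two cores per compute unit (hypothetical). */
static const int example_cu_of[8] = { 0, 0, 1, 1, 2, 2, 3, 3 };

/*
 * cu_of[cpu] gives the compute unit of each CPU; bit n of 'online' is
 * set when CPU n is online. Returns a mask containing the
 * lowest-numbered online CPU of every compute unit.
 */
static uint32_t first_cpu_per_cu(const int *cu_of, int nr_cpus,
				 uint32_t online)
{
	uint32_t seen_cus = 0;
	uint32_t mask = 0;

	assert(nr_cpus >= 0 && nr_cpus <= NR_CPUS_MODEL);

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		assert(cu_of[cpu] >= 0 && cu_of[cpu] < 32);

		if (!(online & (UINT32_C(1) << cpu)))
			continue;	/* offline CPUs cannot take the IPI */
		if (seen_cus & (UINT32_C(1) << cu_of[cpu]))
			continue;	/* this unit already has a reader */

		seen_cus |= UINT32_C(1) << cu_of[cpu];
		mask |= UINT32_C(1) << cpu;
	}
	return mask;
}
```

With all eight example CPUs online the result is CPUs 0, 2, 4 and 6; if
CPU 0 goes offline, CPU 1 takes over for its compute unit.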





Re: [PATCH v1 13/19] zsmalloc: factor page chain functionality out

2016-03-11 Thread xuyiping



On 2016/3/11 15:30, Minchan Kim wrote:

For migration, we need to create the sub-page chain of a zspage
dynamically, so this patch factors that logic out of alloc_zspage().

As a minor refactoring, it also makes the OBJ_ALLOCATED_TAG assignment
in obj_malloc() clearer (this could be a separate patch, but it is
trivial, so I folded it into this one).

Signed-off-by: Minchan Kim 
---
  mm/zsmalloc.c | 78 ++-
  1 file changed, 45 insertions(+), 33 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index bfc6a048afac..f86f8aaeb902 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -977,7 +977,9 @@ static void init_zspage(struct size_class *class, struct page *first_page)
unsigned long off = 0;
struct page *page = first_page;

-   VM_BUG_ON_PAGE(!is_first_page(first_page), first_page);
+   first_page->freelist = NULL;
+   INIT_LIST_HEAD(&first_page->lru);
+   set_zspage_inuse(first_page, 0);

while (page) {
struct page *next_page;
@@ -1022,13 +1024,44 @@ static void init_zspage(struct size_class *class, struct page *first_page)
set_freeobj(first_page, 0);
  }

+static void create_page_chain(struct page *pages[], int nr_pages)
+{
+   int i;
+   struct page *page;
+   struct page *prev_page = NULL;
+   struct page *first_page = NULL;
+
+   for (i = 0; i < nr_pages; i++) {
+   page = pages[i];
+
+   INIT_LIST_HEAD(&page->lru);
+   if (i == 0) {
+   SetPagePrivate(page);
+   set_page_private(page, 0);
+   first_page = page;
+   }
+
+   if (i == 1)
+   set_page_private(first_page, (unsigned long)page);
+   if (i >= 1)
+   set_page_private(page, (unsigned long)first_page);
+   if (i >= 2)
+   list_add(&page->lru, &prev_page->lru);
+   if (i == nr_pages - 1)
+   SetPagePrivate2(page);
+
+   prev_page = page;
+   }
+}
+
  /*
   * Allocate a zspage for the given size class
   */
  static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
  {
-   int i, error;
+   int i;
struct page *first_page = NULL, *uninitialized_var(prev_page);
+   struct page *pages[ZS_MAX_PAGES_PER_ZSPAGE];

/*
 * Allocate individual pages and link them together as:
@@ -1041,43 +1074,23 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)


*uninitialized_var(prev_page) in alloc_zspage() is no longer used.


 * (i.e. no other sub-page has this flag set) and PG_private_2 to
 * identify the last page.
 */
-   error = -ENOMEM;
for (i = 0; i < class->pages_per_zspage; i++) {
struct page *page;

page = alloc_page(flags);
-   if (!page)
-   goto cleanup;
-
-   INIT_LIST_HEAD(&page->lru);
-   if (i == 0) {   /* first page */
-   page->freelist = NULL;
-   SetPagePrivate(page);
-   set_page_private(page, 0);
-   first_page = page;
-   set_zspage_inuse(page, 0);
+   if (!page) {
+   while (--i >= 0)
+   __free_page(pages[i]);
+   return NULL;
}
-   if (i == 1)
-   set_page_private(first_page, (unsigned long)page);
-   if (i >= 1)
-   set_page_private(page, (unsigned long)first_page);
-   if (i >= 2)
-   list_add(&page->lru, &prev_page->lru);
-   if (i == class->pages_per_zspage - 1)/* last page */
-   SetPagePrivate2(page);
-   prev_page = page;
+
+   pages[i] = page;
}

+   create_page_chain(pages, class->pages_per_zspage);
+   first_page = pages[0];
init_zspage(class, first_page);

-   error = 0; /* Success */
-
-cleanup:
-   if (unlikely(error) && first_page) {
-   free_zspage(first_page);
-   first_page = NULL;
-   }
-
return first_page;
  }

@@ -1419,7 +1432,6 @@ static unsigned long obj_malloc(struct size_class *class,
unsigned long m_offset;
void *vaddr;

-   handle |= OBJ_ALLOCATED_TAG;
obj = get_freeobj(first_page);
objidx_to_page_and_ofs(class, first_page, obj,
&m_page, &m_offset);
@@ -1429,10 +1441,10 @@ static unsigned long obj_malloc(struct size_class *class,
set_freeobj(first_page, link->next >> OBJ_ALLOCATED_TAG);
if (!class->huge)
/* record handle in the header of allocated chunk */
-   link->handle = handle;
+   link->handle = handle | OBJ_ALLOCATED_TAG;
else
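
The linking rules that create_page_chain() applies in the patch above can
be modelled in user space roughly as follows. This is only a sketch:
struct fake_page and chain_fake_pages() are hypothetical stand-ins, the
two booleans replace PG_private and PG_private_2, and a single next
pointer replaces the lru list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct page; field names are illustrative. */
struct fake_page {
	uintptr_t private;		/* models page_private() */
	bool is_first;			/* models PG_private */
	bool is_last;			/* models PG_private_2 */
	struct fake_page *lru_next;	/* models the lru linkage */
};

static void chain_fake_pages(struct fake_page *pages[], int nr_pages)
{
	struct fake_page *first = NULL, *prev = NULL;

	assert(nr_pages > 0);

	for (int i = 0; i < nr_pages; i++) {
		struct fake_page *page = pages[i];

		page->lru_next = NULL;
		page->is_first = false;
		page->is_last = false;

		if (i == 0) {
			page->is_first = true;
			page->private = 0;
			first = page;
		}
		if (i == 1)	/* the first page points at the second */
			first->private = (uintptr_t)page;
		if (i >= 1)	/* every other page points back at the first */
			page->private = (uintptr_t)first;
		if (i >= 2)	/* pages after the second hang off their predecessor */
			prev->lru_next = page;
		if (i == nr_pages - 1)
			page->is_last = true;

		prev = page;
	}
}
```

Because the helper only takes an array of pages, the same routine can
serve both a fresh allocation and a later rebuild of the chain, which is
the reason the commit message gives for factoring it out.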
   

[PATCH v7 01/10] tpm: Get rid of chip->pdev

2016-03-11 Thread Stefan Berger
From: Jason Gunthorpe 

This is a holdover from before the struct device conversion.

- All prints should be using &chip->dev, which is the Linux
  standard. This changes prints to use tpm0 as the device name,
  not the PnP/etc ID.
- The few places involving sysfs/modules that really do need the
  parent just use chip->dev.parent instead
- We no longer need to get_device(pdev) in any places since it is no
  longer used by any of the code. The kref on the parent is held
  by the device core during device_add and dropped in device_del

Signed-off-by: Jason Gunthorpe 
Signed-off-by: Stefan Berger 
Tested-by: Stefan Berger 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 
Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-chip.c | 15 ++-
 drivers/char/tpm/tpm-dev.c  |  4 +---
 drivers/char/tpm/tpm-interface.c| 30 --
 drivers/char/tpm/tpm-sysfs.c|  6 +++---
 drivers/char/tpm/tpm.h  |  3 +--
 drivers/char/tpm/tpm2-cmd.c |  8 
 drivers/char/tpm/tpm_atmel.c| 14 +++---
 drivers/char/tpm/tpm_i2c_atmel.c| 16 
 drivers/char/tpm/tpm_i2c_infineon.c |  6 +++---
 drivers/char/tpm/tpm_i2c_nuvoton.c  | 22 +++---
 drivers/char/tpm/tpm_infineon.c | 22 +++---
 drivers/char/tpm/tpm_nsc.c  | 20 ++--
 drivers/char/tpm/tpm_tis.c  | 16 
 13 files changed, 89 insertions(+), 93 deletions(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 274dd01..12829dd 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -49,7 +49,7 @@ struct tpm_chip *tpm_chip_find_get(int chip_num)
if (chip_num != TPM_ANY_NUM && chip_num != pos->dev_num)
continue;
 
-   if (try_module_get(pos->pdev->driver->owner)) {
+   if (try_module_get(pos->dev.parent->driver->owner)) {
chip = pos;
break;
}
@@ -113,13 +113,11 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
 
scnprintf(chip->devname, sizeof(chip->devname), "tpm%d", chip->dev_num);
 
-   chip->pdev = dev;
-
dev_set_drvdata(dev, chip);
 
chip->dev.class = tpm_class;
chip->dev.release = tpm_dev_release;
-   chip->dev.parent = chip->pdev;
+   chip->dev.parent = dev;
 #ifdef CONFIG_ACPI
chip->dev.groups = chip->groups;
 #endif
@@ -134,7 +132,7 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
device_initialize(&chip->dev);
 
cdev_init(&chip->cdev, &tpm_fops);
-   chip->cdev.owner = chip->pdev->driver->owner;
+   chip->cdev.owner = dev->driver->owner;
chip->cdev.kobj.parent = &chip->dev.kobj;
 
rc = devm_add_action(dev, (void (*)(void *)) put_device, &chip->dev);
@@ -241,9 +239,8 @@ int tpm_chip_register(struct tpm_chip *chip)
chip->flags |= TPM_CHIP_FLAG_REGISTERED;
 
if (!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
-   rc = __compat_only_sysfs_link_entry_to_kobj(&chip->pdev->kobj,
-   &chip->dev.kobj,
-   "ppi");
+   rc = __compat_only_sysfs_link_entry_to_kobj(
+   &chip->dev.parent->kobj, &chip->dev.kobj, "ppi");
if (rc && rc != -ENOENT) {
tpm_chip_unregister(chip);
return rc;
@@ -278,7 +275,7 @@ void tpm_chip_unregister(struct tpm_chip *chip)
synchronize_rcu();
 
if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
-   sysfs_remove_link(&chip->pdev->kobj, "ppi");
+   sysfs_remove_link(&chip->dev.parent->kobj, "ppi");
 
tpm1_chip_unregister(chip);
tpm_del_char_device(chip);
diff --git a/drivers/char/tpm/tpm-dev.c b/drivers/char/tpm/tpm-dev.c
index de0337e..4009765 100644
--- a/drivers/char/tpm/tpm-dev.c
+++ b/drivers/char/tpm/tpm-dev.c
@@ -61,7 +61,7 @@ static int tpm_open(struct inode *inode, struct file *file)
 * by the check of is_open variable, which is protected
 * by driver_lock. */
if (test_and_set_bit(0, &chip->is_open)) {
-   dev_dbg(chip->pdev, "Another process owns this TPM\n");
+   dev_dbg(&chip->dev, "Another process owns this TPM\n");
return -EBUSY;
}
 
@@ -79,7 +79,6 @@ static int tpm_open(struct inode *inode, struct file *file)
INIT_WORK(&priv->work, timeout_work);
 
file->private_data = priv;
-   get_device(chip->pdev);
return 0;
 }
 
@@ -166,7 +165,6 @@ static int tpm_release(struct inode *inode, struct file *file)
file->private_data = NULL;
atomic_set(&priv->data_pending, 0);
clear_bit(0, &priv->chip->is_open);
-   put_device(priv->chip->pdev);
kfree(priv);
return 0;
 }
diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index e2fa89c..483f86f 

[PATCH v7 03/10] tpm: Provide strong locking for device removal

2016-03-11 Thread Stefan Berger
From: Jason Gunthorpe 

Add a read/write semaphore around the ops function pointers so
ops can be set to null when the driver un-registers.

Previously the tpm core expected module locking to be enough to
ensure that tpm_unregister could not be called during certain times,
however that hasn't been sufficient for a long time.

Introduce a read/write semaphore around 'ops' so the core can set
it to null when unregistering. This provides a strong fence around
the driver callbacks, guaranteeing to the driver that no callbacks
are running or will run again.

For now the ops lock (ops_sem) is taken very high in the call stack; it
could be pushed down and made more granular in future if necessary.

Signed-off-by: Jason Gunthorpe 
Reviewed-by: Stefan Berger 
Reviewed-by: Jarkko Sakkinen 
Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-chip.c  | 72 
 drivers/char/tpm/tpm-dev.c   | 11 +-
 drivers/char/tpm/tpm-interface.c | 19 ++-
 drivers/char/tpm/tpm-sysfs.c |  5 +++
 drivers/char/tpm/tpm.h   | 14 +---
 5 files changed, 100 insertions(+), 21 deletions(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index c21d81c..5793ea1 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -36,10 +36,60 @@ static DEFINE_SPINLOCK(driver_lock);
 struct class *tpm_class;
 dev_t tpm_devt;
 
-/*
- * tpm_chip_find_get - return tpm_chip for a given chip number
- * @chip_num the device number for the chip
+/**
+ * tpm_try_get_ops() - Get a ref to the tpm_chip
+ * @chip: Chip to ref
+ *
+ * The caller must already have some kind of locking to ensure that chip is
+ * valid. This function will lock the chip so that the ops member can be
+ * accessed safely. The locking prevents tpm_chip_unregister from
+ * completing, so it should not be held for long periods.
+ *
+ * Returns -ERRNO if the chip could not be got.
  */
+int tpm_try_get_ops(struct tpm_chip *chip)
+{
+   int rc = -EIO;
+
+   get_device(&chip->dev);
+
+   down_read(&chip->ops_sem);
+   if (!chip->ops)
+   goto out_lock;
+
+   if (!try_module_get(chip->dev.parent->driver->owner))
+   goto out_lock;
+
+   return 0;
+out_lock:
+   up_read(&chip->ops_sem);
+   put_device(&chip->dev);
+   return rc;
+}
+EXPORT_SYMBOL_GPL(tpm_try_get_ops);
+
+/**
+ * tpm_put_ops() - Release a ref to the tpm_chip
+ * @chip: Chip to put
+ *
+ * This is the opposite pair to tpm_try_get_ops(). After this returns chip may
+ * be kfree'd.
+ */
+void tpm_put_ops(struct tpm_chip *chip)
+{
+   module_put(chip->dev.parent->driver->owner);
+   up_read(&chip->ops_sem);
+   put_device(&chip->dev);
+}
+EXPORT_SYMBOL_GPL(tpm_put_ops);
+
+/**
+ * tpm_chip_find_get() - return tpm_chip for a given chip number
+ * @chip_num: id to find
+ *
+ * The return'd chip has been tpm_try_get_ops'd and must be released via
+ * tpm_put_ops
+  */
 struct tpm_chip *tpm_chip_find_get(int chip_num)
 {
struct tpm_chip *pos, *chip = NULL;
@@ -49,10 +99,10 @@ struct tpm_chip *tpm_chip_find_get(int chip_num)
if (chip_num != TPM_ANY_NUM && chip_num != pos->dev_num)
continue;
 
-   if (try_module_get(pos->dev.parent->driver->owner)) {
+   /* rcu prevents chip from being free'd */
+   if (!tpm_try_get_ops(pos))
chip = pos;
-   break;
-   }
+   break;
}
rcu_read_unlock();
return chip;
@@ -95,6 +145,7 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
return ERR_PTR(-ENOMEM);
 
mutex_init(&chip->tpm_mutex);
+   init_rwsem(&chip->ops_sem);
INIT_LIST_HEAD(&chip->list);
 
chip->ops = ops;
@@ -180,6 +231,12 @@ static int tpm_add_char_device(struct tpm_chip *chip)
 static void tpm_del_char_device(struct tpm_chip *chip)
 {
cdev_del(&chip->cdev);
+
+   /* Make the driver uncallable. */
+   down_write(&chip->ops_sem);
+   chip->ops = NULL;
+   up_write(&chip->ops_sem);
+
device_del(&chip->dev);
 }
 
@@ -265,6 +322,9 @@ EXPORT_SYMBOL_GPL(tpm_chip_register);
  * Takes the chip first away from the list of available TPM chips and then
  * cleans up all the resources reserved by tpm_chip_register().
  *
+ * Once this function returns the driver callbacks in 'ops' will not be
+ * running and will no longer start.
+ *
  * NOTE: This function should be only called before deinitializing chip
  * resources.
  */
diff --git a/drivers/char/tpm/tpm-dev.c b/drivers/char/tpm/tpm-dev.c
index 4009765..f5d4521 100644
--- a/drivers/char/tpm/tpm-dev.c
+++ b/drivers/char/tpm/tpm-dev.c
@@ -136,9 +136,18 @@ static ssize_t tpm_write(struct file *file, const char __user *buf,
return -EFAULT;
}
 
-   /* atomic tpm command send and result receive */
+   /* atomic tpm command send and result receive. We only hold the ops
+* lock during this period so that the tpm can 

[PATCH v7 00/10] Multi-instance vTPM proxy driver

2016-03-11 Thread Stefan Berger
The following series of patches implements a multi-instance vTPM 
proxy driver that can dynamically create TPM 'server' and client device
pairs.

Using an ioctl on the provided /dev/vtpmx, a client-side vTPM device
and a server-side file descriptor are created. The file descriptor must
be passed to a TPM emulator. The device driver will initialize the
emulated TPM using TPM 1.2 or TPM 2 startup commands, and for TPM 1.2
it will also read the command durations from the device. The choice
of emulated TPM device (1.2 or 2) must be provided with a flag in
the ioctl.
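
For illustration only, a user-space caller might look roughly like the
sketch below. The structure layout is copied from the documentation patch
in this series; a real program should include <linux/vtpm_proxy.h> instead.
The request macro DEMO_VTPM_IOC_NEW_DEV and its _IOWR encoding (magic 0xa1,
sequence 0) are assumptions based on the ioctl-number.txt entry, and
demo_vtpm_create() is a hypothetical helper, not part of the driver.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Layout as given in Documentation/tpm/tpm_vtpm_proxy.txt of this series. */
struct vtpm_proxy_new_dev {
	uint32_t flags;   /* input */
	uint32_t tpm_num; /* output */
	uint32_t fd;      /* output */
	uint32_t major;   /* output */
	uint32_t minor;   /* output */
};

/* ASSUMPTION: request encoding; take the real value from the uapi header. */
#define DEMO_VTPM_IOC_NEW_DEV _IOWR(0xa1, 0x00, struct vtpm_proxy_new_dev)

/*
 * Open the control device at ctl_path, ask for a new device pair and
 * fill *out. Returns 0 on success, -1 on failure with errno set.
 */
static int demo_vtpm_create(const char *ctl_path, uint32_t flags,
			    struct vtpm_proxy_new_dev *out)
{
	int ctl, rc, saved;

	ctl = open(ctl_path, O_RDWR);
	if (ctl < 0)
		return -1;

	memset(out, 0, sizeof(*out));
	out->flags = flags;

	rc = ioctl(ctl, DEMO_VTPM_IOC_NEW_DEV, out);
	saved = errno;
	close(ctl);          /* the returned out->fd is a separate descriptor */
	errno = saved;

	return rc < 0 ? -1 : 0;
}
```

On success, out->fd would be handed to the TPM emulator, and out->major and
out->minor would be used to create the character device node inside the
container. Per the documentation patch, errno is expected to be EOPNOTSUPP
for unsupported flags.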

The driver is based on a recent checkout of James Morris's 'next'
branch and uses several recently posted patches from Jason and Jarkko.

   Stefan

v6->v7:
 - Adjusted name of driver to tpm_vtpm_proxy from tpm_vtpm. Adjust function
   names, names of structures, and names of constants.
 - Adjusted IOCTL to use magic 0xa1 rather than the completely used 0xa0.
 - Extended driver documentation and added documentation of ioctl.
 - Moved test program to own project (dropped patch 11).

v5->v6:
 - Adapted errno's for unsupported flags and ioctls following Jason's comments

v4->v5:
 - Introduced different error codes for unsupported flags and ioctls
 - Added documentation patch

Jason Gunthorpe (4):
  tpm: Get rid of chip->pdev
  tpm: Get rid of devname
  tpm: Provide strong locking for device removal
  tpm: Split out the devm stuff from tpmm_chip_alloc

Stefan Berger (6):
  tpm: Get rid of module locking
  tpm: Replace device number bitmap with IDR
  tpm: Introduce TPM_CHIP_FLAG_VIRTUAL
  tpm: Proxy driver for supporting multiple emulated TPMs
  tpm: Initialize TPM and get durations and timeouts
  tpm: Add documentation for the tpm_vtpm device driver

 Documentation/ioctl/ioctl-number.txt |   1 +
 Documentation/tpm/tpm_vtpm_proxy.txt |  71 
 drivers/char/tpm/Kconfig |  10 +
 drivers/char/tpm/Makefile|   1 +
 drivers/char/tpm/tpm-chip.c  | 221 
 drivers/char/tpm/tpm-dev.c   |  15 +-
 drivers/char/tpm/tpm-interface.c |  50 +--
 drivers/char/tpm/tpm-sysfs.c |  22 +-
 drivers/char/tpm/tpm.h   |  30 +-
 drivers/char/tpm/tpm2-cmd.c  |   8 +-
 drivers/char/tpm/tpm_atmel.c |  14 +-
 drivers/char/tpm/tpm_eventlog.c  |   2 +-
 drivers/char/tpm/tpm_eventlog.h  |   2 +-
 drivers/char/tpm/tpm_i2c_atmel.c |  16 +-
 drivers/char/tpm/tpm_i2c_infineon.c  |   6 +-
 drivers/char/tpm/tpm_i2c_nuvoton.c   |  24 +-
 drivers/char/tpm/tpm_infineon.c  |  22 +-
 drivers/char/tpm/tpm_nsc.c   |  20 +-
 drivers/char/tpm/tpm_tis.c   |  18 +-
 drivers/char/tpm/tpm_vtpm_proxy.c| 644 +++
 include/uapi/linux/Kbuild|   1 +
 include/uapi/linux/vtpm_proxy.h  |  42 +++
 22 files changed, 1061 insertions(+), 179 deletions(-)
 create mode 100644 Documentation/tpm/tpm_vtpm_proxy.txt
 create mode 100644 drivers/char/tpm/tpm_vtpm_proxy.c
 create mode 100644 include/uapi/linux/vtpm_proxy.h

-- 
2.4.3



[PATCH v7 09/10] tpm: Initialize TPM and get durations and timeouts

2016-03-11 Thread Stefan Berger
Add the retrieval of TPM 1.2 durations and timeouts. Since this requires
the startup of the TPM, do this for TPM 1.2 and TPM 2.

Signed-off-by: Stefan Berger 
CC: linux-kernel@vger.kernel.org
CC: linux-...@vger.kernel.org
CC: linux-...@vger.kernel.org
---
 drivers/char/tpm/tpm_vtpm_proxy.c | 95 +++
 1 file changed, 86 insertions(+), 9 deletions(-)

diff --git a/drivers/char/tpm/tpm_vtpm_proxy.c b/drivers/char/tpm/tpm_vtpm_proxy.c
index d73944e..9dedf48 100644
--- a/drivers/char/tpm/tpm_vtpm_proxy.c
+++ b/drivers/char/tpm/tpm_vtpm_proxy.c
@@ -45,8 +45,11 @@ struct proxy_dev {
size_t req_len;  /* length of queued TPM request */
size_t resp_len; /* length of queued TPM response */
u8 buffer[TPM_BUFSIZE];  /* request/response buffer */
+
+   struct work_struct work; /* task that retrieves TPM timeouts */
 };
 
+static struct workqueue_struct *workqueue;
 
 static void vtpm_proxy_delete_device(struct proxy_dev *proxy_dev);
 
@@ -67,6 +70,15 @@ static ssize_t vtpm_proxy_fops_read(struct file *filp, char __user *buf,
size_t len;
int sig, rc;
 
+   mutex_lock(&proxy_dev->buf_lock);
+
+   if (!(proxy_dev->state & STATE_OPENED_FLAG)) {
+   mutex_unlock(&proxy_dev->buf_lock);
+   return -EPIPE;
+   }
+
+   mutex_unlock(&proxy_dev->buf_lock);
+
sig = wait_event_interruptible(proxy_dev->wq, proxy_dev->req_len != 0);
if (sig)
return -EINTR;
@@ -110,6 +122,11 @@ static ssize_t vtpm_proxy_fops_write(struct file *filp, const char __user *buf,
 
mutex_lock(&proxy_dev->buf_lock);
 
+   if (!(proxy_dev->state & STATE_OPENED_FLAG)) {
+   mutex_unlock(&proxy_dev->buf_lock);
+   return -EPIPE;
+   }
+
if (count > sizeof(proxy_dev->buffer) ||
!(proxy_dev->state & STATE_WAIT_RESPONSE_FLAG)) {
mutex_unlock(&proxy_dev->buf_lock);
@@ -154,6 +171,9 @@ static unsigned int vtpm_proxy_fops_poll(struct file *filp, poll_table *wait)
if (proxy_dev->req_len)
ret |= POLLIN | POLLRDNORM;
 
+   if (!(proxy_dev->state & STATE_OPENED_FLAG))
+   ret |= POLLHUP;
+
mutex_unlock(&proxy_dev->buf_lock);
 
return ret;
@@ -341,6 +361,55 @@ static const struct tpm_class_ops vtpm_proxy_tpm_ops = {
 };
 
 /*
+ * Code related to the startup of the TPM 2 and startup of TPM 1.2 +
+ * retrieval of timeouts and durations.
+ */
+
+static void vtpm_proxy_work(struct work_struct *work)
+{
+   struct proxy_dev *proxy_dev = container_of(work, struct proxy_dev,
+  work);
+   int rc;
+
+   if (proxy_dev->flags & VTPM_PROXY_FLAG_TPM2)
+   rc = tpm2_startup(proxy_dev->chip, TPM2_SU_CLEAR);
+   else
+   rc = tpm_get_timeouts(proxy_dev->chip);
+
+   if (rc)
+   goto err;
+
+   rc = tpm_chip_register(proxy_dev->chip);
+   if (rc)
+   goto err;
+
+   return;
+
+err:
+   vtpm_proxy_fops_undo_open(proxy_dev);
+}
+
+/*
+ * vtpm_proxy_work_stop: make sure the work has finished
+ *
+ * This function is useful when user space closed the fd
+ * while the driver still determines timeouts.
+ */
+static void vtpm_proxy_work_stop(struct proxy_dev *proxy_dev)
+{
+   vtpm_proxy_fops_undo_open(proxy_dev);
+   flush_work(&proxy_dev->work);
+}
+
+/*
+ * vtpm_proxy_work_start: Schedule the work for TPM 1.2 & 2 initialization
+ */
+static inline void vtpm_proxy_work_start(struct proxy_dev *proxy_dev)
+{
+   queue_work(workqueue, &proxy_dev->work);
+}
+
+/*
  * Code related to creation and deletion of device pairs
  */
 static struct proxy_dev *vtpm_proxy_create_proxy_dev(void)
@@ -355,6 +424,7 @@ static struct proxy_dev *vtpm_proxy_create_proxy_dev(void)
 
init_waitqueue_head(&proxy_dev->wq);
mutex_init(&proxy_dev->buf_lock);
+   INIT_WORK(&proxy_dev->work, vtpm_proxy_work);
 
chip = tpm_chip_alloc(NULL, &vtpm_proxy_tpm_ops);
if (IS_ERR(chip)) {
@@ -425,9 +495,7 @@ static struct file *vtpm_proxy_create_device(
if (proxy_dev->flags & VTPM_PROXY_FLAG_TPM2)
proxy_dev->chip->flags |= TPM_CHIP_FLAG_TPM2;
 
-   rc = tpm_chip_register(proxy_dev->chip);
-   if (rc)
-   goto err_vtpm_fput;
+   vtpm_proxy_work_start(proxy_dev);
 
vtpm_new_dev->fd = fd;
vtpm_new_dev->major = MAJOR(proxy_dev->chip->dev.devt);
@@ -436,12 +504,6 @@ static struct file *vtpm_proxy_create_device(
 
return file;
 
-err_vtpm_fput:
-   put_unused_fd(fd);
-   fput(file);
-
-   return ERR_PTR(rc);
-
 err_put_unused_fd:
put_unused_fd(fd);
 
@@ -456,6 +518,8 @@ err_delete_proxy_dev:
  */
 static void vtpm_proxy_delete_device(struct proxy_dev *proxy_dev)
 {
+   vtpm_proxy_work_stop(proxy_dev);
+
tpm_chip_unregister(proxy_dev->chip);
 
vtpm_proxy_fops_undo_open(proxy_dev);
@@ -550,11 +614,24 @@ static int 

[PATCH v7 10/10] tpm: Add documentation for the tpm_vtpm device driver

2016-03-11 Thread Stefan Berger
Add documentation for the tpm_vtpm device driver that implements
support for providing TPM functionality to Linux containers.

Parts of this documentation were recycled from the Xen vTPM
device driver documentation.

Update the documentation for the ioctl numbers.

Signed-off-by: Stefan Berger 
CC: linux-kernel@vger.kernel.org
CC: linux-...@vger.kernel.org
CC: linux-...@vger.kernel.org
---
 Documentation/ioctl/ioctl-number.txt |  1 +
 Documentation/tpm/tpm_vtpm_proxy.txt | 71 
 2 files changed, 72 insertions(+)
 create mode 100644 Documentation/tpm/tpm_vtpm_proxy.txt

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 91261a3..7dbec90 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -303,6 +303,7 @@ Code  Seq#(hex) Include FileComments

 0xA0   all linux/sdp/sdp.h Industrial Device Project

+0xA1   0   linux/vtpm_proxy.h  TPM Emulator Proxy Driver
 0xA2   00-0F   arch/tile/include/asm/hardwall.h
 0xA3   80-8F   Port ACLin development:

diff --git a/Documentation/tpm/tpm_vtpm_proxy.txt b/Documentation/tpm/tpm_vtpm_proxy.txt
new file mode 100644
index 000..3a2e3946
--- /dev/null
+++ b/Documentation/tpm/tpm_vtpm_proxy.txt
@@ -0,0 +1,71 @@
+Virtual TPM Proxy Driver for Linux Containers
+
+Authors: Stefan Berger (IBM)
+
+This document describes the virtual Trusted Platform Module (vTPM)
+proxy device driver for Linux containers.
+
+INTRODUCTION
+------------
+
+The goal of this work is to provide TPM functionality to each Linux
+container. This allows programs to interact with a TPM in a container
+the same way they interact with a TPM on the physical system. Each
+container gets its own unique, emulated, software TPM.
+
+
+DESIGN
+------
+
+To make an emulated software TPM available to each container, the container
+management stack needs to create a device pair consisting of a client TPM
+character device /dev/tpmX (with X=0,1,2...) and a 'server side' file
+descriptor. The former is moved into the container by creating a character
+device with the appropriate major and minor numbers while the file descriptor
+is passed to the TPM emulator. Software inside the container can then send
+TPM commands using the character device and the emulator will receive the
+commands via the file descriptor and use it for sending back responses.
+
+To support this, the virtual TPM proxy driver provides a device /dev/vtpmx
+that is used to create device pairs using an ioctl. The ioctl takes as
+an input flags for configuring the device. The flags, for example, indicate
+whether TPM 1.2 or TPM 2 functionality is supported by the TPM emulator.
+The results of the ioctl are the file descriptor for the 'server side'
+as well as the major and minor numbers of the character device that was created.
+In addition, the number of the TPM character device is returned. If, for
+example, /dev/tpm10 was created, the number (dev_num) 10 is returned.
+
+The following is the data structure of the TPM_PROXY_IOC_NEW_DEV ioctl:
+
+struct vtpm_proxy_new_dev {
+   __u32 flags; /* input */
+   __u32 tpm_num;   /* output */
+   __u32 fd;/* output */
+   __u32 major; /* output */
+   __u32 minor; /* output */
+};
+
+Note that if unsupported flags are passed to the device driver, the ioctl will
+fail and errno will be set to EOPNOTSUPP. Similarly, if an unsupported ioctl is
+called on the device driver, the ioctl will fail and errno will be set to
+ENOTTY.
+
+See /usr/include/linux/vtpm_proxy.h for definitions related to the public interface
+of this vTPM device driver.
+
+Once the device has been created, the driver will immediately try to talk
+to the TPM. All commands from the driver can be read from the file descriptor
+returned by the ioctl. The commands should be responded to immediately.
+
+Depending on the version of TPM the following commands will be sent by the
+driver:
+
+- TPM 1.2:
+  - the driver will send a TPM_Startup() command to the TPM emulator
+  - the driver will send commands to read the command durations and
+interface timeouts from the TPM emulator
+- TPM 2:
+  - the driver will send a TPM2_Startup() command to the TPM emulator
+
+The TPM device /dev/tpmX will only appear if all of the relevant commands
+were responded to properly.
-- 
2.4.3



[PATCH v7 08/10] tpm: Proxy driver for supporting multiple emulated TPMs

2016-03-11 Thread Stefan Berger
This patch implements a proxy driver for supporting multiple emulated TPMs
in a system.

The driver implements a device /dev/vtpmx that is used to create
a client device pair /dev/tpmX (e.g., /dev/tpm10) and a server side that
is accessed using a file descriptor returned by an ioctl.
The device /dev/tpmX is the usual TPM device created by the core TPM
driver. Applications or kernel subsystems can send TPM commands to it
and the corresponding server-side file descriptor receives these
commands and delivers them to an emulated TPM.

Signed-off-by: Stefan Berger 
CC: linux-kernel@vger.kernel.org
CC: linux-...@vger.kernel.org
CC: linux-...@vger.kernel.org
---
 drivers/char/tpm/Kconfig  |  10 +
 drivers/char/tpm/Makefile |   1 +
 drivers/char/tpm/tpm_vtpm_proxy.c | 567 ++
 include/uapi/linux/Kbuild |   1 +
 include/uapi/linux/vtpm_proxy.h   |  42 +++
 5 files changed, 621 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_vtpm_proxy.c
 create mode 100644 include/uapi/linux/vtpm_proxy.h

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index 3b84a8b..0eac596 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -122,5 +122,15 @@ config TCG_CRB
  from within Linux.  To compile this driver as a module, choose
  M here; the module will be called tpm_crb.
 
+config TCG_VTPM_PROXY
+   tristate "VTPM Proxy Interface"
+   depends on TCG_TPM
+   ---help---
+ This driver proxies for an emulated TPM (vTPM) running in userspace.
+ A device /dev/vtpmx is provided that creates a device pair
+ /dev/tpmX and a server-side file descriptor on which the vTPM
+ can receive commands.
+
+
 source "drivers/char/tpm/st33zp24/Kconfig"
 endif # TCG_TPM
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 56e8f1f..98de5e6 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -23,3 +23,4 @@ obj-$(CONFIG_TCG_IBMVTPM) += tpm_ibmvtpm.o
 obj-$(CONFIG_TCG_TIS_ST33ZP24) += st33zp24/
 obj-$(CONFIG_TCG_XEN) += xen-tpmfront.o
 obj-$(CONFIG_TCG_CRB) += tpm_crb.o
+obj-$(CONFIG_TCG_VTPM_PROXY) += tpm_vtpm_proxy.o
diff --git a/drivers/char/tpm/tpm_vtpm_proxy.c b/drivers/char/tpm/tpm_vtpm_proxy.c
new file mode 100644
index 000..d73944e
--- /dev/null
+++ b/drivers/char/tpm/tpm_vtpm_proxy.c
@@ -0,0 +1,567 @@
+/*
+ * Copyright (C) 2015, 2016 IBM Corporation
+ *
+ * Author: Stefan Berger 
+ *
+ * Maintained by: 
+ *
+ * Device driver for vTPM (vTPM proxy driver)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "tpm.h"
+
+#define VTPM_PROXY_REQ_COMPLETE_FLAG  BIT(0)
+
+struct proxy_dev {
+   struct tpm_chip *chip;
+
+   u32 flags;   /* public API flags */
+
+   wait_queue_head_t wq;
+
+   struct mutex buf_lock;   /* protect buffer and flags */
+
+   long state;  /* internal state */
+#define STATE_OPENED_FLAG        BIT(0)
+#define STATE_WAIT_RESPONSE_FLAG BIT(1)  /* waiting for emulator response */
+
+   size_t req_len;  /* length of queued TPM request */
+   size_t resp_len; /* length of queued TPM response */
+   u8 buffer[TPM_BUFSIZE];  /* request/response buffer */
+};
+
+
+static void vtpm_proxy_delete_device(struct proxy_dev *proxy_dev);
+
+/*
+ * Functions related to 'server side'
+ */
+
+/**
+ * vtpm_proxy_fops_read - Read TPM commands on 'server side'
+ *
+ * Return value:
+ * Number of bytes read or negative error code
+ */
+static ssize_t vtpm_proxy_fops_read(struct file *filp, char __user *buf,
+   size_t count, loff_t *off)
+{
+   struct proxy_dev *proxy_dev = filp->private_data;
+   size_t len;
+   int sig, rc;
+
+   sig = wait_event_interruptible(proxy_dev->wq, proxy_dev->req_len != 0);
+   if (sig)
+   return -EINTR;
+
+   mutex_lock(&proxy_dev->buf_lock);
+
+   len = proxy_dev->req_len;
+
+   if (count < len) {
+   mutex_unlock(&proxy_dev->buf_lock);
+   pr_debug("Invalid size in recv: count=%zd, req_len=%zd\n",
+count, len);
+   return -EIO;
+   }
+
+   rc = copy_to_user(buf, proxy_dev->buffer, len);
+   memset(proxy_dev->buffer, 0, len);
+   proxy_dev->req_len = 0;
+
+   if (!rc)
+   proxy_dev->state |= STATE_WAIT_RESPONSE_FLAG;
+
+   mutex_unlock(&proxy_dev->buf_lock);
+
+   if (rc)
+   return -EFAULT;
+
+   return len;
+}
+
+/**
+ * vtpm_proxy_fops_write - Write TPM responses on 'server side'
+ *
+ * Return value:
+ * Number of bytes read or negative error value
+ */
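
The server-side read path above can be modelled in plain userspace C. The following is only a sketch under stated assumptions: `struct model_dev`, `model_queue_request()`, `model_server_read()` and `MODEL_BUFSIZE` are hypothetical names, not part of the patch; there is no wait queue or mutex, so an empty queue is reported as -EAGAIN where the driver would sleep; and `memcpy()` stands in for `copy_to_user()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_BUFSIZE 4096          /* assumed size; stands in for TPM_BUFSIZE */
#define MODEL_WAIT_RESPONSE 0x2L    /* mirrors STATE_WAIT_RESPONSE_FLAG, BIT(1) */

/* Hypothetical userspace stand-in for struct proxy_dev. */
struct model_dev {
	long state;
	size_t req_len;
	unsigned char buffer[MODEL_BUFSIZE];
};

/* Queue a request, as the client side of the device pair would. */
static int model_queue_request(struct model_dev *d, const void *req, size_t len)
{
	if (len == 0 || len > sizeof(d->buffer))
		return -EINVAL;
	memcpy(d->buffer, req, len);
	d->req_len = len;
	return 0;
}

/*
 * Model of the server-side read: hand the whole queued request to the
 * caller, or fail with -EIO if the caller's buffer is too small.
 * Returns -EAGAIN where the driver would sleep on the wait queue.
 */
static long model_server_read(struct model_dev *d, void *out, size_t count)
{
	size_t len = d->req_len;

	if (len == 0)
		return -EAGAIN;
	if (count < len)
		return -EIO;            /* the request stays queued */

	memcpy(out, d->buffer, len);
	memset(d->buffer, 0, len);      /* scrub the request after delivery */
	d->req_len = 0;
	d->state |= MODEL_WAIT_RESPONSE;
	return (long)len;
}
```

Note that a short read does not consume the request: the emulator has to retry with a buffer large enough for the whole command.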

[PATCH v7 00/10] Multi-instance vTPM proxy driver

2016-03-11 Thread Stefan Berger
The following series of patches implements a multi-instance vTPM 
proxy driver that can dynamically create TPM 'server' and client device
pairs.

An ioctl on the provided /dev/vtpmx creates a client-side vTPM device
and a server-side file descriptor. The file descriptor must
be passed to a TPM emulator. The device driver will initialize the
emulated TPM using TPM 1.2 or TPM 2 startup commands and it will read
the command durations from the device in case of a TPM 1.2. The choice
of emulated TPM device (1.2 or 2) must be provided with a flag in
the ioctl.

The driver is based on a recent checkout of James Morris's 'next'
branch and uses several recently posted patches from Jason and Jarkko.

   Stefan

v6->v7:
 - Adjusted name of driver to tpm_vtpm_proxy from tpm_vtpm. Adjust function
   names, names of structures, and names of constants.
 - Adjusted IOCTL to use magic 0xa1 rather than the completely used 0xa0.
 - Extended driver documentation and added documentation of ioctl.
 - Moved test program to own project (dropped patch 11).

v5->v6:
 - Adapted errno's for unsupported flags and ioctls following Jason's comments

v4->v5:
 - Introduced different error codes for unsupported flags and ioctls
 - Added documentation patch

Jason Gunthorpe (4):
  tpm: Get rid of chip->pdev
  tpm: Get rid of devname
  tpm: Provide strong locking for device removal
  tpm: Split out the devm stuff from tpmm_chip_alloc

Stefan Berger (6):
  tpm: Get rid of module locking
  tpm: Replace device number bitmap with IDR
  tpm: Introduce TPM_CHIP_FLAG_VIRTUAL
  tpm: Proxy driver for supporting multiple emulated TPMs
  tpm: Initialize TPM and get durations and timeouts
  tpm: Add documentation for the tpm_vtpm device driver

 Documentation/ioctl/ioctl-number.txt |   1 +
 Documentation/tpm/tpm_vtpm_proxy.txt |  71 
 drivers/char/tpm/Kconfig |  10 +
 drivers/char/tpm/Makefile|   1 +
 drivers/char/tpm/tpm-chip.c  | 221 
 drivers/char/tpm/tpm-dev.c   |  15 +-
 drivers/char/tpm/tpm-interface.c |  50 +--
 drivers/char/tpm/tpm-sysfs.c |  22 +-
 drivers/char/tpm/tpm.h   |  30 +-
 drivers/char/tpm/tpm2-cmd.c  |   8 +-
 drivers/char/tpm/tpm_atmel.c |  14 +-
 drivers/char/tpm/tpm_eventlog.c  |   2 +-
 drivers/char/tpm/tpm_eventlog.h  |   2 +-
 drivers/char/tpm/tpm_i2c_atmel.c |  16 +-
 drivers/char/tpm/tpm_i2c_infineon.c  |   6 +-
 drivers/char/tpm/tpm_i2c_nuvoton.c   |  24 +-
 drivers/char/tpm/tpm_infineon.c  |  22 +-
 drivers/char/tpm/tpm_nsc.c   |  20 +-
 drivers/char/tpm/tpm_tis.c   |  18 +-
 drivers/char/tpm/tpm_vtpm_proxy.c| 644 +++
 include/uapi/linux/Kbuild|   1 +
 include/uapi/linux/vtpm_proxy.h  |  42 +++
 22 files changed, 1061 insertions(+), 179 deletions(-)
 create mode 100644 Documentation/tpm/tpm_vtpm_proxy.txt
 create mode 100644 drivers/char/tpm/tpm_vtpm_proxy.c
 create mode 100644 include/uapi/linux/vtpm_proxy.h

-- 
2.4.3



[PATCH v7 06/10] tpm: Replace device number bitmap with IDR

2016-03-11 Thread Stefan Berger
Replace the device number bitmap with IDR. Extend the number of devices we
can create to 64k.
Since an IDR allows us to associate a pointer with an ID, we use this now
to rewrite tpm_chip_find_get() to simply look up the chip pointer by the
given device ID.

Protect the IDR calls with a mutex.
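
The reserve/publish/lookup life cycle described above can be sketched in userspace C. This is an illustration only: `struct id_table` and the `id_*()` helpers are hypothetical stand-ins for the kernel IDR, the table is a tiny fixed array, and the mutex the patch takes around every IDR call is left out because the model is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 8   /* tiny for illustration; the patch allows up to 64k */

/* Hypothetical fixed-size stand-in for the kernel IDR. */
struct id_table {
	int used[MAX_IDS];      /* nonzero once the ID is reserved */
	void *ptr[MAX_IDS];     /* NULL while reserved but not yet published */
};

/* Reserve the lowest free ID with a NULL pointer (cf. idr_alloc() with NULL). */
static int id_reserve(struct id_table *t)
{
	for (int id = 0; id < MAX_IDS; id++) {
		if (!t->used[id]) {
			t->used[id] = 1;
			t->ptr[id] = NULL;
			return id;
		}
	}
	return -1;  /* no free ID */
}

/* Publish or hide the object behind an ID (cf. idr_replace()). */
static void id_publish(struct id_table *t, int id, void *p)
{
	if (id >= 0 && id < MAX_IDS && t->used[id])
		t->ptr[id] = p;
}

/* Release the ID itself (cf. idr_remove()). */
static void id_release(struct id_table *t, int id)
{
	if (id >= 0 && id < MAX_IDS) {
		t->used[id] = 0;
		t->ptr[id] = NULL;
	}
}

/* Look up one ID, or the first published object when id < 0 ("any"). */
static void *id_find(const struct id_table *t, int id)
{
	if (id >= 0)
		return id < MAX_IDS ? t->ptr[id] : NULL;
	for (int i = 0; i < MAX_IDS; i++)
		if (t->ptr[i])
			return t->ptr[i];
	return NULL;
}
```

Reserving with NULL first and publishing later is what lets the number be handed out at allocation time while lookups only see the chip once its character device exists.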

Signed-off-by: Stefan Berger 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 
Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-chip.c  | 84 +---
 drivers/char/tpm/tpm-interface.c |  1 +
 drivers/char/tpm/tpm.h   |  5 +--
 3 files changed, 48 insertions(+), 42 deletions(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 5880377..f62c851 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -29,9 +29,8 @@
 #include "tpm.h"
 #include "tpm_eventlog.h"
 
-static DECLARE_BITMAP(dev_mask, TPM_NUM_DEVICES);
-static LIST_HEAD(tpm_chip_list);
-static DEFINE_SPINLOCK(driver_lock);
+DEFINE_IDR(dev_nums_idr);
+static DEFINE_MUTEX(idr_lock);
 
 struct class *tpm_class;
 dev_t tpm_devt;
@@ -88,20 +87,30 @@ EXPORT_SYMBOL_GPL(tpm_put_ops);
   */
 struct tpm_chip *tpm_chip_find_get(int chip_num)
 {
-   struct tpm_chip *pos, *chip = NULL;
+   struct tpm_chip *chip, *res = NULL;
+   int chip_prev;
+
+   mutex_lock(&idr_lock);
+
+   if (chip_num == TPM_ANY_NUM) {
+   chip_num = 0;
+   do {
+   chip_prev = chip_num;
+   chip = idr_get_next(&dev_nums_idr, &chip_num);
+   if (chip && !tpm_try_get_ops(chip)) {
+   res = chip;
+   break;
+   }
+   } while (chip_prev != chip_num);
+   } else {
+   chip = idr_find_slowpath(&dev_nums_idr, chip_num);
+   if (chip && !tpm_try_get_ops(chip))
+   res = chip;
+   }
 
-   rcu_read_lock();
-   list_for_each_entry_rcu(pos, &tpm_chip_list, list) {
-   if (chip_num != TPM_ANY_NUM && chip_num != pos->dev_num)
-   continue;
+   mutex_unlock(&idr_lock);
 
-   /* rcu prevents chip from being free'd */
-   if (!tpm_try_get_ops(pos))
-   chip = pos;
-   break;
-   }
-   rcu_read_unlock();
-   return chip;
+   return res;
 }
 
 /**
@@ -114,9 +123,10 @@ static void tpm_dev_release(struct device *dev)
 {
struct tpm_chip *chip = container_of(dev, struct tpm_chip, dev);
 
-   spin_lock(&driver_lock);
-   clear_bit(chip->dev_num, dev_mask);
-   spin_unlock(&driver_lock);
+   mutex_lock(&idr_lock);
+   idr_remove(&dev_nums_idr, chip->dev_num);
+   mutex_unlock(&idr_lock);
+
kfree(chip);
 }
 
@@ -142,21 +152,18 @@ struct tpm_chip *tpm_chip_alloc(struct device *dev,
 
mutex_init(&chip->tpm_mutex);
init_rwsem(&chip->ops_sem);
-   INIT_LIST_HEAD(&chip->list);
 
chip->ops = ops;
 
-   spin_lock(&driver_lock);
-   chip->dev_num = find_first_zero_bit(dev_mask, TPM_NUM_DEVICES);
-   spin_unlock(&driver_lock);
-
-   if (chip->dev_num >= TPM_NUM_DEVICES) {
+   mutex_lock(&idr_lock);
+   rc = idr_alloc(&dev_nums_idr, NULL, 0, TPM_NUM_DEVICES, GFP_KERNEL);
+   mutex_unlock(&idr_lock);
+   if (rc < 0) {
dev_err(dev, "No available tpm device numbers\n");
kfree(chip);
-   return ERR_PTR(-ENOMEM);
+   return ERR_PTR(rc);
}
-
-   set_bit(chip->dev_num, dev_mask);
+   chip->dev_num = rc;
 
device_initialize(&chip->dev);
 
@@ -242,19 +249,28 @@ static int tpm_add_char_device(struct tpm_chip *chip)
return rc;
}
 
+   /* Make the chip available. */
+   mutex_lock(&idr_lock);
+   idr_replace(&dev_nums_idr, chip, chip->dev_num);
+   mutex_unlock(&idr_lock);
+
return rc;
 }
 
 static void tpm_del_char_device(struct tpm_chip *chip)
 {
cdev_del(&chip->cdev);
+   device_del(&chip->dev);
+
+   /* Make the chip unavailable. */
+   mutex_lock(&idr_lock);
+   idr_replace(&dev_nums_idr, NULL, chip->dev_num);
+   mutex_unlock(&idr_lock);
 
/* Make the driver uncallable. */
down_write(&chip->ops_sem);
chip->ops = NULL;
up_write(&chip->ops_sem);
-
-   device_del(&chip->dev);
 }
 
 static int tpm1_chip_register(struct tpm_chip *chip)
@@ -309,11 +325,6 @@ int tpm_chip_register(struct tpm_chip *chip)
if (rc)
goto out_err;
 
-   /* Make the chip available. */
-   spin_lock(&driver_lock);
-   list_add_tail_rcu(&chip->list, &tpm_chip_list);
-   spin_unlock(&driver_lock);
-
chip->flags |= TPM_CHIP_FLAG_REGISTERED;
 
if (!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
@@ -350,11 +361,6 @@ void tpm_chip_unregister(struct tpm_chip *chip)
if (!(chip->flags & TPM_CHIP_FLAG_REGISTERED))
return;
 
-   spin_lock(&driver_lock);
-   list_del_rcu(&chip->list);
-   spin_unlock(&driver_lock);
-   

[PATCH v7 04/10] tpm: Get rid of module locking

2016-03-11 Thread Stefan Berger
Now that the tpm core has strong locking around 'ops' it is possible
to remove a TPM driver, module and all, even while user space still
has things like /dev/tpmX open. For consistency and simplicity, drop
the module locking entirely.

The module lock can be dropped since /dev/tpmX holds the reader lock
on 'ops' while using 'ops', and this prevents the module from
unregistering, which needs the writer lock. Once the module has
unregistered, the 'ops' can no longer be found.

Signed-off-by: Stefan Berger 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Jarkko Sakkinen 
Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-chip.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 5793ea1..6636728 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -57,9 +57,6 @@ int tpm_try_get_ops(struct tpm_chip *chip)
if (!chip->ops)
goto out_lock;
 
-   if (!try_module_get(chip->dev.parent->driver->owner))
-   goto out_lock;
-
return 0;
 out_lock:
up_read(&chip->ops_sem);
@@ -77,7 +74,6 @@ EXPORT_SYMBOL_GPL(tpm_try_get_ops);
  */
 void tpm_put_ops(struct tpm_chip *chip)
 {
-   module_put(chip->dev.parent->driver->owner);
up_read(&chip->ops_sem);
put_device(&chip->dev);
 }
@@ -183,7 +179,7 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
goto out;
 
cdev_init(&chip->cdev, &tpm_fops);
-   chip->cdev.owner = dev->driver->owner;
+   chip->cdev.owner = THIS_MODULE;
chip->cdev.kobj.parent = &chip->dev.kobj;
 
rc = devm_add_action(dev, (void (*)(void *)) put_device, &chip->dev);
-- 
2.4.3
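
The argument in the 04/10 commit message (readers of 'ops' keep unregistration out, and after unregistration 'ops' cannot be obtained) can be illustrated with a small single-threaded model. Everything here is a hypothetical sketch: `struct model_chip` and the `model_*()` helpers are not kernel APIs, a plain counter stands in for the read side of `ops_sem`, and "would block" is reported as -1 where a real rwsem writer would sleep.

```c
#include <assert.h>
#include <stddef.h>

struct model_ops { int (*send)(int cmd); };

/* Hypothetical stand-in for struct tpm_chip: only the 'ops' guard is modelled. */
struct model_chip {
	int readers;                  /* holders of the read side of ops_sem */
	const struct model_ops *ops;  /* NULL once the driver has unregistered */
};

/* Take the read side and check that a driver is still attached. */
static int model_try_get_ops(struct model_chip *c)
{
	c->readers++;
	if (!c->ops) {
		c->readers--;
		return -1;    /* the kernel returns a negative errno here */
	}
	return 0;
}

/* Drop the read side again. */
static void model_put_ops(struct model_chip *c)
{
	c->readers--;
}

/*
 * Unregister needs the write side. A real rwsem would sleep until all
 * readers are gone; this single-threaded model reports "would block".
 */
static int model_unregister(struct model_chip *c)
{
	if (c->readers > 0)
		return -1;
	c->ops = NULL;
	return 0;
}

static int echo_send(int cmd) { return cmd; }
static const struct model_ops echo_ops = { echo_send };
```

Because every user of 'ops' goes through the get/put pair, the driver code behind 'ops' cannot go away in the middle of a call, which is why the separate module reference became redundant.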



[PATCH v7 05/10] tpm: Split out the devm stuff from tpmm_chip_alloc

2016-03-11 Thread Stefan Berger
From: Jason Gunthorpe 

tpm_chip_alloc becomes a typical subsystem allocate call.

Signed-off-by: Jason Gunthorpe 
Reviewed-by: Stefan Berger 
Tested-by: Stefan Berger 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 
Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-chip.c | 49 -
 drivers/char/tpm/tpm.h  |  4 +++-
 2 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 6636728..5880377 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -121,17 +121,17 @@ static void tpm_dev_release(struct device *dev)
 }
 
 /**
- * tpmm_chip_alloc() - allocate a new struct tpm_chip instance
- * @dev: device to which the chip is associated
+ * tpm_chip_alloc() - allocate a new struct tpm_chip instance
+ * @pdev: device to which the chip is associated
+ *At this point pdev must be initialized, but does not have to
+ *be registered
  * @ops: struct tpm_class_ops instance
  *
  * Allocates a new struct tpm_chip instance and assigns a free
- * device number for it. Caller does not have to worry about
- * freeing the allocated resources. When the devices is removed
- * devres calls tpmm_chip_remove() to do the job.
+ * device number for it. Must be paired with put_device(&chip->dev).
  */
-struct tpm_chip *tpmm_chip_alloc(struct device *dev,
-const struct tpm_class_ops *ops)
+struct tpm_chip *tpm_chip_alloc(struct device *dev,
+   const struct tpm_class_ops *ops)
 {
struct tpm_chip *chip;
int rc;
@@ -160,8 +160,6 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
 
device_initialize(&chip->dev);
 
-   dev_set_drvdata(dev, chip);
-
chip->dev.class = tpm_class;
chip->dev.release = tpm_dev_release;
chip->dev.parent = dev;
@@ -182,17 +180,40 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
chip->cdev.owner = THIS_MODULE;
chip->cdev.kobj.parent = &chip->dev.kobj;
 
-   rc = devm_add_action(dev, (void (*)(void *)) put_device, &chip->dev);
+   return chip;
+
+out:
+   put_device(&chip->dev);
+   return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(tpm_chip_alloc);
+
+/**
+ * tpmm_chip_alloc() - allocate a new struct tpm_chip instance
+ * @pdev: parent device to which the chip is associated
+ * @ops: struct tpm_class_ops instance
+ *
+ * Same as tpm_chip_alloc except devm is used to do the put_device
+ */
+struct tpm_chip *tpmm_chip_alloc(struct device *pdev,
+const struct tpm_class_ops *ops)
+{
+   struct tpm_chip *chip;
+   int rc;
+
+   chip = tpm_chip_alloc(pdev, ops);
+   if (IS_ERR(chip))
+   return chip;
+
+   rc = devm_add_action(pdev, (void (*)(void *)) put_device, &chip->dev);
if (rc) {
put_device(&chip->dev);
return ERR_PTR(rc);
}
 
-   return chip;
+   dev_set_drvdata(pdev, chip);
 
-out:
-   put_device(&chip->dev);
-   return ERR_PTR(rc);
+   return chip;
 }
 EXPORT_SYMBOL_GPL(tpmm_chip_alloc);
 
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index c6376b1..5fcf788 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -511,7 +511,9 @@ struct tpm_chip *tpm_chip_find_get(int chip_num);
 __must_check int tpm_try_get_ops(struct tpm_chip *chip);
 void tpm_put_ops(struct tpm_chip *chip);
 
-extern struct tpm_chip *tpmm_chip_alloc(struct device *dev,
+extern struct tpm_chip *tpm_chip_alloc(struct device *dev,
+  const struct tpm_class_ops *ops);
+extern struct tpm_chip *tpmm_chip_alloc(struct device *pdev,
   const struct tpm_class_ops *ops);
 extern int tpm_chip_register(struct tpm_chip *chip);
 extern void tpm_chip_unregister(struct tpm_chip *chip);
-- 
2.4.3
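
The split in 05/10 between a plain allocator and a managed wrapper follows a common pattern, sketched here in userspace C. The names are all hypothetical (`struct obj`, `struct parent`, `obj_alloc()`, `obj_alloc_managed()`, `parent_release()`); the action list is only loosely modelled on devres, and `live_objs` exists purely so the example can be checked.

```c
#include <assert.h>
#include <stdlib.h>

struct obj { int refs; };

/* Hypothetical parent that runs registered cleanup actions on release,
 * loosely modelled on devres. */
#define MAX_ACTIONS 4
struct parent {
	int n;
	void (*fn[MAX_ACTIONS])(void *);
	void *arg[MAX_ACTIONS];
	void *drvdata;
};

static int live_objs;   /* counts objects not yet freed, for the demo only */

/* Drop one reference; free the object when the last one goes. */
static void obj_put(void *p)
{
	struct obj *o = p;

	if (--o->refs == 0) {
		live_objs--;
		free(o);
	}
}

/* Plain allocate: the caller owns one reference and must obj_put() it. */
static struct obj *obj_alloc(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->refs = 1;
	live_objs++;
	return o;
}

static int parent_add_action(struct parent *p, void (*fn)(void *), void *arg)
{
	if (p->n == MAX_ACTIONS)
		return -1;
	p->fn[p->n] = fn;
	p->arg[p->n] = arg;
	p->n++;
	return 0;
}

/* Managed allocate: same object, but the parent drops the reference. */
static struct obj *obj_alloc_managed(struct parent *p)
{
	struct obj *o = obj_alloc();

	if (!o)
		return NULL;
	if (parent_add_action(p, obj_put, o)) {
		obj_put(o);
		return NULL;
	}
	p->drvdata = o;
	return o;
}

/* Run cleanup actions in reverse order of registration. */
static void parent_release(struct parent *p)
{
	while (p->n > 0) {
		p->n--;
		p->fn[p->n](p->arg[p->n]);
	}
	p->drvdata = NULL;
}
```

The plain allocator is the piece a caller without a parent device needs, since there is nothing to hang the cleanup action on in that case.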



[PATCH v7 07/10] tpm: Introduce TPM_CHIP_FLAG_VIRTUAL

2016-03-11 Thread Stefan Berger
Introduce TPM_CHIP_FLAG_VIRTUAL to be used when the chip device has no
parent device. Also adapt tpm_chip_alloc so that it can be called with
parent device being NULL.

Signed-off-by: Stefan Berger 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Jarkko Sakkinen 
Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-chip.c  |  9 +
 drivers/char/tpm/tpm-sysfs.c | 15 +++
 drivers/char/tpm/tpm.h   |  5 +++--
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index f62c851..e4e1fad 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -170,9 +170,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *dev,
chip->dev.class = tpm_class;
chip->dev.release = tpm_dev_release;
chip->dev.parent = dev;
-#ifdef CONFIG_ACPI
chip->dev.groups = chip->groups;
-#endif
 
if (chip->dev_num == 0)
chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
@@ -183,6 +181,9 @@ struct tpm_chip *tpm_chip_alloc(struct device *dev,
if (rc)
goto out;
 
+   if (!dev)
+   chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
+
cdev_init(&chip->cdev, &tpm_fops);
chip->cdev.owner = THIS_MODULE;
chip->cdev.kobj.parent = &chip->dev.kobj;
@@ -327,7 +328,7 @@ int tpm_chip_register(struct tpm_chip *chip)
 
chip->flags |= TPM_CHIP_FLAG_REGISTERED;
 
-   if (!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
+   if (!(chip->flags & (TPM_CHIP_FLAG_TPM2 | TPM_CHIP_FLAG_VIRTUAL))) {
rc = __compat_only_sysfs_link_entry_to_kobj(
&chip->dev.parent->kobj, &chip->dev.kobj, "ppi");
if (rc && rc != -ENOENT) {
@@ -361,7 +362,7 @@ void tpm_chip_unregister(struct tpm_chip *chip)
if (!(chip->flags & TPM_CHIP_FLAG_REGISTERED))
return;
 
-   if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
+   if (!(chip->flags & (TPM_CHIP_FLAG_TPM2 | TPM_CHIP_FLAG_VIRTUAL)))
sysfs_remove_link(&chip->dev.parent->kobj, "ppi");
 
tpm1_chip_unregister(chip);
diff --git a/drivers/char/tpm/tpm-sysfs.c b/drivers/char/tpm/tpm-sysfs.c
index 34e7fc7..9ee0f4d 100644
--- a/drivers/char/tpm/tpm-sysfs.c
+++ b/drivers/char/tpm/tpm-sysfs.c
@@ -283,9 +283,15 @@ static const struct attribute_group tpm_dev_group = {
 
 int tpm_sysfs_add_device(struct tpm_chip *chip)
 {
-   int err;
-   err = sysfs_create_group(&chip->dev.parent->kobj,
-&tpm_dev_group);
+   int err = 0;
+
+   if (!(chip->flags & TPM_CHIP_FLAG_VIRTUAL))
+   err = sysfs_create_group(&chip->dev.parent->kobj,
+&tpm_dev_group);
+   else {
+   dev_set_drvdata(&chip->dev, chip);
+   chip->groups[chip->groups_cnt++] = &tpm_dev_group;
+   }
 
if (err)
dev_err(&chip->dev,
@@ -300,5 +306,6 @@ void tpm_sysfs_del_device(struct tpm_chip *chip)
 * synchronizes this removal so that no callbacks are running or can
 * run again
 */
-   sysfs_remove_group(&chip->dev.parent->kobj, &tpm_dev_group);
+   if (!(chip->flags & TPM_CHIP_FLAG_VIRTUAL))
+   sysfs_remove_group(&chip->dev.parent->kobj, &tpm_dev_group);
 }
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 928b47f..f197eef 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -164,6 +164,7 @@ struct tpm_vendor_specific {
 enum tpm_chip_flags {
TPM_CHIP_FLAG_REGISTERED= BIT(0),
TPM_CHIP_FLAG_TPM2  = BIT(1),
+   TPM_CHIP_FLAG_VIRTUAL   = BIT(2),
 };
 
 struct tpm_chip {
@@ -189,9 +190,9 @@ struct tpm_chip {
 
struct dentry **bios_dir;
 
-#ifdef CONFIG_ACPI
-   const struct attribute_group *groups[2];
+   const struct attribute_group *groups[3];
unsigned int groups_cnt;
+#ifdef CONFIG_ACPI
acpi_handle acpi_dev_handle;
char ppi_version[TPM_PPI_VERSION_LEN + 1];
 #endif /* CONFIG_ACPI */
-- 
2.4.3
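
How the new flag steers the code paths in 07/10 can be shown with a short userspace sketch. The names (`struct vchip`, `VCHIP_FLAG_*`, `vchip_init()`, `vchip_wants_ppi_link()`, `vchip_sysfs_home()`) are hypothetical and only mirror the bit layout and the conditions visible in the diff; no sysfs calls are made.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BIT(n) (1U << (n))

/* Same bit positions as enum tpm_chip_flags in the patch. */
enum vchip_flags {
	VCHIP_FLAG_REGISTERED = BIT(0),
	VCHIP_FLAG_TPM2       = BIT(1),
	VCHIP_FLAG_VIRTUAL    = BIT(2),
};

/* Hypothetical stand-in for struct tpm_chip. */
struct vchip {
	unsigned int flags;
	const void *parent;   /* NULL for a chip without a parent device */
};

/* A chip allocated without a parent is marked virtual. */
static void vchip_init(struct vchip *c, const void *parent)
{
	c->parent = parent;
	c->flags = 0;
	if (!parent)
		c->flags |= VCHIP_FLAG_VIRTUAL;
}

/* The "ppi" link is created under the parent, so it is skipped for
 * TPM 2 chips and for chips that have no parent at all. */
static int vchip_wants_ppi_link(const struct vchip *c)
{
	return !(c->flags & (VCHIP_FLAG_TPM2 | VCHIP_FLAG_VIRTUAL));
}

/* Where the sysfs attribute group ends up: on the parent for ordinary
 * chips, on the chip's own device for virtual ones. */
static const char *vchip_sysfs_home(const struct vchip *c)
{
	return (c->flags & VCHIP_FLAG_VIRTUAL) ? "chip" : "parent";
}
```

Testing both bits with one mask keeps every dereference of the parent pointer behind a single condition.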



[PATCH v7 02/10] tpm: Get rid of devname

2016-03-11 Thread Stefan Berger
From: Jason Gunthorpe 

Now that we have a proper struct device just use dev_name() to
access this value instead of keeping two copies.

Signed-off-by: Jason Gunthorpe 
Signed-off-by: Stefan Berger 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 
Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-chip.c| 18 +++---
 drivers/char/tpm/tpm.h |  1 -
 drivers/char/tpm/tpm_eventlog.c|  2 +-
 drivers/char/tpm/tpm_eventlog.h|  2 +-
 drivers/char/tpm/tpm_i2c_nuvoton.c |  2 +-
 drivers/char/tpm/tpm_tis.c |  2 +-
 6 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 12829dd..c21d81c 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -111,7 +111,7 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
 
set_bit(chip->dev_num, dev_mask);
 
-   scnprintf(chip->devname, sizeof(chip->devname), "tpm%d", chip->dev_num);
+   device_initialize(&chip->dev);
 
dev_set_drvdata(dev, chip);
 
@@ -127,9 +127,9 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
else
chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
 
-   dev_set_name(&chip->dev, "%s", chip->devname);
-
-   device_initialize(&chip->dev);
+   rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
+   if (rc)
+   goto out;
 
cdev_init(>cdev, _fops);
chip->cdev.owner = dev->driver->owner;
@@ -142,6 +142,10 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
}
 
return chip;
+
+out:
+   put_device(>dev);
+   return ERR_PTR(rc);
 }
 EXPORT_SYMBOL_GPL(tpmm_chip_alloc);
 
@@ -153,7 +157,7 @@ static int tpm_add_char_device(struct tpm_chip *chip)
if (rc) {
dev_err(>dev,
"unable to cdev_add() %s, major %d, minor %d, err=%d\n",
-   chip->devname, MAJOR(chip->dev.devt),
+   dev_name(>dev), MAJOR(chip->dev.devt),
MINOR(chip->dev.devt), rc);
 
return rc;
@@ -163,7 +167,7 @@ static int tpm_add_char_device(struct tpm_chip *chip)
if (rc) {
dev_err(>dev,
"unable to device_register() %s, major %d, minor %d, 
err=%d\n",
-   chip->devname, MAJOR(chip->dev.devt),
+   dev_name(>dev), MAJOR(chip->dev.devt),
MINOR(chip->dev.devt), rc);
 
cdev_del(>cdev);
@@ -190,7 +194,7 @@ static int tpm1_chip_register(struct tpm_chip *chip)
if (rc)
return rc;
 
-   chip->bios_dir = tpm_bios_log_setup(chip->devname);
+   chip->bios_dir = tpm_bios_log_setup(dev_name(>dev));
 
return 0;
 }
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 9c9be6c..5d33ba5 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -174,7 +174,6 @@ struct tpm_chip {
unsigned int flags;
 
int dev_num;/* /dev/tpm# */
-   char devname[7];
unsigned long is_open;  /* only one allowed */
int time_expired;
 
diff --git a/drivers/char/tpm/tpm_eventlog.c b/drivers/char/tpm/tpm_eventlog.c
index 4e6940a..e722886 100644
--- a/drivers/char/tpm/tpm_eventlog.c
+++ b/drivers/char/tpm/tpm_eventlog.c
@@ -403,7 +403,7 @@ static int is_bad(void *p)
return 0;
 }
 
-struct dentry **tpm_bios_log_setup(char *name)
+struct dentry **tpm_bios_log_setup(const char *name)
 {
struct dentry **ret = NULL, *tpm_dir, *bin_file, *ascii_file;
 
diff --git a/drivers/char/tpm/tpm_eventlog.h b/drivers/char/tpm/tpm_eventlog.h
index 267bfbd..cc9672f 100644
--- a/drivers/char/tpm/tpm_eventlog.h
+++ b/drivers/char/tpm/tpm_eventlog.h
@@ -77,7 +77,7 @@ int read_log(struct tpm_bios_log *log);
 
 #if defined(CONFIG_TCG_IBMVTPM) || defined(CONFIG_TCG_IBMVTPM_MODULE) || \
defined(CONFIG_ACPI)
-extern struct dentry **tpm_bios_log_setup(char *);
+extern struct dentry **tpm_bios_log_setup(const char *);
 extern void tpm_bios_log_teardown(struct dentry **);
 #else
 static inline struct dentry **tpm_bios_log_setup(char *name)
diff --git a/drivers/char/tpm/tpm_i2c_nuvoton.c 
b/drivers/char/tpm/tpm_i2c_nuvoton.c
index a1e1474..d61d43f 100644
--- a/drivers/char/tpm/tpm_i2c_nuvoton.c
+++ b/drivers/char/tpm/tpm_i2c_nuvoton.c
@@ -560,7 +560,7 @@ static int i2c_nuvoton_probe(struct i2c_client *client,
rc = devm_request_irq(dev, chip->vendor.irq,
  i2c_nuvoton_int_handler,
  IRQF_TRIGGER_LOW,
- chip->devname,
+ dev_name(>dev),
  chip);
   

[PATCH v7 02/10] tpm: Get rid of devname

2016-03-11 Thread Stefan Berger
From: Jason Gunthorpe 

Now that we have a proper struct device just use dev_name() to
access this value instead of keeping two copies.

Signed-off-by: Jason Gunthorpe 
Signed-off-by: Stefan Berger 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 
Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-chip.c| 18 +++---
 drivers/char/tpm/tpm.h |  1 -
 drivers/char/tpm/tpm_eventlog.c|  2 +-
 drivers/char/tpm/tpm_eventlog.h|  2 +-
 drivers/char/tpm/tpm_i2c_nuvoton.c |  2 +-
 drivers/char/tpm/tpm_tis.c |  2 +-
 6 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 12829dd..c21d81c 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -111,7 +111,7 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
 
set_bit(chip->dev_num, dev_mask);
 
-   scnprintf(chip->devname, sizeof(chip->devname), "tpm%d", chip->dev_num);
+   device_initialize(&chip->dev);
 
dev_set_drvdata(dev, chip);
 
@@ -127,9 +127,9 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
else
chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
 
-   dev_set_name(&chip->dev, "%s", chip->devname);
-
-   device_initialize(&chip->dev);
+   rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
+   if (rc)
+   goto out;
 
cdev_init(&chip->cdev, &tpm_fops);
chip->cdev.owner = dev->driver->owner;
@@ -142,6 +142,10 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev,
}
 
return chip;
+
+out:
+   put_device(&chip->dev);
+   return ERR_PTR(rc);
 }
 EXPORT_SYMBOL_GPL(tpmm_chip_alloc);
 
@@ -153,7 +157,7 @@ static int tpm_add_char_device(struct tpm_chip *chip)
if (rc) {
dev_err(&chip->dev,
"unable to cdev_add() %s, major %d, minor %d, err=%d\n",
-   chip->devname, MAJOR(chip->dev.devt),
+   dev_name(&chip->dev), MAJOR(chip->dev.devt),
MINOR(chip->dev.devt), rc);
 
return rc;
@@ -163,7 +167,7 @@ static int tpm_add_char_device(struct tpm_chip *chip)
if (rc) {
dev_err(&chip->dev,
"unable to device_register() %s, major %d, minor %d, err=%d\n",
-   chip->devname, MAJOR(chip->dev.devt),
+   dev_name(&chip->dev), MAJOR(chip->dev.devt),
MINOR(chip->dev.devt), rc);
 
cdev_del(&chip->cdev);
@@ -190,7 +194,7 @@ static int tpm1_chip_register(struct tpm_chip *chip)
if (rc)
return rc;
 
-   chip->bios_dir = tpm_bios_log_setup(chip->devname);
+   chip->bios_dir = tpm_bios_log_setup(dev_name(&chip->dev));
 
return 0;
 }
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 9c9be6c..5d33ba5 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -174,7 +174,6 @@ struct tpm_chip {
unsigned int flags;
 
int dev_num;/* /dev/tpm# */
-   char devname[7];
unsigned long is_open;  /* only one allowed */
int time_expired;
 
diff --git a/drivers/char/tpm/tpm_eventlog.c b/drivers/char/tpm/tpm_eventlog.c
index 4e6940a..e722886 100644
--- a/drivers/char/tpm/tpm_eventlog.c
+++ b/drivers/char/tpm/tpm_eventlog.c
@@ -403,7 +403,7 @@ static int is_bad(void *p)
return 0;
 }
 
-struct dentry **tpm_bios_log_setup(char *name)
+struct dentry **tpm_bios_log_setup(const char *name)
 {
struct dentry **ret = NULL, *tpm_dir, *bin_file, *ascii_file;
 
diff --git a/drivers/char/tpm/tpm_eventlog.h b/drivers/char/tpm/tpm_eventlog.h
index 267bfbd..cc9672f 100644
--- a/drivers/char/tpm/tpm_eventlog.h
+++ b/drivers/char/tpm/tpm_eventlog.h
@@ -77,7 +77,7 @@ int read_log(struct tpm_bios_log *log);
 
 #if defined(CONFIG_TCG_IBMVTPM) || defined(CONFIG_TCG_IBMVTPM_MODULE) || \
defined(CONFIG_ACPI)
-extern struct dentry **tpm_bios_log_setup(char *);
+extern struct dentry **tpm_bios_log_setup(const char *);
 extern void tpm_bios_log_teardown(struct dentry **);
 #else
 static inline struct dentry **tpm_bios_log_setup(char *name)
diff --git a/drivers/char/tpm/tpm_i2c_nuvoton.c b/drivers/char/tpm/tpm_i2c_nuvoton.c
index a1e1474..d61d43f 100644
--- a/drivers/char/tpm/tpm_i2c_nuvoton.c
+++ b/drivers/char/tpm/tpm_i2c_nuvoton.c
@@ -560,7 +560,7 @@ static int i2c_nuvoton_probe(struct i2c_client *client,
rc = devm_request_irq(dev, chip->vendor.irq,
  i2c_nuvoton_int_handler,
  IRQF_TRIGGER_LOW,
- chip->devname,
+ dev_name(&chip->dev),
  chip);
if (rc) {
dev_err(dev, "%s() Unable to request irq: %d for use\n",
diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 0a9aee9..eed3bf5 100644

Re: [ANNOUNCE] v4.4.4-rt11

2016-03-11 Thread Jean-Denis Girard
Hi,

On 11/03/2016 07:14, Sebastian Andrzej Siewior wrote:
> This looks promising. Can you check if IRQ 16 is used by another
> peripheral? While we need to get rid of the warning, I am curious if it
> works on boards which share the timer with UART or something.

Here are the interrupts, I hope this is what you want:
jdg@arietta:~$ cat /proc/interrupts
           CPU0
 16:     145825  atmel-aic   1 Level     at91_rtc, ttyS0
 17:   45661872  atmel-aic  17 Level     tc_clkevt
 18:      72014  atmel-aic  20 Level     at_hdmac
 19:          0  atmel-aic  21 Level     at_hdmac
 23:    2541144  atmel-aic  12 Level     f0008000.mmc
 26:          0  atmel-aic  10 Level     f8014000.i2c
 27:          0  atmel-aic  19 Level     at91_adc
 28:          0  atmel-aic  14 Level     f0004000.spi
 31:          1  atmel-aic  22 Level     ehci_hcd:usb1, ohci_hcd:usb2
Err:          0


The Arietta has been working fine for a couple of days with this kernel:
jdg@arietta:~$ uname -r
4.4.4-rt11
jdg@arietta:~$ uptime
 14:30:06 up 2 days,  4:04,  1 user,  load average: 0.00, 0.02, 0.05


Thanks,
-- 
Jean-Denis Girard

SysNux                   Linux systems in French Polynesia
http://www.sysnux.pf/ Tél: +689 40.50.10.40 / GSM: +689 87.79.75.27


[PATCH] cpufreq: Do not schedule policy update work in cpufreq_resume()

2016-03-11 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

cpufreq_resume() attempts to resync the current frequency with
policy->cur for the first online CPU, but first it does that after
restarting governors for all active policies (which means that this
is racy with respect to whatever the governors do) and second it
already is too late for that when cpufreq_resume() is called (that
happens after invoking ->resume callbacks for all devices in the
system).

Also it doesn't make sense to do that for one CPU only in any case,
because the other CPUs in the system need not share the policy with
it and their policy->cur may be out of sync as well in principle.

For the above reasons, drop the part in question from cpufreq_resume().

Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpufreq/cpufreq.c |   11 ---
 1 file changed, 11 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1593,17 +1593,6 @@ void cpufreq_resume(void)
   __func__, policy);
}
}
-
-   /*
-* schedule call cpufreq_update_policy() for first-online CPU, as that
-* wouldn't be hotplugged-out on suspend. It will verify that the
-* current freq is in sync with what we believe it to be.
-*/
-   policy = cpufreq_cpu_get_raw(cpumask_first(cpu_online_mask));
-   if (WARN_ON(!policy))
-   return;
-
-   schedule_work(&policy->update);
 }
 
 /**



Re: [v5] powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)

2016-03-11 Thread Scott Wood
On Mon, Aug 03, 2015 at 11:14:10AM +0300, Igal.Liberman wrote:
> From: Igal Liberman 
> 
> Describe the PHY topology for all configurations supported by each board
> 
> Based on prior work by Andy Fleming 
> 
> Signed-off-by: Shruti Kanetkar 
> Signed-off-by: Emil Medve 
> Signed-off-by: Igal Liberman 
> ---
> 
> Depends on the following patch set:
>   https://patchwork.ozlabs.org/patch/503107/
>   https://patchwork.ozlabs.org/patch/503108/
> 
> v4 ---> v5:
>   - Correct "Signed-off-by" order
> 
> v3 ---> v4:
>   - Added T1024 support
> 
> v2 ---> v3:
>   - Fixed incorrect E-Mail address (signed-off-by)
> 
> v1 ---> v2
>   - Remove 'Change-Id'
> 
>  arch/powerpc/boot/dts/fsl/b4860qds.dts|   60 -
>  arch/powerpc/boot/dts/fsl/b4qds.dtsi  |   51 -
>  arch/powerpc/boot/dts/fsl/p1023rdb.dts|   24 +-
>  arch/powerpc/boot/dts/fsl/p2041rdb.dts|   92 +++-
>  arch/powerpc/boot/dts/fsl/p3041ds.dts |  112 -
>  arch/powerpc/boot/dts/fsl/p4080ds.dts |  184 ++-
>  arch/powerpc/boot/dts/fsl/p5020ds.dts |  112 -
>  arch/powerpc/boot/dts/fsl/p5040ds.dts |  234 ++-
>  arch/powerpc/boot/dts/fsl/t1023rdb.dts|   41 
>  arch/powerpc/boot/dts/fsl/t1024rdb.dts|   45 
>  arch/powerpc/boot/dts/fsl/t1040rdb.dts|   32 ++-
>  arch/powerpc/boot/dts/fsl/t1042rdb.dts|   30 ++-
>  arch/powerpc/boot/dts/fsl/t1042rdb_pi.dts |   18 +-
>  arch/powerpc/boot/dts/fsl/t104xqds.dtsi   |  178 ++-
>  arch/powerpc/boot/dts/fsl/t104xrdb.dtsi   |   33 ++-
>  arch/powerpc/boot/dts/fsl/t2080qds.dts|  158 -
>  arch/powerpc/boot/dts/fsl/t2080rdb.dts|   67 +-
>  arch/powerpc/boot/dts/fsl/t2081qds.dts|  221 +-
>  arch/powerpc/boot/dts/fsl/t4240qds.dts|  400 
> -
>  arch/powerpc/boot/dts/fsl/t4240rdb.dts|  149 +++-
>  20 files changed, 2221 insertions(+), 20 deletions(-)

p1023rdb removed when applying, as it has dtc errors due to missing base
fman support.

-Scott


RE: [PATCH V6 0/6] Intel memory b/w monitoring support

2016-03-11 Thread Luck, Tony
Some tracing printk() show that we are calling update_sample() with totally 
bogus arguments.

There are a few good calls, then I see rmid=-380863112 evt_type=-30689 first=0

That turns into a wild vrmid, and we fault accessing mbm_current->prev_msr

-Tony




Re: [PATCH 4.4 34/74] arm64: vmemmap: use virtual projection of linear region

2016-03-11 Thread Ard Biesheuvel
On 8 March 2016 at 20:45, Ard Biesheuvel  wrote:
> On 8 March 2016 at 20:44, Greg Kroah-Hartman  
> wrote:
>> On Tue, Mar 08, 2016 at 05:40:14PM +0700, Ard Biesheuvel wrote:
>>> On 8 March 2016 at 07:02, Greg Kroah-Hartman  
>>> wrote:
>>> > 4.4-stable review patch.  If anyone has any objections, please let me 
>>> > know.
>>> >
>>>
>>> Please hold off on this one. We are seeing some breakage on 64k pages 
>>> systems
>>
>> If this problem is also in Linus's tree, I'd like to keep it in to keep
>> things "bug compatible".  Please let me know what fix that I should
>> apply to resolve this.
>>
>
> I am about to send out the patch that should fix this, so I will put you on 
> cc.
>

Not sure what happened here, but this patch is in 4.4-stable now, but
the fix is not.


Re: multipath: I/O hanging forever

2016-03-11 Thread Ming Lei
On Fri, 11 Mar 2016 15:24:33 -0700
Andrea Righi  wrote:

> On Sat, Mar 05, 2016 at 08:31:03PM -0900, Kent Overstreet wrote:
> > On Fri, Mar 04, 2016 at 10:30:44AM -0700, Andrea Righi wrote:
> > > On Sun, Feb 28, 2016 at 08:46:16PM -0700, Andrea Righi wrote:
> > > > On Sun, Feb 28, 2016 at 06:53:33PM -0700, Andrea Righi wrote:
> > > > ... 
> > > > > I'm using 4.5.0-rc5+, from Linus' git. I'll try to do a git bisect
> > > > > later, I'm pretty sure this problem has been introduced recently 
> > > > > (i.e.,
> > > > > I've never seen this issue with 4.1.x).
> > > > 
> > > > I confirm, just tested kernel 4.1 and this problem doesn't happen.
> > > 
> > > Alright, I had some spare time to bisect this problem and I found that
> > > the commit that introduced this issue is c66a14d.
> > > 
> > > So, I tried to revert the commit (with some changes to fix conflicts and
> > > ABI changes) and now multipath seems to work fine for me (no hung task).
> > 
> > Is it hanging on first IO, first large IO, or just randomly?
> 
> It's always the very first O_DIRECT I/O, in general the task gets stuck
> in do_blockdev_direct_IO().

I can reproduce the issue too, and it looks like an MD issue rather than a
block one. Andrea, could you try the following patch to see if it fixes your
issue?

---
From 43fc9c221e53c64f2df7c100c77cc25c4a98c607 Mon Sep 17 00:00:00 2001
From: Ming Lei 
Date: Sat, 12 Mar 2016 09:29:40 +0800
Subject: [PATCH] md: multipath: don't hardcopy bio in .make_request path

Inside multipath_make_request(), multipath maps the incoming
bio into low level device's bio, but it is totally wrong to
copy the bio into mapped bio via '*mapped_bio = *bio'. For
example, .__bi_remaining is kept in the copy, especially if
the incoming bio is chained to via bio splitting, so .bi_end_io
can't be called for the mapped bio at all in the completing path
in this kind of situation.

This patch fixes the issue by using clone style.

Signed-off-by: Ming Lei 
---
 drivers/md/multipath.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
index 0a72ab6..dd483bb 100644
--- a/drivers/md/multipath.c
+++ b/drivers/md/multipath.c
@@ -129,7 +129,9 @@ static void multipath_make_request(struct mddev *mddev, struct bio * bio)
}
multipath = conf->multipaths + mp_bh->path;
 
-   mp_bh->bio = *bio;
+   bio_init(&mp_bh->bio);
+   __bio_clone_fast(&mp_bh->bio, bio);
+
mp_bh->bio.bi_iter.bi_sector += multipath->rdev->data_offset;
mp_bh->bio.bi_bdev = multipath->rdev->bdev;
mp_bh->bio.bi_rw |= REQ_FAILFAST_TRANSPORT;
-- 
1.9.1


Thanks,
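
To illustrate the failure mode described in the commit message above, here
is a minimal userspace sketch in C. It is not kernel code: struct toy_bio
and the toy_* helpers are hypothetical stand-ins that model only a payload
field and a __bi_remaining-like counter. It assumes, as the commit message
says, that a chained bio carries extra completion state, and it shows that
a plain struct assignment inherits that state while a clone-style copy
starts from a clean one.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct bio; only what the illustration needs. */
struct toy_bio {
	unsigned long sector;	/* payload description, safe to copy */
	int remaining;		/* completion bookkeeping, like __bi_remaining */
	bool chained;		/* set when another bio is chained to this one */
	int completions;	/* how many times the end_io step actually ran */
};

/* Put a bio into a clean state with a single pending completion. */
static void toy_bio_init(struct toy_bio *bio)
{
	memset(bio, 0, sizeof(*bio));
	bio->remaining = 1;
}

/* Record one extra pending completion, as bio splitting would. */
static void toy_bio_chain(struct toy_bio *parent)
{
	assert(parent->remaining > 0);
	parent->chained = true;
	parent->remaining++;
}

/* Clone style: reinitialize the target, then copy only payload fields. */
static void toy_bio_clone(struct toy_bio *dst, const struct toy_bio *src)
{
	toy_bio_init(dst);
	dst->sector = src->sector;
}

/*
 * Drop one reference. The completion step runs only when a chained bio
 * has no references left; returns true if it ran.
 */
static bool toy_bio_endio(struct toy_bio *bio)
{
	if (bio->chained && --bio->remaining > 0)
		return false;
	bio->completions++;
	return true;
}
```

With an incoming bio that has been through toy_bio_chain(), calling
toy_bio_endio() on a struct-assigned copy returns false and the completion
step never runs, whereas the same call on a toy_bio_clone() copy completes
at once.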


Re: [PATCH v1 09/19] zsmalloc: keep max_object in size_class

2016-03-11 Thread xuyiping



On 2016/3/11 15:30, Minchan Kim wrote:

Every zspage in a size_class has same number of max objects so
we could move it to a size_class.

Signed-off-by: Minchan Kim 
---
  mm/zsmalloc.c | 29 ++---
  1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index b4fb11831acb..ca663c82c1fc 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -32,8 +32,6 @@
   *page->freelist: points to the first free object in zspage.
   *Free objects are linked together using in-place
   *metadata.
- * page->objects: maximum number of objects we can store in this
- * zspage (class->zspage_order * PAGE_SIZE / class->size)
   *page->lru: links together first pages of various zspages.
   *Basically forming list of zspages in a fullness group.
   *page->mapping: class index and fullness group of the zspage
@@ -211,6 +209,7 @@ struct size_class {
 * of ZS_ALIGN.
 */
int size;
+   int objs_per_zspage;
unsigned int index;

struct zs_size_stat stats;
@@ -622,21 +621,22 @@ static inline void zs_pool_stat_destroy(struct zs_pool *pool)
   * the pool (not yet implemented). This function returns fullness
   * status of the given page.
   */
-static enum fullness_group get_fullness_group(struct page *first_page)
+static enum fullness_group get_fullness_group(struct size_class *class,
+   struct page *first_page)
  {
-   int inuse, max_objects;
+   int inuse, objs_per_zspage;
enum fullness_group fg;

VM_BUG_ON_PAGE(!is_first_page(first_page), first_page);

inuse = first_page->inuse;
-   max_objects = first_page->objects;
+   objs_per_zspage = class->objs_per_zspage;

if (inuse == 0)
fg = ZS_EMPTY;
-   else if (inuse == max_objects)
+   else if (inuse == objs_per_zspage)
fg = ZS_FULL;
-   else if (inuse <= 3 * max_objects / fullness_threshold_frac)
+   else if (inuse <= 3 * objs_per_zspage / fullness_threshold_frac)
fg = ZS_ALMOST_EMPTY;
else
fg = ZS_ALMOST_FULL;
@@ -723,7 +723,7 @@ static enum fullness_group fix_fullness_group(struct size_class *class,
enum fullness_group currfg, newfg;

get_zspage_mapping(first_page, &class_idx, &currfg);
-   newfg = get_fullness_group(first_page);
+   newfg = get_fullness_group(class, first_page);
if (newfg == currfg)
goto out;

@@ -1003,9 +1003,6 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
init_zspage(class, first_page);

first_page->freelist = location_to_obj(first_page, 0);
-   /* Maximum number of objects we can store in this zspage */
-   first_page->objects = class->pages_per_zspage * PAGE_SIZE / class->size;
-
error = 0; /* Success */

  cleanup:
@@ -1235,11 +1232,11 @@ static bool can_merge(struct size_class *prev, int size, int pages_per_zspage)
return true;
  }

-static bool zspage_full(struct page *first_page)
+static bool zspage_full(struct size_class *class, struct page *first_page)
  {
VM_BUG_ON_PAGE(!is_first_page(first_page), first_page);

-   return first_page->inuse == first_page->objects;
+   return first_page->inuse == class->objs_per_zspage;
  }

  unsigned long zs_get_total_pages(struct zs_pool *pool)
@@ -1625,7 +1622,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
}

/* Stop if there is no more space */
-   if (zspage_full(d_page)) {
+   if (zspage_full(class, d_page)) {
unpin_tag(handle);
ret = -ENOMEM;
break;
@@ -1684,7 +1681,7 @@ static enum fullness_group putback_zspage(struct zs_pool *pool,
  {
enum fullness_group fullness;

-   fullness = get_fullness_group(first_page);
+   fullness = get_fullness_group(class, first_page);
insert_zspage(class, fullness, first_page);
set_zspage_mapping(first_page, class->index, fullness);

@@ -1933,6 +1930,8 @@ struct zs_pool *zs_create_pool(const char *name, gfp_t flags)
class->size = size;
class->index = i;
class->pages_per_zspage = pages_per_zspage;
+   class->objs_per_zspage = class->pages_per_zspage *
+   PAGE_SIZE / class->size;
if (pages_per_zspage == 1 &&
get_maxobj_per_zspage(size, pages_per_zspage) == 1)
class->huge = true;


This computes "objs_per_zspage" twice. It could be computed once and reused:

class->objs_per_zspage = get_maxobj_per_zspage(size,
pages_per_zspage);
if (pages_per_zspage == 1 && class->objs_per_zspage == 1)
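
The suggestion above can be sketched in userspace C as follows. This is an
illustration only, not the zsmalloc code: struct toy_size_class,
toy_maxobj_per_zspage() and toy_class_init() are hypothetical names, and
TOY_PAGE_SIZE assumes 4 KiB pages. The arithmetic is the one the patch uses
(pages_per_zspage * PAGE_SIZE / size); the point is that the value is
computed once, stored, and then reused for the "huge" check.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

/* Hypothetical, trimmed-down stand-in for struct size_class. */
struct toy_size_class {
	int size;
	int pages_per_zspage;
	int objs_per_zspage;
	bool huge;
};

/* Maximum number of objects of the given size that fit in one zspage. */
static int toy_maxobj_per_zspage(int size, int pages_per_zspage)
{
	return pages_per_zspage * TOY_PAGE_SIZE / size;
}

/* Compute objs_per_zspage once, then reuse the stored value. */
static void toy_class_init(struct toy_size_class *class, int size,
			   int pages_per_zspage)
{
	assert(size > 0 && pages_per_zspage > 0);

	class->size = size;
	class->pages_per_zspage = pages_per_zspage;
	class->objs_per_zspage = toy_maxobj_per_zspage(size, pages_per_zspage);
	class->huge = (pages_per_zspage == 1 && class->objs_per_zspage == 1);
}
```

For example, a 4096-byte class on a single page holds one object and is
marked huge, while a 2048-byte class on a single page holds two objects and
is not.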
  

Re: [PATCH v1 09/19] zsmalloc: keep max_object in size_class

2016-03-11 Thread xuyiping



On 2016/3/11 15:30, Minchan Kim wrote:

Every zspage in a size_class has same number of max objects so
we could move it to a size_class.

Signed-off-by: Minchan Kim 
---
  mm/zsmalloc.c | 29 ++---
  1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index b4fb11831acb..ca663c82c1fc 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -32,8 +32,6 @@
   *page->freelist: points to the first free object in zspage.
   *Free objects are linked together using in-place
   *metadata.
- * page->objects: maximum number of objects we can store in this
- * zspage (class->zspage_order * PAGE_SIZE / class->size)
   *page->lru: links together first pages of various zspages.
   *Basically forming list of zspages in a fullness group.
   *page->mapping: class index and fullness group of the zspage
@@ -211,6 +209,7 @@ struct size_class {
 * of ZS_ALIGN.
 */
int size;
+   int objs_per_zspage;
unsigned int index;

struct zs_size_stat stats;
@@ -622,21 +621,22 @@ static inline void zs_pool_stat_destroy(struct zs_pool 
*pool)
   * the pool (not yet implemented). This function returns fullness
   * status of the given page.
   */
-static enum fullness_group get_fullness_group(struct page *first_page)
+static enum fullness_group get_fullness_group(struct size_class *class,
+   struct page *first_page)
  {
-   int inuse, max_objects;
+   int inuse, objs_per_zspage;
enum fullness_group fg;

VM_BUG_ON_PAGE(!is_first_page(first_page), first_page);

inuse = first_page->inuse;
-   max_objects = first_page->objects;
+   objs_per_zspage = class->objs_per_zspage;

if (inuse == 0)
fg = ZS_EMPTY;
-   else if (inuse == max_objects)
+   else if (inuse == objs_per_zspage)
fg = ZS_FULL;
-   else if (inuse <= 3 * max_objects / fullness_threshold_frac)
+   else if (inuse <= 3 * objs_per_zspage / fullness_threshold_frac)
fg = ZS_ALMOST_EMPTY;
else
fg = ZS_ALMOST_FULL;
@@ -723,7 +723,7 @@ static enum fullness_group fix_fullness_group(struct 
size_class *class,
enum fullness_group currfg, newfg;

get_zspage_mapping(first_page, _idx, );
-   newfg = get_fullness_group(first_page);
+   newfg = get_fullness_group(class, first_page);
if (newfg == currfg)
goto out;

@@ -1003,9 +1003,6 @@ static struct page *alloc_zspage(struct size_class 
*class, gfp_t flags)
init_zspage(class, first_page);

first_page->freelist = location_to_obj(first_page, 0);
-   /* Maximum number of objects we can store in this zspage */
-   first_page->objects = class->pages_per_zspage * PAGE_SIZE / class->size;
-
error = 0; /* Success */

  cleanup:
@@ -1235,11 +1232,11 @@ static bool can_merge(struct size_class *prev, int 
size, int pages_per_zspage)
return true;
  }

-static bool zspage_full(struct page *first_page)
+static bool zspage_full(struct size_class *class, struct page *first_page)
  {
VM_BUG_ON_PAGE(!is_first_page(first_page), first_page);

-   return first_page->inuse == first_page->objects;
+   return first_page->inuse == class->objs_per_zspage;
  }

  unsigned long zs_get_total_pages(struct zs_pool *pool)
@@ -1625,7 +1622,7 @@ static int migrate_zspage(struct zs_pool *pool, struct 
size_class *class,
}

/* Stop if there is no more space */
-   if (zspage_full(d_page)) {
+   if (zspage_full(class, d_page)) {
unpin_tag(handle);
ret = -ENOMEM;
break;
@@ -1684,7 +1681,7 @@ static enum fullness_group putback_zspage(struct zs_pool 
*pool,
  {
enum fullness_group fullness;

-   fullness = get_fullness_group(first_page);
+   fullness = get_fullness_group(class, first_page);
insert_zspage(class, fullness, first_page);
set_zspage_mapping(first_page, class->index, fullness);

@@ -1933,6 +1930,8 @@ struct zs_pool *zs_create_pool(const char *name, gfp_t flags)
class->size = size;
class->index = i;
class->pages_per_zspage = pages_per_zspage;
+   class->objs_per_zspage = class->pages_per_zspage *
+   PAGE_SIZE / class->size;
if (pages_per_zspage == 1 &&
get_maxobj_per_zspage(size, pages_per_zspage) == 1)
class->huge = true;


This computes "objs_per_zspage" twice. It could be computed once and
reused for the huge check:

	class->objs_per_zspage = get_maxobj_per_zspage(size,
						pages_per_zspage);
	if (pages_per_zspage == 1 && class->objs_per_zspage == 1)
		class->huge = true;
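
To make the suggestion concrete, here is a minimal standalone sketch of
the compute-once version. It is not the zsmalloc code: struct
sketch_size_class, SKETCH_PAGE_SIZE and the sketch_* functions are
hypothetical stand-ins, and it assumes get_maxobj_per_zspage() does the
same pages_per_zspage * PAGE_SIZE / size division that the patch removes
from alloc_zspage().

```c
#include <stdbool.h>

/* Hypothetical stand-in for PAGE_SIZE; the real value is arch-dependent. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical, reduced version of struct size_class. */
struct sketch_size_class {
	int size;             /* object size in bytes */
	int pages_per_zspage; /* pages that make up one zspage */
	int objs_per_zspage;  /* cached maximum number of objects */
	bool huge;            /* one page holding exactly one object */
};

/* Assumed to mirror get_maxobj_per_zspage(): how many objects fit. */
static int sketch_maxobj_per_zspage(int size, int pages_per_zspage)
{
	return (int)(pages_per_zspage * SKETCH_PAGE_SIZE / size);
}

/* Compute objs_per_zspage once, then reuse the cached value. */
static void sketch_init_class(struct sketch_size_class *class,
			      int size, int pages_per_zspage)
{
	class->size = size;
	class->pages_per_zspage = pages_per_zspage;
	class->objs_per_zspage = sketch_maxobj_per_zspage(size,
							  pages_per_zspage);
	class->huge = (pages_per_zspage == 1 &&
		       class->objs_per_zspage == 1);
}
```

With a 4096-byte page, a class of size 4096 and one page per zspage
ends up with objs_per_zspage == 1 and is marked huge, while a class of
size 32 holds 128 objects and is not.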
  

Re: [lustre-devel] [PATCH 07/10] staging: lustre: cleanup comment style for lnet selftest

2016-03-11 Thread Dilger, Andreas
On 2016/03/11, 18:29, "lustre-devel on behalf of James Simmons"
 wrote:

>Apply a consistent style for comments in the lnet selftest
>code.
>
>Signed-off-by: James Simmons 
>---
> drivers/staging/lustre/lnet/selftest/brw_test.c  |8 ++--
> drivers/staging/lustre/lnet/selftest/conctl.c|   50
>+++---
> drivers/staging/lustre/lnet/selftest/conrpc.c|   23 +-
> drivers/staging/lustre/lnet/selftest/console.c   |   11 +++--
> drivers/staging/lustre/lnet/selftest/framework.c |   20 
> drivers/staging/lustre/lnet/selftest/ping_test.c |2 +-
> drivers/staging/lustre/lnet/selftest/rpc.c   |   46
>++--
> drivers/staging/lustre/lnet/selftest/rpc.h   |2 +-
> drivers/staging/lustre/lnet/selftest/selftest.h  |3 +-
> drivers/staging/lustre/lnet/selftest/timer.c |6 +-
> 10 files changed, 87 insertions(+), 84 deletions(-)
>
>diff --git a/drivers/staging/lustre/lnet/selftest/brw_test.c
>b/drivers/staging/lustre/lnet/selftest/brw_test.c
>index eebc924..6ac4d02 100644
>--- a/drivers/staging/lustre/lnet/selftest/brw_test.c
>+++ b/drivers/staging/lustre/lnet/selftest/brw_test.c
>@@ -86,7 +86,7 @@ brw_client_init(sfw_test_instance_t *tsi)
>   opc = breq->blk_opc;
>   flags = breq->blk_flags;
>   npg = breq->blk_npg;
>-  /*
>+  /**
>* NB: this is not going to work for variable page size,
>* but we have to keep it for compatibility
>*/

The "/**" comment opener is only for header comment blocks that
have markup in them.  I don't think that is kernel style for
normal multi-line comments in the code.
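
As a rough illustration of the difference (the function below is a
hypothetical example, not taken from the lnet selftest code), a
kernel-doc header uses the "/**" opener and carries markup such as
@param lines, while a normal multi-line comment in a function body
opens with a plain "/*":

```c
/**
 * sketch_pages_for_len() - number of pages needed to hold a buffer
 * @len: buffer length in bytes
 * @page_size: page size in bytes, must be non-zero
 *
 * Return: @len divided by @page_size, rounded up.
 */
static unsigned int sketch_pages_for_len(unsigned int len,
					 unsigned int page_size)
{
	/*
	 * Ordinary multi-line comment: no kernel-doc markup, so it
	 * opens with a single star.
	 */
	return (len + page_size - 1) / page_size;
}
```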

Cheers, Andreas

>@@ -95,7 +95,7 @@ brw_client_init(sfw_test_instance_t *tsi)
>   } else {
>   test_bulk_req_v1_t *breq = &tsi->tsi_u.bulk_v1;
> 
>-  /*
>+  /**
>* I should never get this step if it's unknown feature
>* because make_session will reject unknown feature
>*/
>@@ -283,7 +283,7 @@ brw_client_prep_rpc(sfw_test_unit_t *tsu,
>   } else {
>   test_bulk_req_v1_t *breq = &tsi->tsi_u.bulk_v1;
> 
>-  /*
>+  /**
>* I should never get this step if it's unknown feature
>* because make_session will reject unknown feature
>*/
>@@ -329,7 +329,7 @@ brw_client_done_rpc(sfw_test_unit_t *tsu,
>srpc_client_rpc_t *rpc)
>   if (rpc->crpc_status) {
>   CERROR("BRW RPC to %s failed with %d\n",
>  libcfs_id2str(rpc->crpc_dest), rpc->crpc_status);
>-  if (!tsi->tsi_stopping) /* rpc could have been aborted */
>+  if (!tsi->tsi_stopping) /* rpc could have been aborted */
>   atomic_inc(&sn->sn_brw_errors);
>   return;
>   }
>diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c
>b/drivers/staging/lustre/lnet/selftest/conctl.c
>index 872df72..d045ac5 100644
>--- a/drivers/staging/lustre/lnet/selftest/conctl.c
>+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
>@@ -51,9 +51,9 @@ lst_session_new_ioctl(lstio_session_new_args_t *args)
>   char *name;
>   int rc;
> 
>-  if (!args->lstio_ses_idp || /* address for output sid */
>-  !args->lstio_ses_key ||/* no key is specified */
>-  !args->lstio_ses_namep || /* session name */
>+  if (!args->lstio_ses_idp || /* address for output sid */
>+  !args->lstio_ses_key || /* no key is specified */
>+  !args->lstio_ses_namep ||   /* session name */
>   args->lstio_ses_nmlen <= 0 ||
>   args->lstio_ses_nmlen > LST_NAME_SIZE)
>   return -EINVAL;
>@@ -95,11 +95,11 @@ lst_session_info_ioctl(lstio_session_info_args_t
>*args)
> {
>   /* no checking of key */
> 
>-  if (!args->lstio_ses_idp || /* address for output sid */
>-  !args->lstio_ses_keyp || /* address for output key */
>-  !args->lstio_ses_featp || /* address for output features */
>-  !args->lstio_ses_ndinfo || /* address for output ndinfo */
>-  !args->lstio_ses_namep || /* address for output name */
>+  if (!args->lstio_ses_idp || /* address for output sid */
>+  !args->lstio_ses_keyp ||/* address for output key */
>+  !args->lstio_ses_featp ||   /* address for output features */
>+  !args->lstio_ses_ndinfo ||  /* address for output ndinfo */
>+  !args->lstio_ses_namep ||   /* address for output name */
>   args->lstio_ses_nmlen <= 0 ||
>   args->lstio_ses_nmlen > LST_NAME_SIZE)
>   return -EINVAL;
>@@ -125,7 +125,7 @@ lst_debug_ioctl(lstio_debug_args_t *args)
>   if (!args->lstio_dbg_resultp)
>   return -EINVAL;
> 
>-  if (args->lstio_dbg_namep && /* name of batch/group */
>+  if (args->lstio_dbg_namep &&/* name of batch/group */
>   (args->lstio_dbg_nmlen <= 0 ||
>

[PATCH 03/10] staging: lustre: fix spacing issues checkpatch reported in lnet selftest

2016-03-11 Thread James Simmons
Remove any extra spacing as reported by checkpatch.

Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lnet/selftest/brw_test.c |6 +++---
 drivers/staging/lustre/lnet/selftest/console.c  |6 +++---
 drivers/staging/lustre/lnet/selftest/selftest.h |2 +-
 drivers/staging/lustre/lnet/selftest/timer.h|2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lnet/selftest/brw_test.c b/drivers/staging/lustre/lnet/selftest/brw_test.c
index 7bfc0db..18c7422 100644
--- a/drivers/staging/lustre/lnet/selftest/brw_test.c
+++ b/drivers/staging/lustre/lnet/selftest/brw_test.c
@@ -194,12 +194,12 @@ brw_check_page(struct page *pg, int pattern, __u64 magic)
return 0;
 
if (pattern == LST_BRW_CHECK_SIMPLE) {
-   data = *((__u64 *) addr);
+   data = *((__u64 *)addr);
if (data != magic)
goto bad_data;
 
addr += PAGE_CACHE_SIZE - BRW_MSIZE;
-   data = *((__u64 *) addr);
+   data = *((__u64 *)addr);
if (data != magic)
goto bad_data;
 
@@ -208,7 +208,7 @@ brw_check_page(struct page *pg, int pattern, __u64 magic)
 
if (pattern == LST_BRW_CHECK_FULL) {
for (i = 0; i < PAGE_CACHE_SIZE / BRW_MSIZE; i++) {
-   data = *(((__u64 *) addr) + i);
+   data = *(((__u64 *)addr) + i);
if (data != magic)
goto bad_data;
}
diff --git a/drivers/staging/lustre/lnet/selftest/console.c b/drivers/staging/lustre/lnet/selftest/console.c
index 9f1838f..b0c9acd 100644
--- a/drivers/staging/lustre/lnet/selftest/console.c
+++ b/drivers/staging/lustre/lnet/selftest/console.c
@@ -207,7 +207,7 @@ lstcon_group_alloc(char *name, lstcon_group_t **grpp)
 
grp->grp_ref = 1;
if (name) {
-   if (strlen(name) > sizeof(grp->grp_name)-1) {
+   if (strlen(name) > sizeof(grp->grp_name) - 1) {
LIBCFS_FREE(grp, offsetof(lstcon_group_t,
grp_ndl_hash[LST_NODE_HASHSIZE]));
return -E2BIG;
@@ -525,7 +525,7 @@ lstcon_group_add(char *name)
lstcon_group_t *grp;
int rc;
 
-   rc = lstcon_group_find(name, &grp) ? 0: -EEXIST;
+   rc = lstcon_group_find(name, &grp) ? 0 : -EEXIST;
if (rc) {
/* find a group with same name */
lstcon_group_decref(grp);
@@ -1746,7 +1746,7 @@ lstcon_session_new(char *name, int key, unsigned feats,
console_session.ses_timeout = (timeout <= 0) ?
  LST_CONSOLE_TIMEOUT : timeout;
 
-   if (strlen(name) > sizeof(console_session.ses_name)-1)
+   if (strlen(name) > sizeof(console_session.ses_name) - 1)
return -E2BIG;
strncpy(console_session.ses_name, name,
sizeof(console_session.ses_name));
diff --git a/drivers/staging/lustre/lnet/selftest/selftest.h b/drivers/staging/lustre/lnet/selftest/selftest.h
index 9669088..b605f7f 100644
--- a/drivers/staging/lustre/lnet/selftest/selftest.h
+++ b/drivers/staging/lustre/lnet/selftest/selftest.h
@@ -415,7 +415,7 @@ typedef struct sfw_test_case {
 srpc_client_rpc_t *
 sfw_create_rpc(lnet_process_id_t peer, int service,
   unsigned features, int nbulkiov, int bulklen,
-  void (*done) (srpc_client_rpc_t *), void *priv);
+  void (*done)(srpc_client_rpc_t *), void *priv);
 int sfw_create_test_rpc(sfw_test_unit_t *tsu,
lnet_process_id_t peer, unsigned features,
int nblk, int blklen, srpc_client_rpc_t **rpc);
diff --git a/drivers/staging/lustre/lnet/selftest/timer.h b/drivers/staging/lustre/lnet/selftest/timer.h
index 2affecf..39327bb 100644
--- a/drivers/staging/lustre/lnet/selftest/timer.h
+++ b/drivers/staging/lustre/lnet/selftest/timer.h
@@ -41,7 +41,7 @@
 struct stt_timer {
struct list_head stt_list;
time64_t stt_expires;
-   void (*stt_func) (void *);
+   void (*stt_func)(void *);
void *stt_data;
 };
 
-- 
1.7.1


