Re: [PATCH] phy: cpcap-usb: Fix missing return statement

2017-06-15 Thread Kishon Vijay Abraham I


On Monday 12 June 2017 01:42 PM, Tony Lindgren wrote:
> Commit 8ae904e3c236 ("phy: cpcap-usb: Add CPCAP PMIC USB support")
> is missing return statement as noted by Colin Ian King
> . If the optional pins are not configured,
> we just want to return early and not attempt to configure the pins.
> 
> Fixes: 8ae904e3c236 ("phy: cpcap-usb: Add CPCAP PMIC USB support")
> Reported-by: Colin Ian King 
> Signed-off-by: Tony Lindgren 

merged, thanks!

-Kishon
> ---
>  drivers/phy/motorola/phy-cpcap-usb.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/phy/motorola/phy-cpcap-usb.c 
> b/drivers/phy/motorola/phy-cpcap-usb.c
> --- a/drivers/phy/motorola/phy-cpcap-usb.c
> +++ b/drivers/phy/motorola/phy-cpcap-usb.c
> @@ -468,6 +468,8 @@ static int cpcap_usb_init_optional_pins(struct 
> cpcap_phy_ddata *ddata)
>   dev_info(ddata->dev, "default pins not configured: %ld\n",
>PTR_ERR(ddata->pins));
>   ddata->pins = NULL;
> +
> + return 0;
>   }
>  
>   ddata->pins_ulpi = pinctrl_lookup_state(ddata->pins, "ulpi");
> 


Re: [PATCH] phy: cpcap-usb: Fix missing return statement

2017-06-15 Thread Kishon Vijay Abraham I


On Monday 12 June 2017 01:42 PM, Tony Lindgren wrote:
> Commit 8ae904e3c236 ("phy: cpcap-usb: Add CPCAP PMIC USB support")
> is missing return statement as noted by Colin Ian King
> . If the optional pins are not configured,
> we just want to return early and not attempt to configure the pins.
> 
> Fixes: 8ae904e3c236 ("phy: cpcap-usb: Add CPCAP PMIC USB support")
> Reported-by: Colin Ian King 
> Signed-off-by: Tony Lindgren 

merged, thanks!

-Kishon
> ---
>  drivers/phy/motorola/phy-cpcap-usb.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/phy/motorola/phy-cpcap-usb.c 
> b/drivers/phy/motorola/phy-cpcap-usb.c
> --- a/drivers/phy/motorola/phy-cpcap-usb.c
> +++ b/drivers/phy/motorola/phy-cpcap-usb.c
> @@ -468,6 +468,8 @@ static int cpcap_usb_init_optional_pins(struct 
> cpcap_phy_ddata *ddata)
>   dev_info(ddata->dev, "default pins not configured: %ld\n",
>PTR_ERR(ddata->pins));
>   ddata->pins = NULL;
> +
> + return 0;
>   }
>  
>   ddata->pins_ulpi = pinctrl_lookup_state(ddata->pins, "ulpi");
> 


Re: [PATCH 00/13] block: assorted cleanup for bio splitting and cloning.

2017-06-15 Thread NeilBrown
On Thu, May 11 2017, NeilBrown wrote:

> On Tue, May 02 2017, NeilBrown wrote:
>
>> This is a revision of my series of patches working
>> towards removing the bioset work queues.
>
> Hi Jens,
>  could I get some feed-back about your thoughts on this series?
>  Will you apply it?  When?  Do I need to resend anything?
>  Would you like a git-pull request?  If so, what should I base it on?
>  There is a minor conflict with drivers/block/zram/zram_drv.c
>  as it dropped the call to blk_queue_split() recently, but otherwise it
>  still applies.

Hi Jens,
 I didn't hear back ... have you had a chance to look?
 In case it helps, you can pull the full set, based on a recent Linus tree,
 from
 git://neil.brown.name/linux bioset

 or I can resend the patches if you like.

Thanks,
NeilBrown


>
> Thanks,
> NeilBrown
>
>
>>
>> This set is based on Linus' tree as for today (2nd May) plus
>> the for-linus branch from Shaohua's md/raid tree.
>>
>> This series adds a fix for the new lightnvm/pblk-read code
>> and discards bioset_create_nobvec() in favor of a flag arg to
>> bioset_create().  There are also minor fixes and a little
>> code clean-up.
>>
>> I hope to eventually get rid of the new BIOSET_NEED_RESCUER flag,
>> but that needs work ing dm and probably bcache first.
>>
>> Thanks,
>> NeilBrown
>>
>>
>> ---
>>
>> NeilBrown (13):
>>   blk: remove bio_set arg from blk_queue_split()
>>   blk: replace bioset_create_nobvec() with a flags arg to bioset_create()
>>   blk: make the bioset rescue_workqueue optional.
>>   blk: use non-rescuing bioset for q->bio_split.
>>   block: Improvements to bounce-buffer handling
>>   rbd: use bio_clone_fast() instead of bio_clone()
>>   drbd: use bio_clone_fast() instead of bio_clone()
>>   pktcdvd: use bio_clone_fast() instead of bio_clone()
>>   lightnvm/pblk-read: use bio_clone_fast()
>>   xen-blkfront: remove bio splitting.
>>   bcache: use kmalloc to allocate bio in bch_data_verify()
>>   block: remove bio_clone() and all references.
>>   block: don't check for BIO_MAX_PAGES in blk_bio_segment_split()
>>
>>
>>  Documentation/block/biodoc.txt  |2 -
>>  block/bio.c |   72 
>> ---
>>  block/blk-core.c|4 +-
>>  block/blk-merge.c   |   31 ++-
>>  block/blk-mq.c  |2 -
>>  block/bounce.c  |   32 +---
>>  drivers/block/drbd/drbd_int.h   |3 +
>>  drivers/block/drbd/drbd_main.c  |   11 +
>>  drivers/block/drbd/drbd_req.c   |2 -
>>  drivers/block/drbd/drbd_req.h   |2 -
>>  drivers/block/pktcdvd.c |   14 +--
>>  drivers/block/ps3vram.c |2 -
>>  drivers/block/rbd.c |   16 +++-
>>  drivers/block/rsxx/dev.c|2 -
>>  drivers/block/umem.c|2 -
>>  drivers/block/xen-blkfront.c|   54 +-
>>  drivers/block/zram/zram_drv.c   |2 -
>>  drivers/lightnvm/pblk-init.c|   16 ++--
>>  drivers/lightnvm/pblk-read.c|2 -
>>  drivers/lightnvm/pblk.h |1 
>>  drivers/lightnvm/rrpc.c |2 -
>>  drivers/md/bcache/debug.c   |2 -
>>  drivers/md/bcache/super.c   |6 ++-
>>  drivers/md/dm-crypt.c   |2 -
>>  drivers/md/dm-io.c  |2 -
>>  drivers/md/dm.c |5 +-
>>  drivers/md/md.c |6 +--
>>  drivers/md/raid1.c  |2 -
>>  drivers/md/raid10.c |2 -
>>  drivers/md/raid5-cache.c|2 -
>>  drivers/md/raid5-ppl.c  |2 -
>>  drivers/md/raid5.c  |2 -
>>  drivers/s390/block/dcssblk.c|2 -
>>  drivers/s390/block/xpram.c  |2 -
>>  drivers/target/target_core_iblock.c |2 -
>>  fs/block_dev.c  |2 -
>>  fs/btrfs/extent_io.c|3 +
>>  fs/xfs/xfs_super.c  |3 +
>>  include/linux/bio.h |   12 ++
>>  include/linux/blkdev.h  |3 -
>>  40 files changed, 162 insertions(+), 174 deletions(-)
>>
>> --
>> Signature


signature.asc
Description: PGP signature


Re: [PATCH 00/13] block: assorted cleanup for bio splitting and cloning.

2017-06-15 Thread NeilBrown
On Thu, May 11 2017, NeilBrown wrote:

> On Tue, May 02 2017, NeilBrown wrote:
>
>> This is a revision of my series of patches working
>> towards removing the bioset work queues.
>
> Hi Jens,
>  could I get some feed-back about your thoughts on this series?
>  Will you apply it?  When?  Do I need to resend anything?
>  Would you like a git-pull request?  If so, what should I base it on?
>  There is a minor conflict with drivers/block/zram/zram_drv.c
>  as it dropped the call to blk_queue_split() recently, but otherwise it
>  still applies.

Hi Jens,
 I didn't hear back ... have you had a chance to look?
 In case it helps, you can pull the full set, based on a recent Linus tree,
 from
 git://neil.brown.name/linux bioset

 or I can resend the patches if you like.

Thanks,
NeilBrown


>
> Thanks,
> NeilBrown
>
>
>>
>> This set is based on Linus' tree as for today (2nd May) plus
>> the for-linus branch from Shaohua's md/raid tree.
>>
>> This series adds a fix for the new lightnvm/pblk-read code
>> and discards bioset_create_nobvec() in favor of a flag arg to
>> bioset_create().  There are also minor fixes and a little
>> code clean-up.
>>
>> I hope to eventually get rid of the new BIOSET_NEED_RESCUER flag,
>> but that needs work ing dm and probably bcache first.
>>
>> Thanks,
>> NeilBrown
>>
>>
>> ---
>>
>> NeilBrown (13):
>>   blk: remove bio_set arg from blk_queue_split()
>>   blk: replace bioset_create_nobvec() with a flags arg to bioset_create()
>>   blk: make the bioset rescue_workqueue optional.
>>   blk: use non-rescuing bioset for q->bio_split.
>>   block: Improvements to bounce-buffer handling
>>   rbd: use bio_clone_fast() instead of bio_clone()
>>   drbd: use bio_clone_fast() instead of bio_clone()
>>   pktcdvd: use bio_clone_fast() instead of bio_clone()
>>   lightnvm/pblk-read: use bio_clone_fast()
>>   xen-blkfront: remove bio splitting.
>>   bcache: use kmalloc to allocate bio in bch_data_verify()
>>   block: remove bio_clone() and all references.
>>   block: don't check for BIO_MAX_PAGES in blk_bio_segment_split()
>>
>>
>>  Documentation/block/biodoc.txt  |2 -
>>  block/bio.c |   72 
>> ---
>>  block/blk-core.c|4 +-
>>  block/blk-merge.c   |   31 ++-
>>  block/blk-mq.c  |2 -
>>  block/bounce.c  |   32 +---
>>  drivers/block/drbd/drbd_int.h   |3 +
>>  drivers/block/drbd/drbd_main.c  |   11 +
>>  drivers/block/drbd/drbd_req.c   |2 -
>>  drivers/block/drbd/drbd_req.h   |2 -
>>  drivers/block/pktcdvd.c |   14 +--
>>  drivers/block/ps3vram.c |2 -
>>  drivers/block/rbd.c |   16 +++-
>>  drivers/block/rsxx/dev.c|2 -
>>  drivers/block/umem.c|2 -
>>  drivers/block/xen-blkfront.c|   54 +-
>>  drivers/block/zram/zram_drv.c   |2 -
>>  drivers/lightnvm/pblk-init.c|   16 ++--
>>  drivers/lightnvm/pblk-read.c|2 -
>>  drivers/lightnvm/pblk.h |1 
>>  drivers/lightnvm/rrpc.c |2 -
>>  drivers/md/bcache/debug.c   |2 -
>>  drivers/md/bcache/super.c   |6 ++-
>>  drivers/md/dm-crypt.c   |2 -
>>  drivers/md/dm-io.c  |2 -
>>  drivers/md/dm.c |5 +-
>>  drivers/md/md.c |6 +--
>>  drivers/md/raid1.c  |2 -
>>  drivers/md/raid10.c |2 -
>>  drivers/md/raid5-cache.c|2 -
>>  drivers/md/raid5-ppl.c  |2 -
>>  drivers/md/raid5.c  |2 -
>>  drivers/s390/block/dcssblk.c|2 -
>>  drivers/s390/block/xpram.c  |2 -
>>  drivers/target/target_core_iblock.c |2 -
>>  fs/block_dev.c  |2 -
>>  fs/btrfs/extent_io.c|3 +
>>  fs/xfs/xfs_super.c  |3 +
>>  include/linux/bio.h |   12 ++
>>  include/linux/blkdev.h  |3 -
>>  40 files changed, 162 insertions(+), 174 deletions(-)
>>
>> --
>> Signature


signature.asc
Description: PGP signature


Re: [PATCH v2 0/2] Broadcom Stingray SATA PHY support

2017-06-15 Thread Kishon Vijay Abraham I


On Thursday 08 June 2017 05:01 PM, Srinath Mannam wrote:
> This patchset extends the Broadcom SATA PHY driver to add
> support for Stingray SATA PHY.
> 
> All patches are based on linux-phy/next
> 
> Changes since v1:
>  - Rebased patchset on linux-phy/next
> 

merged, thanks!

-Kishon
> Srinath Mannam (2):
>   dt-bindings: Update documentation for stingray SATA phy
>   phy: Add stingray SATA phy support
> 
>  .../devicetree/bindings/phy/brcm-sata-phy.txt  |  7 ++-
>  drivers/phy/broadcom/phy-brcm-sata.c   | 73 
> ++
>  2 files changed, 77 insertions(+), 3 deletions(-)
> 


Re: [PATCH v2 0/2] Broadcom Stingray SATA PHY support

2017-06-15 Thread Kishon Vijay Abraham I


On Thursday 08 June 2017 05:01 PM, Srinath Mannam wrote:
> This patchset extends the Broadcom SATA PHY driver to add
> support for Stingray SATA PHY.
> 
> All patches are based on linux-phy/next
> 
> Changes since v1:
>  - Rebased patchset on linux-phy/next
> 

merged, thanks!

-Kishon
> Srinath Mannam (2):
>   dt-bindings: Update documentation for stingray SATA phy
>   phy: Add stingray SATA phy support
> 
>  .../devicetree/bindings/phy/brcm-sata-phy.txt  |  7 ++-
>  drivers/phy/broadcom/phy-brcm-sata.c   | 73 
> ++
>  2 files changed, 77 insertions(+), 3 deletions(-)
> 


Re: [PATCH 1/7] drm/bridge: Support hotplugging panel-bridge.

2017-06-15 Thread Archit Taneja



On 06/16/2017 02:11 AM, Eric Anholt wrote:

If the panel-bridge is being set up after the drm_mode_config_reset(),
then the connector's state would never get initialized, and we'd
dereference the NULL in the hotplug path.  We also need to register
the connector, so that userspace can get at it.



Shouldn't the KMS driver make sure the panel-bridge is set up before
drm_mode_config_reset? Is it the case when we're inserting the
panel-bridge driver as a module?


All the connectors that have been added are registered automatically
when drm_dev_register() is called by the KMS driver. Registering a
connector in the middle of setting up our driver is prone to race
conditions if the userspace decides to use them immediately.

Thanks,
Archit


Signed-off-by: Eric Anholt 
---
  drivers/gpu/drm/bridge/panel.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
index 67fe19e5a9c6..8ed8a70799c7 100644
--- a/drivers/gpu/drm/bridge/panel.c
+++ b/drivers/gpu/drm/bridge/panel.c
@@ -82,11 +82,14 @@ static int panel_bridge_attach(struct drm_bridge *bridge)
  
  	drm_mode_connector_attach_encoder(_bridge->connector,

  bridge->encoder);
+   drm_atomic_helper_connector_reset(_bridge->connector);
  
  	ret = drm_panel_attach(panel_bridge->panel, _bridge->connector);

if (ret < 0)
return ret;
  
+	drm_connector_register(_bridge->connector);

+
return 0;
  }
  



--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/7] drm/bridge: Support hotplugging panel-bridge.

2017-06-15 Thread Archit Taneja



On 06/16/2017 02:11 AM, Eric Anholt wrote:

If the panel-bridge is being set up after the drm_mode_config_reset(),
then the connector's state would never get initialized, and we'd
dereference the NULL in the hotplug path.  We also need to register
the connector, so that userspace can get at it.



Shouldn't the KMS driver make sure the panel-bridge is set up before
drm_mode_config_reset? Is it the case when we're inserting the
panel-bridge driver as a module?


All the connectors that have been added are registered automatically
when drm_dev_register() is called by the KMS driver. Registering a
connector in the middle of setting up our driver is prone to race
conditions if the userspace decides to use them immediately.

Thanks,
Archit


Signed-off-by: Eric Anholt 
---
  drivers/gpu/drm/bridge/panel.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
index 67fe19e5a9c6..8ed8a70799c7 100644
--- a/drivers/gpu/drm/bridge/panel.c
+++ b/drivers/gpu/drm/bridge/panel.c
@@ -82,11 +82,14 @@ static int panel_bridge_attach(struct drm_bridge *bridge)
  
  	drm_mode_connector_attach_encoder(_bridge->connector,

  bridge->encoder);
+   drm_atomic_helper_connector_reset(_bridge->connector);
  
  	ret = drm_panel_attach(panel_bridge->panel, _bridge->connector);

if (ret < 0)
return ret;
  
+	drm_connector_register(_bridge->connector);

+
return 0;
  }
  



--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v7 16/16] mtd: nand: denali: avoid magic numbers and rename for clarification

2017-06-15 Thread Masahiro Yamada
2017-06-13 22:45 GMT+09:00 Masahiro Yamada :
> Introduce some macros and helpers to avoid magic numbers and
> rename macros/functions for clarification.
>
> - We see '| 2' in several places.  This means Data Cycle in MAP11 mode.
>   The Denali User's Guide says bit[1:0] of MAP11 is like follows:
>
>   b'00 = Command Cycle
>   b'01 = Address Cycle
>   b'10 = Data Cycle
>
>   So, this commit added DENALI_MAP11_{CMD,ADDR,DATA} macros.
>
> - We see 'denali->flash_mem + 0x10' in several places, but 0x10 is a
>   magic number.  Actually, this accesses the data port of the Host
>   Data/Command Interface.  So, this commit added DENALI_HOST_DATA.
>   On the other hand, 'denali->flash_mem' gets access to the address
>   port, so DENALI_HOST_ADDR was also added.
>
> - We see 'index_addr(denali, cmd, 0x1)' in denali_erase(), but 0x1
>   is a magic number.  0x1 means the erase operation.  Replace 0x1
>   with DENALI_ERASE.
>
> - Rename index_addr() to denali_host_write() for clarification
>
> - Denali User's Guide says MAP{00,01,10,11} for access mode.  Match
>   the macros with terminology in the IP document.
>
> - Rename struct members as follows:
>   flash_bank   -> active_bank(currently selected bank)
>   flash_reg-> reg(base address of registers)
>   flash_mem-> host   (base address of host interface)
>   devnum   -> devs_per_cs(devices connected in parallel)
>   bbtskipbytes -> oob_skip_bytes (number of bytes to skip in OOB)
>
> Signed-off-by: Masahiro Yamada 
> ---
>
> Changes in v7: None
> Changes in v6:
>   - Newly added
>
> Changes in v5: None
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None


Please let me update 18.  (only typo fix)

I sent v8 for this one.


-- 
Best Regards
Masahiro Yamada


Re: [PATCH v7 16/16] mtd: nand: denali: avoid magic numbers and rename for clarification

2017-06-15 Thread Masahiro Yamada
2017-06-13 22:45 GMT+09:00 Masahiro Yamada :
> Introduce some macros and helpers to avoid magic numbers and
> rename macros/functions for clarification.
>
> - We see '| 2' in several places.  This means Data Cycle in MAP11 mode.
>   The Denali User's Guide says bit[1:0] of MAP11 is like follows:
>
>   b'00 = Command Cycle
>   b'01 = Address Cycle
>   b'10 = Data Cycle
>
>   So, this commit added DENALI_MAP11_{CMD,ADDR,DATA} macros.
>
> - We see 'denali->flash_mem + 0x10' in several places, but 0x10 is a
>   magic number.  Actually, this accesses the data port of the Host
>   Data/Command Interface.  So, this commit added DENALI_HOST_DATA.
>   On the other hand, 'denali->flash_mem' gets access to the address
>   port, so DENALI_HOST_ADDR was also added.
>
> - We see 'index_addr(denali, cmd, 0x1)' in denali_erase(), but 0x1
>   is a magic number.  0x1 means the erase operation.  Replace 0x1
>   with DENALI_ERASE.
>
> - Rename index_addr() to denali_host_write() for clarification
>
> - Denali User's Guide says MAP{00,01,10,11} for access mode.  Match
>   the macros with terminology in the IP document.
>
> - Rename struct members as follows:
>   flash_bank   -> active_bank(currently selected bank)
>   flash_reg-> reg(base address of registers)
>   flash_mem-> host   (base address of host interface)
>   devnum   -> devs_per_cs(devices connected in parallel)
>   bbtskipbytes -> oob_skip_bytes (number of bytes to skip in OOB)
>
> Signed-off-by: Masahiro Yamada 
> ---
>
> Changes in v7: None
> Changes in v6:
>   - Newly added
>
> Changes in v5: None
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None


Please let me update 18.  (only typo fix)

I sent v8 for this one.


-- 
Best Regards
Masahiro Yamada


RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-15 Thread Zheng, Lv
Hi,

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Peter
> Hutterer
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Thu, Jun 15, 2017 at 07:33:58AM +, Zheng, Lv wrote:
> > Hi, Peter
> >
> > > From: Peter Hutterer [mailto:peter.hutte...@who-t.net]
> > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > > switch exported by ACPI
> > >
> > > On Thu, Jun 15, 2017 at 02:52:57AM +, Zheng, Lv wrote:
> > > > Hi, Benjamin
> > > >
> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> > > > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable 
> > > > > LID switch exported by ACPI
> > > > >
> > > > > Hi,
> > > > >
> > > > > [Sorry for the delay, I have been sidetracked from this]
> > > > >
> > > > > On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > > > > > On Thu, 01.06.17 20:46, Benjamin Tissoires 
> > > > > > (benjamin.tissoi...@redhat.com) wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Sending this as a WIP as it still need a few changes, but it 
> > > > > > > mostly works as
> > > > > > > expected (still not fully compliant yet).
> > > > > > >
> > > > > > > So this is based on Lennart's comment in [1]: if the LID state is 
> > > > > > > not reliable,
> > > > > > > the kernel should not export the LID switch device as long as we 
> > > > > > > are not sure
> > > > > > > about its state.
> > > > > >
> > > > > > Ah nice! I (obviously) like this approach.
> > > > >
> > > > > Heh. Now I just need to convince Lv that it's the right approach.
> > > >
> > > > I feel we don't have big conflicts.
> > > > And I already took part of your idea into this patchset:
> > > > https://patchwork.kernel.org/patch/9771121/
> > > > https://patchwork.kernel.org/patch/9771119/
> > > > I tested my surface pros with Ubuntu, they are working as expected.
> > > >
> > > > > > > Note that systemd currently doesn't sync the state when the input 
> > > > > > > node just
> > > > > > > appears. This is a systemd bug, and it should not be handled by 
> > > > > > > the kernel
> > > > > > > community.
> > > > > >
> > > > > > Uh if this is borked, we should indeed fix this in systemd. Is there
> > > > > > already a systemd github bug about this? If not, please create one,
> > > > > > and we'll look into it!
> > > > >
> > > > > I don't think there is. I haven't raised it yet because I am not so 
> > > > > sure
> > > > > this will not break again those worthless unreliable LID, and if we 
> > > > > play
> > > > > whack a mole between the kernel and user space, things are going to be
> > > > > nasty. So I'd rather have this fixed in systemd along with the
> > > > > unreliable LID switch knowledge, so we are sure that the kernel 
> > > > > behaves
> > > > > the way we expect it to be.
> > > >
> > > > This is my feeling:
> > > > We needn't go that far.
> > > > We can interpret "input node appears" into "default input node state".
> > >
> > > Sorry, can you clarify this bit please? I'm not sure what you mean here.
> > > Note that there's an unknown amount of time between "device node appearing
> > > in the system" and when a userspace process actually opens it and looks at
> > > its state. By then, the node may have changed state again.
> >
> > We can see:
> > "logind" has already implemented a timeout, and will not respond lid state
> > unless it can be stable within this timeout period.
> > I'm not an expert of logind, maybe this is because of "HoldOffTimeoutSec"?
> >
> > I feel "removing the input node for a period where its state is not 
> > trustful"
> > is technically identical to this mechanism.
> 
> but you'd be making kernel policy based on one userspace implementation.
> e.g. libinput doesn't have a timeout period, it assumes the state is
> correct when an input node is present.

Do you see practical issues?
If not, should we avoid over-engineering at this moment?

After resume, SW_LID state could remain unreliable "close" for a while.
But that's just a kind of delay happens in all computing programs.
I suppose all power managing programs have already handled that.
I confirmed no breakage for systemd 233.
For systemd 229, it cannot handle it well due to bugs.
But my latest patch series has worked the bug around.
So I don't see any breakage related to post-resume incorrect state period.
Do you see problems that my tests haven't covered?

So I wonder if you mean:
After boot, button driver should create input node right before sending first 
input report.
Is this exactly what you want me to improve?
If so, please also let me know if you have seen real issues related to this?

Cheers,
Lv


RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-15 Thread Zheng, Lv
Hi,

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Peter
> Hutterer
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Thu, Jun 15, 2017 at 07:33:58AM +, Zheng, Lv wrote:
> > Hi, Peter
> >
> > > From: Peter Hutterer [mailto:peter.hutte...@who-t.net]
> > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > > switch exported by ACPI
> > >
> > > On Thu, Jun 15, 2017 at 02:52:57AM +, Zheng, Lv wrote:
> > > > Hi, Benjamin
> > > >
> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> > > > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable 
> > > > > LID switch exported by ACPI
> > > > >
> > > > > Hi,
> > > > >
> > > > > [Sorry for the delay, I have been sidetracked from this]
> > > > >
> > > > > On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > > > > > On Thu, 01.06.17 20:46, Benjamin Tissoires 
> > > > > > (benjamin.tissoi...@redhat.com) wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Sending this as a WIP as it still need a few changes, but it 
> > > > > > > mostly works as
> > > > > > > expected (still not fully compliant yet).
> > > > > > >
> > > > > > > So this is based on Lennart's comment in [1]: if the LID state is 
> > > > > > > not reliable,
> > > > > > > the kernel should not export the LID switch device as long as we 
> > > > > > > are not sure
> > > > > > > about its state.
> > > > > >
> > > > > > Ah nice! I (obviously) like this approach.
> > > > >
> > > > > Heh. Now I just need to convince Lv that it's the right approach.
> > > >
> > > > I feel we don't have big conflicts.
> > > > And I already took part of your idea into this patchset:
> > > > https://patchwork.kernel.org/patch/9771121/
> > > > https://patchwork.kernel.org/patch/9771119/
> > > > I tested my surface pros with Ubuntu, they are working as expected.
> > > >
> > > > > > > Note that systemd currently doesn't sync the state when the input 
> > > > > > > node just
> > > > > > > appears. This is a systemd bug, and it should not be handled by 
> > > > > > > the kernel
> > > > > > > community.
> > > > > >
> > > > > > Uh if this is borked, we should indeed fix this in systemd. Is there
> > > > > > already a systemd github bug about this? If not, please create one,
> > > > > > and we'll look into it!
> > > > >
> > > > > I don't think there is. I haven't raised it yet because I am not so 
> > > > > sure
> > > > > this will not break again those worthless unreliable LID, and if we 
> > > > > play
> > > > > whack a mole between the kernel and user space, things are going to be
> > > > > nasty. So I'd rather have this fixed in systemd along with the
> > > > > unreliable LID switch knowledge, so we are sure that the kernel 
> > > > > behaves
> > > > > the way we expect it to be.
> > > >
> > > > This is my feeling:
> > > > We needn't go that far.
> > > > We can interpret "input node appears" into "default input node state".
> > >
> > > Sorry, can you clarify this bit please? I'm not sure what you mean here.
> > > Note that there's an unknown amount of time between "device node appearing
> > > in the system" and when a userspace process actually opens it and looks at
> > > its state. By then, the node may have changed state again.
> >
> > We can see:
> > "logind" has already implemented a timeout, and will not respond lid state
> > unless it can be stable within this timeout period.
> > I'm not an expert of logind, maybe this is because of "HoldOffTimeoutSec"?
> >
> > I feel "removing the input node for a period where its state is not 
> > trustful"
> > is technically identical to this mechanism.
> 
> but you'd be making kernel policy based on one userspace implementation.
> e.g. libinput doesn't have a timeout period, it assumes the state is
> correct when an input node is present.

Do you see practical issues?
If not, should we avoid over-engineering at this moment?

After resume, SW_LID state could remain unreliable "close" for a while.
But that's just a kind of delay happens in all computing programs.
I suppose all power managing programs have already handled that.
I confirmed no breakage for systemd 233.
For systemd 229, it cannot handle it well due to bugs.
But my latest patch series has worked the bug around.
So I don't see any breakage related to post-resume incorrect state period.
Do you see problems that my tests haven't covered?

So I wonder if you mean:
After boot, button driver should create input node right before sending first 
input report.
Is this exactly what you want me to improve?
If so, please also let me know if you have seen real issues related to this?

Cheers,
Lv


[PATCH v3] staging: wlan-ng: Fix struct definition's and variable type

2017-06-15 Thread sunil . m
From: Suniel Mahesh 

le16_to_cpu() accepts argument of type __le16 and cpu_to_le16()
returns an argument of type __le16. This patch fixes:
(a) the type of the variable that end's up getting return from
cpu_to_le16().
(b) the member types of struct hfa384x_host_scan_request_data,
struct hfa384x_bytestr32 and struct hfa384x_hscan_result_sub.

The following type mismatch warnings reported by sparse
have been fixed:
warning: incorrect type in assignment (different base types)
warning: cast to restricted __le16

Signed-off-by: Suniel Mahesh 
---
Changes for v3:
- Edited subject and description of the patch to fit with the changes
  done in the code base, as suggested by Frans Klaver.

Changes for v2:
- Reworked on the patch and modified commit message as per the
  recommendations from Frans Klaver and Greg K-H.

- Patch was tested and built on next-20170609 and staging-testing.
---
 drivers/staging/wlan-ng/hfa384x.h| 18 +-
 drivers/staging/wlan-ng/prism2mgmt.c |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/wlan-ng/hfa384x.h 
b/drivers/staging/wlan-ng/hfa384x.h
index 310e2c4..f99cc04 100644
--- a/drivers/staging/wlan-ng/hfa384x.h
+++ b/drivers/staging/wlan-ng/hfa384x.h
@@ -358,7 +358,7 @@ struct hfa384x_bytestr {
 } __packed;
 
 struct hfa384x_bytestr32 {
-   u16 len;
+   __le16 len;
u8 data[32];
 } __packed;
 
@@ -399,8 +399,8 @@ struct hfa384x_caplevel {
 
 /*-- Configuration Record: HostScanRequest (data portion only) --*/
 struct hfa384x_host_scan_request_data {
-   u16 channel_list;
-   u16 tx_rate;
+   __le16 channel_list;
+   __le16 tx_rate;
struct hfa384x_bytestr32 ssid;
 } __packed;
 
@@ -682,16 +682,16 @@ struct hfa384x_ch_info_result {
 
 /*--  Inquiry Frame, Diagnose: Host Scan Results & Subfields--*/
 struct hfa384x_hscan_result_sub {
-   u16 chid;
-   u16 anl;
-   u16 sl;
+   __le16 chid;
+   __le16 anl;
+   __le16 sl;
u8 bssid[WLAN_BSSID_LEN];
-   u16 bcnint;
-   u16 capinfo;
+   __le16 bcnint;
+   __le16 capinfo;
struct hfa384x_bytestr32 ssid;
u8 supprates[10];   /* 802.11 info element */
u16 proberesp_rate;
-   u16 atim;
+   __le16 atim;
 } __packed;
 
 struct hfa384x_hscan_result {
diff --git a/drivers/staging/wlan-ng/prism2mgmt.c 
b/drivers/staging/wlan-ng/prism2mgmt.c
index f4d6e48..c4aa9e7 100644
--- a/drivers/staging/wlan-ng/prism2mgmt.c
+++ b/drivers/staging/wlan-ng/prism2mgmt.c
@@ -213,7 +213,7 @@ int prism2mgmt_scan(struct wlandevice *wlandev, void *msgp)
goto exit;
}
if (word == HFA384x_PORTSTATUS_DISABLED) {
-   u16 wordbuf[17];
+   __le16 wordbuf[17];
 
result = hfa384x_drvr_setconfig16(hw,
HFA384x_RID_CNFROAMINGMODE,
-- 
1.9.1



[PATCH v3] staging: wlan-ng: Fix struct definition's and variable type

2017-06-15 Thread sunil . m
From: Suniel Mahesh 

le16_to_cpu() accepts argument of type __le16 and cpu_to_le16()
returns an argument of type __le16. This patch fixes:
(a) the type of the variable that end's up getting return from
cpu_to_le16().
(b) the member types of struct hfa384x_host_scan_request_data,
struct hfa384x_bytestr32 and struct hfa384x_hscan_result_sub.

The following type mismatch warnings reported by sparse
have been fixed:
warning: incorrect type in assignment (different base types)
warning: cast to restricted __le16

Signed-off-by: Suniel Mahesh 
---
Changes for v3:
- Edited subject and description of the patch to fit with the changes
  done in the code base, as suggested by Frans Klaver.

Changes for v2:
- Reworked on the patch and modified commit message as per the
  recommendations from Frans Klaver and Greg K-H.

- Patch was tested and built on next-20170609 and staging-testing.
---
 drivers/staging/wlan-ng/hfa384x.h| 18 +-
 drivers/staging/wlan-ng/prism2mgmt.c |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/wlan-ng/hfa384x.h 
b/drivers/staging/wlan-ng/hfa384x.h
index 310e2c4..f99cc04 100644
--- a/drivers/staging/wlan-ng/hfa384x.h
+++ b/drivers/staging/wlan-ng/hfa384x.h
@@ -358,7 +358,7 @@ struct hfa384x_bytestr {
 } __packed;
 
 struct hfa384x_bytestr32 {
-   u16 len;
+   __le16 len;
u8 data[32];
 } __packed;
 
@@ -399,8 +399,8 @@ struct hfa384x_caplevel {
 
 /*-- Configuration Record: HostScanRequest (data portion only) --*/
 struct hfa384x_host_scan_request_data {
-   u16 channel_list;
-   u16 tx_rate;
+   __le16 channel_list;
+   __le16 tx_rate;
struct hfa384x_bytestr32 ssid;
 } __packed;
 
@@ -682,16 +682,16 @@ struct hfa384x_ch_info_result {
 
 /*--  Inquiry Frame, Diagnose: Host Scan Results & Subfields--*/
 struct hfa384x_hscan_result_sub {
-   u16 chid;
-   u16 anl;
-   u16 sl;
+   __le16 chid;
+   __le16 anl;
+   __le16 sl;
u8 bssid[WLAN_BSSID_LEN];
-   u16 bcnint;
-   u16 capinfo;
+   __le16 bcnint;
+   __le16 capinfo;
struct hfa384x_bytestr32 ssid;
u8 supprates[10];   /* 802.11 info element */
u16 proberesp_rate;
-   u16 atim;
+   __le16 atim;
 } __packed;
 
 struct hfa384x_hscan_result {
diff --git a/drivers/staging/wlan-ng/prism2mgmt.c 
b/drivers/staging/wlan-ng/prism2mgmt.c
index f4d6e48..c4aa9e7 100644
--- a/drivers/staging/wlan-ng/prism2mgmt.c
+++ b/drivers/staging/wlan-ng/prism2mgmt.c
@@ -213,7 +213,7 @@ int prism2mgmt_scan(struct wlandevice *wlandev, void *msgp)
goto exit;
}
if (word == HFA384x_PORTSTATUS_DISABLED) {
-   u16 wordbuf[17];
+   __le16 wordbuf[17];
 
result = hfa384x_drvr_setconfig16(hw,
HFA384x_RID_CNFROAMINGMODE,
-- 
1.9.1



[PATCH] selftests: lib: Skip tests on missing test modules

2017-06-15 Thread Sumit Semwal
With older kernels, printf.sh and bitmap.sh fail because they can't find
the respective test modules they are looking for.

Add the skip portion on missing the respective test_XXX module. Error out
the same way as prime_numbers.sh.

Signed-off-by: Sumit Semwal 
---
 tools/testing/selftests/lib/bitmap.sh | 5 +
 tools/testing/selftests/lib/printf.sh | 5 +
 2 files changed, 10 insertions(+)

diff --git a/tools/testing/selftests/lib/bitmap.sh 
b/tools/testing/selftests/lib/bitmap.sh
index 2da187b6ddad..85294b4a0861 100755
--- a/tools/testing/selftests/lib/bitmap.sh
+++ b/tools/testing/selftests/lib/bitmap.sh
@@ -1,6 +1,11 @@
 #!/bin/sh
 # Runs bitmap infrastructure tests using test_bitmap kernel module
 
+if ! /sbin/modprobe -q test_bitmap; then
+echo "bitmap: [SKIP]"
+   exit 77
+fi
+
 if /sbin/modprobe -q test_bitmap; then
/sbin/modprobe -q -r test_bitmap
echo "bitmap: ok"
diff --git a/tools/testing/selftests/lib/printf.sh 
b/tools/testing/selftests/lib/printf.sh
index 4fdc70fe6980..024e749a83d4 100755
--- a/tools/testing/selftests/lib/printf.sh
+++ b/tools/testing/selftests/lib/printf.sh
@@ -1,6 +1,11 @@
 #!/bin/sh
 # Runs printf infrastructure using test_printf kernel module
 
+if ! /sbin/modprobe -q test_printf; then
+echo "printf: [SKIP]"
+   exit 77
+fi
+
 if /sbin/modprobe -q test_printf; then
/sbin/modprobe -q -r test_printf
echo "printf: ok"
-- 
2.7.4



[PATCH] selftests: lib: Skip tests on missing test modules

2017-06-15 Thread Sumit Semwal
With older kernels, printf.sh and bitmap.sh fail because they can't find
the respective test modules they are looking for.

Add the skip portion on missing the respective test_XXX module. Error out
the same way as prime_numbers.sh.

Signed-off-by: Sumit Semwal 
---
 tools/testing/selftests/lib/bitmap.sh | 5 +
 tools/testing/selftests/lib/printf.sh | 5 +
 2 files changed, 10 insertions(+)

diff --git a/tools/testing/selftests/lib/bitmap.sh 
b/tools/testing/selftests/lib/bitmap.sh
index 2da187b6ddad..85294b4a0861 100755
--- a/tools/testing/selftests/lib/bitmap.sh
+++ b/tools/testing/selftests/lib/bitmap.sh
@@ -1,6 +1,11 @@
 #!/bin/sh
 # Runs bitmap infrastructure tests using test_bitmap kernel module
 
+if ! /sbin/modprobe -q test_bitmap; then
+echo "bitmap: [SKIP]"
+   exit 77
+fi
+
 if /sbin/modprobe -q test_bitmap; then
/sbin/modprobe -q -r test_bitmap
echo "bitmap: ok"
diff --git a/tools/testing/selftests/lib/printf.sh 
b/tools/testing/selftests/lib/printf.sh
index 4fdc70fe6980..024e749a83d4 100755
--- a/tools/testing/selftests/lib/printf.sh
+++ b/tools/testing/selftests/lib/printf.sh
@@ -1,6 +1,11 @@
 #!/bin/sh
 # Runs printf infrastructure using test_printf kernel module
 
+if ! /sbin/modprobe -q test_printf; then
+echo "printf: [SKIP]"
+   exit 77
+fi
+
 if /sbin/modprobe -q test_printf; then
/sbin/modprobe -q -r test_printf
echo "printf: ok"
-- 
2.7.4



linux-next: manual merge of the uuid tree with the arm64 tree

2017-06-15 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the uuid tree got conflicts in:

  drivers/acpi/apei/ghes.c
  include/uapi/linux/uuid.h

between commits:

  f4dccde3f9b9 ("ras: acpi/apei: cper: add support for generic data v3 
structure")
  476e4940c251 ("ras: acpi / apei: generate trace event for unrecognized CPER 
section")
  8a0456ea3dec ("trace, ras: add ARM processor error trace event")

from the arm64 tree and commit:

  60927bc31436 ("uuid: remove uuid_be defintions from the uapi header")
  f9727a17db9b ("uuid: rename uuid types")
  5b53696a30d5 ("ACPI / APEI: Switch to use new generic UUID API")

from the uuid tree.

I really can't fix this up, so can you guys get together and sort this
out, please.

For today, I have just dropped the uuid tree, sorry.



-- 
Cheers,
Stephen Rothwell


linux-next: manual merge of the uuid tree with the arm64 tree

2017-06-15 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the uuid tree got conflicts in:

  drivers/acpi/apei/ghes.c
  include/uapi/linux/uuid.h

between commits:

  f4dccde3f9b9 ("ras: acpi/apei: cper: add support for generic data v3 
structure")
  476e4940c251 ("ras: acpi / apei: generate trace event for unrecognized CPER 
section")
  8a0456ea3dec ("trace, ras: add ARM processor error trace event")

from the arm64 tree and commit:

  60927bc31436 ("uuid: remove uuid_be defintions from the uapi header")
  f9727a17db9b ("uuid: rename uuid types")
  5b53696a30d5 ("ACPI / APEI: Switch to use new generic UUID API")

from the uuid tree.

I really can't fix this up, so can you guys get together and sort this
out, please.

For today, I have just dropped the uuid tree, sorry.



-- 
Cheers,
Stephen Rothwell


Re: [PATCH] ppc64/perf: Fix oops when kthread execs user process

2017-06-15 Thread Michael Ellerman
Ravi Bangoria  writes:

> When a kthread makes a call_usermodehelper() call the steps are:
>  a. allocates current->mm
>  b. load_elf_binary()
>  c. populates current->thread.regs
>
> While doing this, interrupts are not disabled. If there is a perf
> interrupt in the middle of this process (i.e. step 'a' has completed
> but not yet reached to step 'c') and if perf tries to read userspace
> regs, kernel oops with following log:
>
>   [  131.217172] Unable to handle kernel paging request for data at address 
> 0x
>   [  131.217731] Faulting instruction address: 0xc00da0fc
>   ...
>   [  131.23] Call Trace:
>   [  131.235714] [c000bbaaad60] [c025dedc] 
> perf_output_sample_regs+0x6c/0xd0
>   [  131.236020] [c000bbaaadb0] [c0269b44] 
> perf_output_sample+0x4e4/0x830
>   [  131.236362] [c000bbaaae40] [c026a354] 
> perf_event_output_forward+0x64/0x90
>   [  131.236668] [c000bbaaaeb0] [c026298c] 
> __perf_event_overflow+0x8c/0x1e0
>   [  131.236979] [c000bbaaaf00] [c00dc330] 
> record_and_restart+0x220/0x5c0
>   [  131.237306] [c000bbaab230] [c00dd1d8] 
> perf_event_interrupt+0x2d8/0x4d0
>   [  131.237611] [c000bbaab320] [c00294a4] 
> performance_monitor_exception+0x54/0x70
>   [  131.237891] [c000bbaab350] [c000a0a8] 
> performance_monitor_common+0x158/0x160
>   [  131.238208] --- interrupt: f01 at avtab_search_node+0x150/0x1a0
>   [  131.238208] LR = avtab_search_node+0x100/0x1a0
>   [  131.238617] [c000bbaab640] [c0526770] 
> context_struct_compute_av+0x220/0x5b0 (unreliable)
>   [  131.238948] [c000bbaab730] [c05278b4] 
> security_compute_av+0x174/0x390
>   [  131.239231] [c000bbaab7e0] [c05050e4] 
> avc_compute_av+0x84/0x260
>   [  131.239471] [c000bbaab890] [c0506198] avc_has_perm+0xf8/0x1c0
>   [  131.239708] [c000bbaab980] [c050f32c] file_has_perm+0x6c/0xd0
>   [  131.239972] [c000bbaab9e0] [c04ff0fc] 
> security_mmap_file+0xac/0x140
>   [  131.240256] [c000bbaaba50] [c02b1fc0] 
> vm_mmap_pgoff+0x80/0x160
>   [  131.240532] [c000bbaabb30] [c03f7db4] elf_map+0xa4/0x180
>   [  131.240771] [c000bbaabb90] [c03f9a48] 
> load_elf_binary+0x6e8/0x15a0
>   [  131.241060] [c000bbaabc90] [c0374f58] 
> search_binary_handler+0xe8/0x290
>   [  131.241347] [c000bbaabd20] [c0375c14] 
> do_execveat_common.isra.14+0x5f4/0x840
>   [  131.241631] [c000bbaabdf0] [c010be70] 
> call_usermodehelper_exec_async+0x170/0x210
>   [  131.241955] [c000bbaabe30] [c000bae0] 
> ret_from_kernel_thread+0x5c/0x7c
>
> Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace
> pt_regs are not set. 
>
> Signed-off-by: Ravi Bangoria 
> ---
> Note: this should go to stable as well. I've not checked below 4.4
> kernel but I'm able to reproduce it with 4.4 kernel.

That function (perf_get_regs_user()) didn't exist until 4.7, ie:

ed4a4ef85cf5 ("powerpc/perf: Add support for sampling interrupt register state")

So there must be something else going on.

I'll hold off on merging this until we've worked it out.

cheers


Re: [PATCH] ppc64/perf: Fix oops when kthread execs user process

2017-06-15 Thread Michael Ellerman
Ravi Bangoria  writes:

> When a kthread makes a call_usermodehelper() call the steps are:
>  a. allocates current->mm
>  b. load_elf_binary()
>  c. populates current->thread.regs
>
> While doing this, interrupts are not disabled. If there is a perf
> interrupt in the middle of this process (i.e. step 'a' has completed
> but not yet reached to step 'c') and if perf tries to read userspace
> regs, kernel oops with following log:
>
>   [  131.217172] Unable to handle kernel paging request for data at address 
> 0x
>   [  131.217731] Faulting instruction address: 0xc00da0fc
>   ...
>   [  131.23] Call Trace:
>   [  131.235714] [c000bbaaad60] [c025dedc] 
> perf_output_sample_regs+0x6c/0xd0
>   [  131.236020] [c000bbaaadb0] [c0269b44] 
> perf_output_sample+0x4e4/0x830
>   [  131.236362] [c000bbaaae40] [c026a354] 
> perf_event_output_forward+0x64/0x90
>   [  131.236668] [c000bbaaaeb0] [c026298c] 
> __perf_event_overflow+0x8c/0x1e0
>   [  131.236979] [c000bbaaaf00] [c00dc330] 
> record_and_restart+0x220/0x5c0
>   [  131.237306] [c000bbaab230] [c00dd1d8] 
> perf_event_interrupt+0x2d8/0x4d0
>   [  131.237611] [c000bbaab320] [c00294a4] 
> performance_monitor_exception+0x54/0x70
>   [  131.237891] [c000bbaab350] [c000a0a8] 
> performance_monitor_common+0x158/0x160
>   [  131.238208] --- interrupt: f01 at avtab_search_node+0x150/0x1a0
>   [  131.238208] LR = avtab_search_node+0x100/0x1a0
>   [  131.238617] [c000bbaab640] [c0526770] 
> context_struct_compute_av+0x220/0x5b0 (unreliable)
>   [  131.238948] [c000bbaab730] [c05278b4] 
> security_compute_av+0x174/0x390
>   [  131.239231] [c000bbaab7e0] [c05050e4] 
> avc_compute_av+0x84/0x260
>   [  131.239471] [c000bbaab890] [c0506198] avc_has_perm+0xf8/0x1c0
>   [  131.239708] [c000bbaab980] [c050f32c] file_has_perm+0x6c/0xd0
>   [  131.239972] [c000bbaab9e0] [c04ff0fc] 
> security_mmap_file+0xac/0x140
>   [  131.240256] [c000bbaaba50] [c02b1fc0] 
> vm_mmap_pgoff+0x80/0x160
>   [  131.240532] [c000bbaabb30] [c03f7db4] elf_map+0xa4/0x180
>   [  131.240771] [c000bbaabb90] [c03f9a48] 
> load_elf_binary+0x6e8/0x15a0
>   [  131.241060] [c000bbaabc90] [c0374f58] 
> search_binary_handler+0xe8/0x290
>   [  131.241347] [c000bbaabd20] [c0375c14] 
> do_execveat_common.isra.14+0x5f4/0x840
>   [  131.241631] [c000bbaabdf0] [c010be70] 
> call_usermodehelper_exec_async+0x170/0x210
>   [  131.241955] [c000bbaabe30] [c000bae0] 
> ret_from_kernel_thread+0x5c/0x7c
>
> Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace
> pt_regs are not set. 
>
> Signed-off-by: Ravi Bangoria 
> ---
> Note: this should go to stable as well. I've not checked below 4.4
> kernel but I'm able to reproduce it with 4.4 kernel.

That function (perf_get_regs_user()) didn't exist until 4.7, ie:

ed4a4ef85cf5 ("powerpc/perf: Add support for sampling interrupt register state")

So there must be something else going on.

I'll hold off on merging this until we've worked it out.

cheers


Re: linux-next: build warning after merge of the wireless-drivers tree

2017-06-15 Thread Kalle Valo
Stephen Rothwell  writes:

> After merging the wireless-drivers tree, today's linux-next build
> (x86_64 allmodconfig) produced this warning:
>
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c: In function
> 'brcmf_usb_probe_phase2':
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c:1198:2:
> warning: 'devinfo' may be used uninitialized in this function
> [-Wmaybe-uninitialized]
>   mutex_unlock(>dev_init_lock);
>   ^
>
> Introduced by commit
>
>   6d0507a777fb ("brcmfmac: add parameter to pass error code in firmware 
> callback")

Thanks, for the report. I also noticed this, only after I had applied
the patch, but hopefully Arend sends a patch to fix this soon.

-- 
Kalle Valo


Re: linux-next: build warning after merge of the wireless-drivers tree

2017-06-15 Thread Kalle Valo
Stephen Rothwell  writes:

> After merging the wireless-drivers tree, today's linux-next build
> (x86_64 allmodconfig) produced this warning:
>
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c: In function
> 'brcmf_usb_probe_phase2':
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c:1198:2:
> warning: 'devinfo' may be used uninitialized in this function
> [-Wmaybe-uninitialized]
>   mutex_unlock(>dev_init_lock);
>   ^
>
> Introduced by commit
>
>   6d0507a777fb ("brcmfmac: add parameter to pass error code in firmware 
> callback")

Thanks, for the report. I also noticed this, only after I had applied
the patch, but hopefully Arend sends a patch to fix this soon.

-- 
Kalle Valo


[PATCH 2/2] loop: Add PF_LESS_THROTTLE to block/loop device thread.

2017-06-15 Thread NeilBrown
When a filesystem is mounted from a loop device, writes are
throttled by balance_dirty_pages() twice: once when writing
to the filesystem and once when the loop_handle_cmd() writes
to the backing file.  This double-throttling can trigger
positive feedback loops that create significant delays.  The
throttling at the lower level is seen by the upper level as
a slow device, so it throttles extra hard.

The PF_LESS_THROTTLE flag was created to handle exactly this
circumstance, though with an NFS filesystem mounted from a
local NFS server.  It reduces the throttling on the lower
layer so that it can proceed largely unthrottled.

To demonstrate this, create a filesystem on a loop device
and write (e.g. with dd) several large files which combine
to consume significantly more than the limit set by
/proc/sys/vm/dirty_ratio or dirty_bytes.  Measure the total
time taken.

When I do this directly on a device (no loop device) the
total time for several runs (mkfs, mount, write 200 files,
umount) is fairly stable: 28-35 seconds.
When I do this over a loop device the times are much worse
and less stable.  52-460 seconds.  Half below 100seconds,
half above.
When I apply this patch, the times become stable again,
though not as fast as the no-loop-back case: 53-72 seconds.

There may be room for further improvement as the total overhead still
seems too high, but this is a big improvement.

Reviewed-by: Christoph Hellwig 
Reviewed-by: Ming Lei 
Suggested-by: Michal Hocko 
Acked-by: Michal Hocko 
Signed-off-by: NeilBrown 
---
 drivers/block/loop.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 9c457ca6c55e..6ed7c4506951 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -843,10 +843,16 @@ static void loop_unprepare_queue(struct loop_device *lo)
kthread_stop(lo->worker_task);
 }
 
+static int loop_kthread_worker_fn(void *worker_ptr)
+{
+   current->flags |= PF_LESS_THROTTLE;
+   return kthread_worker_fn(worker_ptr);
+}
+
 static int loop_prepare_queue(struct loop_device *lo)
 {
kthread_init_worker(>worker);
-   lo->worker_task = kthread_run(kthread_worker_fn,
+   lo->worker_task = kthread_run(loop_kthread_worker_fn,
>worker, "loop%d", lo->lo_number);
if (IS_ERR(lo->worker_task))
return -ENOMEM;




[PATCH 2/2] loop: Add PF_LESS_THROTTLE to block/loop device thread.

2017-06-15 Thread NeilBrown
When a filesystem is mounted from a loop device, writes are
throttled by balance_dirty_pages() twice: once when writing
to the filesystem and once when the loop_handle_cmd() writes
to the backing file.  This double-throttling can trigger
positive feedback loops that create significant delays.  The
throttling at the lower level is seen by the upper level as
a slow device, so it throttles extra hard.

The PF_LESS_THROTTLE flag was created to handle exactly this
circumstance, though with an NFS filesystem mounted from a
local NFS server.  It reduces the throttling on the lower
layer so that it can proceed largely unthrottled.

To demonstrate this, create a filesystem on a loop device
and write (e.g. with dd) several large files which combine
to consume significantly more than the limit set by
/proc/sys/vm/dirty_ratio or dirty_bytes.  Measure the total
time taken.

When I do this directly on a device (no loop device) the
total time for several runs (mkfs, mount, write 200 files,
umount) is fairly stable: 28-35 seconds.
When I do this over a loop device the times are much worse
and less stable.  52-460 seconds.  Half below 100seconds,
half above.
When I apply this patch, the times become stable again,
though not as fast as the no-loop-back case: 53-72 seconds.

There may be room for further improvement as the total overhead still
seems too high, but this is a big improvement.

Reviewed-by: Christoph Hellwig 
Reviewed-by: Ming Lei 
Suggested-by: Michal Hocko 
Acked-by: Michal Hocko 
Signed-off-by: NeilBrown 
---
 drivers/block/loop.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 9c457ca6c55e..6ed7c4506951 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -843,10 +843,16 @@ static void loop_unprepare_queue(struct loop_device *lo)
kthread_stop(lo->worker_task);
 }
 
+static int loop_kthread_worker_fn(void *worker_ptr)
+{
+   current->flags |= PF_LESS_THROTTLE;
+   return kthread_worker_fn(worker_ptr);
+}
+
 static int loop_prepare_queue(struct loop_device *lo)
 {
kthread_init_worker(>worker);
-   lo->worker_task = kthread_run(kthread_worker_fn,
+   lo->worker_task = kthread_run(loop_kthread_worker_fn,
>worker, "loop%d", lo->lo_number);
if (IS_ERR(lo->worker_task))
return -ENOMEM;




[PATCH 0/2] Two fixes for loop devices

2017-06-15 Thread NeilBrown
Hi Jens,
 one of these is a resend of a patch I sent a while back.
 The other is new - loop closes files differently from close()
 and in a way that can confuse NFS.

Thanks,
NeilBrown

---

NeilBrown (2):
  loop: use filp_close() rather than fput()
  loop: Add PF_LESS_THROTTLE to block/loop device thread.


 drivers/block/loop.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

--
Signature



[PATCH 0/2] Two fixes for loop devices

2017-06-15 Thread NeilBrown
Hi Jens,
 one of these is a resend of a patch I sent a while back.
 The other is new - loop closes files differently from close()
 and in a way that can confuse NFS.

Thanks,
NeilBrown

---

NeilBrown (2):
  loop: use filp_close() rather than fput()
  loop: Add PF_LESS_THROTTLE to block/loop device thread.


 drivers/block/loop.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

--
Signature



[PATCH 1/2] loop: use filp_close() rather than fput()

2017-06-15 Thread NeilBrown
When a loop device is being shutdown the backing file is
closed with fput().  This is different from how close(2)
closes files - it uses filp_close().

The difference is important for filesystems which provide a ->flush
file operation such as NFS.  NFS assumes a flush will always
be called on last close, and gets confused otherwise.

Signed-off-by: NeilBrown 
---
 drivers/block/loop.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ebbd0c3fe0ed..9c457ca6c55e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -682,7 +682,7 @@ static int loop_change_fd(struct loop_device *lo, struct 
block_device *bdev,
if (error)
goto out_putf;
 
-   fput(old_file);
+   filp_close(old_file, NULL);
if (lo->lo_flags & LO_FLAGS_PARTSCAN)
loop_reread_partitions(lo, bdev);
return 0;
@@ -1071,12 +1071,12 @@ static int loop_clr_fd(struct loop_device *lo)
loop_unprepare_queue(lo);
mutex_unlock(>lo_ctl_mutex);
/*
-* Need not hold lo_ctl_mutex to fput backing file.
+* Need not hold lo_ctl_mutex to close backing file.
 * Calling fput holding lo_ctl_mutex triggers a circular
 * lock dependency possibility warning as fput can take
 * bd_mutex which is usually taken before lo_ctl_mutex.
 */
-   fput(filp);
+   filp_close(filp, NULL);
return 0;
 }
 




[PATCH 1/2] loop: use filp_close() rather than fput()

2017-06-15 Thread NeilBrown
When a loop device is being shutdown the backing file is
closed with fput().  This is different from how close(2)
closes files - it uses filp_close().

The difference is important for filesystems which provide a ->flush
file operation such as NFS.  NFS assumes a flush will always
be called on last close, and gets confused otherwise.

Signed-off-by: NeilBrown 
---
 drivers/block/loop.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ebbd0c3fe0ed..9c457ca6c55e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -682,7 +682,7 @@ static int loop_change_fd(struct loop_device *lo, struct 
block_device *bdev,
if (error)
goto out_putf;
 
-   fput(old_file);
+   filp_close(old_file, NULL);
if (lo->lo_flags & LO_FLAGS_PARTSCAN)
loop_reread_partitions(lo, bdev);
return 0;
@@ -1071,12 +1071,12 @@ static int loop_clr_fd(struct loop_device *lo)
loop_unprepare_queue(lo);
mutex_unlock(>lo_ctl_mutex);
/*
-* Need not hold lo_ctl_mutex to fput backing file.
+* Need not hold lo_ctl_mutex to close backing file.
 * Calling fput holding lo_ctl_mutex triggers a circular
 * lock dependency possibility warning as fput can take
 * bd_mutex which is usually taken before lo_ctl_mutex.
 */
-   fput(filp);
+   filp_close(filp, NULL);
return 0;
 }
 




Re: Applied "ASoC: rockchip: Fix an error handling in 'rockchip_i2s_probe'" to the asoc tree

2017-06-15 Thread Christophe JAILLET

Le 15/06/2017 à 19:20, Mark Brown a écrit :

The patch

ASoC: rockchip: Fix an error handling in 'rockchip_i2s_probe'

has been applied to the asoc tree at

git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

 From c3a3d3c41b74b05267bab6173f2a8224a1443ba6 Mon Sep 17 00:00:00 2001
From: Christophe Jaillet 
Date: Thu, 15 Jun 2017 07:53:11 +0200
Subject: [PATCH] ASoC: rockchip: Fix an error handling in 'rockchip_i2s_probe'

If this memory allocation fail, we must disable what has been enabled.
Do not return immediately but go thrue the error handling path instead.

Also use 'devm_kmemdup' instead of 'devm_kzalloc+memcpy' to simplify code.

Signed-off-by: Christophe JAILLET 
Signed-off-by: Mark Brown 
---
  sound/soc/rockchip/rockchip_i2s.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/sound/soc/rockchip/rockchip_i2s.c 
b/sound/soc/rockchip/rockchip_i2s.c
index 66a26c56c658..ce09dee2202e 100644
--- a/sound/soc/rockchip/rockchip_i2s.c
+++ b/sound/soc/rockchip/rockchip_i2s.c
@@ -641,12 +641,13 @@ static int rockchip_i2s_probe(struct platform_device 
*pdev)
goto err_pm_disable;
}
  
-	soc_dai = devm_kzalloc(>dev,

+   soc_dai = devm_kmemdup(>dev, _i2s_dai
   sizeof(*soc_dai), GFP_KERNEL);
-   if (!soc_dai)
-   return -ENOMEM;
+   if (!soc_dai) {
+   err = -ENOMEM;
+   goto err_pm_disable;
+   }
  
-	memcpy(soc_dai, _i2s_dai, sizeof(*soc_dai));

if (!of_property_read_u32(node, "rockchip,playback-channels", )) {
if (val >= 2 && val <= 8)
soc_dai->playback.channels_max = val;


... and 'err' should be 'ret'...

I'm really confused for such a ugly proposal !

CJ



Re: Applied "ASoC: rockchip: Fix an error handling in 'rockchip_i2s_probe'" to the asoc tree

2017-06-15 Thread Christophe JAILLET

Le 15/06/2017 à 19:20, Mark Brown a écrit :

The patch

ASoC: rockchip: Fix an error handling in 'rockchip_i2s_probe'

has been applied to the asoc tree at

git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

 From c3a3d3c41b74b05267bab6173f2a8224a1443ba6 Mon Sep 17 00:00:00 2001
From: Christophe Jaillet 
Date: Thu, 15 Jun 2017 07:53:11 +0200
Subject: [PATCH] ASoC: rockchip: Fix an error handling in 'rockchip_i2s_probe'

If this memory allocation fail, we must disable what has been enabled.
Do not return immediately but go thrue the error handling path instead.

Also use 'devm_kmemdup' instead of 'devm_kzalloc+memcpy' to simplify code.

Signed-off-by: Christophe JAILLET 
Signed-off-by: Mark Brown 
---
  sound/soc/rockchip/rockchip_i2s.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/sound/soc/rockchip/rockchip_i2s.c 
b/sound/soc/rockchip/rockchip_i2s.c
index 66a26c56c658..ce09dee2202e 100644
--- a/sound/soc/rockchip/rockchip_i2s.c
+++ b/sound/soc/rockchip/rockchip_i2s.c
@@ -641,12 +641,13 @@ static int rockchip_i2s_probe(struct platform_device 
*pdev)
goto err_pm_disable;
}
  
-	soc_dai = devm_kzalloc(>dev,

+   soc_dai = devm_kmemdup(>dev, _i2s_dai
   sizeof(*soc_dai), GFP_KERNEL);
-   if (!soc_dai)
-   return -ENOMEM;
+   if (!soc_dai) {
+   err = -ENOMEM;
+   goto err_pm_disable;
+   }
  
-	memcpy(soc_dai, _i2s_dai, sizeof(*soc_dai));

if (!of_property_read_u32(node, "rockchip,playback-channels", )) {
if (val >= 2 && val <= 8)
soc_dai->playback.channels_max = val;


... and 'err' should be 'ret'...

I'm really confused for such a ugly proposal !

CJ



[PATCH] MAINTAINERS: Add an entry for MediaTek PCIe driver

2017-06-15 Thread Ryder Lee
Add MediaTek PCIe driver maintainer entry.

Signed-off-by: Ryder Lee 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f7d568b..736648e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9957,6 +9957,14 @@ S:   Supported
 F: Documentation/devicetree/bindings/pci/pci-thunder-*
 F: drivers/pci/host/pci-thunder-*
 
+PCIE DRIVER FOR MEDIATEK
+M:  Ryder Lee 
+L:  linux-...@vger.kernel.org
+L:  linux-media...@lists.infradead.org
+S:  Supported
+F:  Documentation/devicetree/bindings/pci/mediatek*
+F:  drivers/pci/host/*mediatek*
+
 PCMCIA SUBSYSTEM
 P: Linux PCMCIA Team
 L: linux-pcm...@lists.infradead.org
-- 
1.9.1



[PATCH] MAINTAINERS: Add an entry for MediaTek PCIe driver

2017-06-15 Thread Ryder Lee
Add MediaTek PCIe driver maintainer entry.

Signed-off-by: Ryder Lee 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f7d568b..736648e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9957,6 +9957,14 @@ S:   Supported
 F: Documentation/devicetree/bindings/pci/pci-thunder-*
 F: drivers/pci/host/pci-thunder-*
 
+PCIE DRIVER FOR MEDIATEK
+M:  Ryder Lee 
+L:  linux-...@vger.kernel.org
+L:  linux-media...@lists.infradead.org
+S:  Supported
+F:  Documentation/devicetree/bindings/pci/mediatek*
+F:  drivers/pci/host/*mediatek*
+
 PCMCIA SUBSYSTEM
 P: Linux PCMCIA Team
 L: linux-pcm...@lists.infradead.org
-- 
1.9.1



[PATCH 0/7] fujitsu-laptop: ACPI-related cleanups

2017-06-15 Thread Michał Kępień
In preparation for splitting fujitsu-laptop into two separate modules, I
prepared two more cleanup series to minimize the amount of code being
moved around.  This first series contains all patches that touch ACPI in
some way, which is why I am CCing linux-acpi as per Rafael's previous
advice.

This patch series was based on platform-driver-x86/for-next and tested
on a Lifebook S7020.

 drivers/platform/x86/Kconfig  |  10 --
 drivers/platform/x86/fujitsu-laptop.c | 187 +-
 2 files changed, 48 insertions(+), 149 deletions(-)

-- 
2.13.1



[PATCH 0/7] fujitsu-laptop: ACPI-related cleanups

2017-06-15 Thread Michał Kępień
In preparation for splitting fujitsu-laptop into two separate modules, I
prepared two more cleanup series to minimize the amount of code being
moved around.  This first series contains all patches that touch ACPI in
some way, which is why I am CCing linux-acpi as per Rafael's previous
advice.

This patch series was based on platform-driver-x86/for-next and tested
on a Lifebook S7020.

 drivers/platform/x86/Kconfig  |  10 --
 drivers/platform/x86/fujitsu-laptop.c | 187 +-
 2 files changed, 48 insertions(+), 149 deletions(-)

-- 
2.13.1



[PATCH 4/7] platform/x86: fujitsu-laptop: sanitize hotkey input device identification

2017-06-15 Thread Michał Kępień
In the case of brightness-related FUJ02B1 ACPI device, initializing the
input device associated with it identically as acpi-video initializes
its input device makes sense.  However, using the same data for the
input device associated with the FUJ02E3 ACPI device makes little sense,
because the latter has nothing to do with video and assigning an
arbitrary product ID to it is redundant.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 06d04500dac0..5c0dc2126313 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -555,13 +555,12 @@ static int acpi_fujitsu_laptop_input_setup(struct 
acpi_device *device)
if (!priv->input)
return -ENOMEM;
 
-   snprintf(priv->phys, sizeof(priv->phys), "%s/video/input0",
+   snprintf(priv->phys, sizeof(priv->phys), "%s/input0",
 acpi_device_hid(device));
 
priv->input->name = acpi_device_name(device);
priv->input->phys = priv->phys;
priv->input->id.bustype = BUS_HOST;
-   priv->input->id.product = 0x06;
 
dmi_check_system(fujitsu_laptop_dmi_table);
ret = sparse_keymap_setup(priv->input, keymap, NULL);
-- 
2.13.1



[PATCH 5/7] platform/x86: fujitsu-laptop: do not update ACPI device power status

2017-06-15 Thread Michał Kępień
Calling acpi_bus_update_power() for ACPI devices FUJ02B1 and FUJ02E3 is
pointless as they are not power manageable (neither _PS0 nor _PR0 is
defined for any of them), which causes their power state to be inherited
from their parent devices.  Given the ACPI paths of these two devices
(\_SB.PCI0.LPCB.FJEX, \_SB.FEXT), their parent devices are also not
power manageable.  These parent devices will thus have their power state
initialized to ACPI_STATE_D0, which in turn causes the power state for
both FUJ02B1 and FUJ02E3 to always be ACPI_STATE_D0 ("on").

Remove relevant acpi_bus_update_power() calls along with parts of debug
messages that they were supposed to have an effect on.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 5c0dc2126313..b9f3ede4d567 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -400,7 +400,6 @@ static int fujitsu_backlight_register(struct acpi_device 
*device)
 static int acpi_fujitsu_bl_add(struct acpi_device *device)
 {
struct fujitsu_bl *priv;
-   int state = 0;
int error;
 
if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
@@ -419,15 +418,8 @@ static int acpi_fujitsu_bl_add(struct acpi_device *device)
if (error)
return error;
 
-   error = acpi_bus_update_power(device->handle, );
-   if (error) {
-   pr_err("Error reading power state\n");
-   return error;
-   }
-
-   pr_info("ACPI: %s [%s] (%s)\n",
-  acpi_device_name(device), acpi_device_bid(device),
-  !device->power.state ? "on" : "off");
+   pr_info("ACPI: %s [%s]\n",
+   acpi_device_name(device), acpi_device_bid(device));
 
if (acpi_has_method(device->handle, METHOD_NAME__INI)) {
vdbg_printk(FUJLAPTOP_DBG_INFO, "Invoking _INI\n");
@@ -788,7 +780,6 @@ static int acpi_fujitsu_laptop_leds_register(struct 
acpi_device *device)
 static int acpi_fujitsu_laptop_add(struct acpi_device *device)
 {
struct fujitsu_laptop *priv;
-   int state = 0;
int error;
int i;
 
@@ -807,15 +798,8 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
if (error)
return error;
 
-   error = acpi_bus_update_power(device->handle, );
-   if (error) {
-   pr_err("Error reading power state\n");
-   return error;
-   }
-
-   pr_info("ACPI: %s [%s] (%s)\n",
-   acpi_device_name(device), acpi_device_bid(device),
-   !device->power.state ? "on" : "off");
+   pr_info("ACPI: %s [%s]\n",
+   acpi_device_name(device), acpi_device_bid(device));
 
if (acpi_has_method(device->handle, METHOD_NAME__INI)) {
vdbg_printk(FUJLAPTOP_DBG_INFO, "Invoking _INI\n");
-- 
2.13.1



[PATCH 5/7] platform/x86: fujitsu-laptop: do not update ACPI device power status

2017-06-15 Thread Michał Kępień
Calling acpi_bus_update_power() for ACPI devices FUJ02B1 and FUJ02E3 is
pointless as they are not power manageable (neither _PS0 nor _PR0 is
defined for any of them), which causes their power state to be inherited
from their parent devices.  Given the ACPI paths of these two devices
(\_SB.PCI0.LPCB.FJEX, \_SB.FEXT), their parent devices are also not
power manageable.  These parent devices will thus have their power state
initialized to ACPI_STATE_D0, which in turn causes the power state for
both FUJ02B1 and FUJ02E3 to always be ACPI_STATE_D0 ("on").

Remove relevant acpi_bus_update_power() calls along with parts of debug
messages that they were supposed to have an effect on.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 5c0dc2126313..b9f3ede4d567 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -400,7 +400,6 @@ static int fujitsu_backlight_register(struct acpi_device 
*device)
 static int acpi_fujitsu_bl_add(struct acpi_device *device)
 {
struct fujitsu_bl *priv;
-   int state = 0;
int error;
 
if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
@@ -419,15 +418,8 @@ static int acpi_fujitsu_bl_add(struct acpi_device *device)
if (error)
return error;
 
-   error = acpi_bus_update_power(device->handle, );
-   if (error) {
-   pr_err("Error reading power state\n");
-   return error;
-   }
-
-   pr_info("ACPI: %s [%s] (%s)\n",
-  acpi_device_name(device), acpi_device_bid(device),
-  !device->power.state ? "on" : "off");
+   pr_info("ACPI: %s [%s]\n",
+   acpi_device_name(device), acpi_device_bid(device));
 
if (acpi_has_method(device->handle, METHOD_NAME__INI)) {
vdbg_printk(FUJLAPTOP_DBG_INFO, "Invoking _INI\n");
@@ -788,7 +780,6 @@ static int acpi_fujitsu_laptop_leds_register(struct 
acpi_device *device)
 static int acpi_fujitsu_laptop_add(struct acpi_device *device)
 {
struct fujitsu_laptop *priv;
-   int state = 0;
int error;
int i;
 
@@ -807,15 +798,8 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
if (error)
return error;
 
-   error = acpi_bus_update_power(device->handle, );
-   if (error) {
-   pr_err("Error reading power state\n");
-   return error;
-   }
-
-   pr_info("ACPI: %s [%s] (%s)\n",
-   acpi_device_name(device), acpi_device_bid(device),
-   !device->power.state ? "on" : "off");
+   pr_info("ACPI: %s [%s]\n",
+   acpi_device_name(device), acpi_device_bid(device));
 
if (acpi_has_method(device->handle, METHOD_NAME__INI)) {
vdbg_printk(FUJLAPTOP_DBG_INFO, "Invoking _INI\n");
-- 
2.13.1



[PATCH 4/7] platform/x86: fujitsu-laptop: sanitize hotkey input device identification

2017-06-15 Thread Michał Kępień
In the case of brightness-related FUJ02B1 ACPI device, initializing the
input device associated with it identically as acpi-video initializes
its input device makes sense.  However, using the same data for the
input device associated with the FUJ02E3 ACPI device makes little sense,
because the latter has nothing to do with video and assigning an
arbitrary product ID to it is redundant.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 06d04500dac0..5c0dc2126313 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -555,13 +555,12 @@ static int acpi_fujitsu_laptop_input_setup(struct 
acpi_device *device)
if (!priv->input)
return -ENOMEM;
 
-   snprintf(priv->phys, sizeof(priv->phys), "%s/video/input0",
+   snprintf(priv->phys, sizeof(priv->phys), "%s/input0",
 acpi_device_hid(device));
 
priv->input->name = acpi_device_name(device);
priv->input->phys = priv->phys;
priv->input->id.bustype = BUS_HOST;
-   priv->input->id.product = 0x06;
 
dmi_check_system(fujitsu_laptop_dmi_table);
ret = sparse_keymap_setup(priv->input, keymap, NULL);
-- 
2.13.1



[PATCH 1/7] platform/x86: fujitsu-laptop: do not use kfifo for storing hotkey scancodes

2017-06-15 Thread Michał Kępień
All ACPI device notify callbacks are invoked using acpi_os_execute(),
which causes the supplied callback to be queued to a static workqueue
which always executes on CPU 0.  This means that there is no possibility
for any ACPI device notify callback to be concurrently executed on
multiple CPUs, which in the case of fujitsu-laptop means that using a
locked kfifo for handling hotkeys is redundant: as hotkey scancodes are
only pushed and popped from within acpi_fujitsu_laptop_notify(), no risk
of concurrent pushing and popping exists.

Instead of a kfifo, use a fixed-size integer table for storing pushed
scancodes and an associated variable holding the number of scancodes
currently stored in that table.  Change debug messages so that they no
longer contain the term "ringbuffer".

In order to simplify implementation, hotkey input device behavior is
slightly changed (from FIFO to LIFO).  The order of release events
generated by the input device should not matter from end user
perspective as upon releasing any hotkey the firmware only produces a
single event which means "all hotkeys were released".

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 54 +--
 1 file changed, 14 insertions(+), 40 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 1c6fdd952c75..54cb7ae541d4 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -58,7 +58,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -110,7 +109,6 @@
 #define KEY5_CODE  0x420
 
 #define MAX_HOTKEY_RINGBUFFER_SIZE 100
-#define RINGBUFFERSIZE 40
 
 /* Debugging */
 #define FUJLAPTOP_DBG_ERROR  0x0001
@@ -146,8 +144,8 @@ struct fujitsu_laptop {
struct input_dev *input;
char phys[32];
struct platform_device *pf_device;
-   struct kfifo fifo;
-   spinlock_t fifo_lock;
+   int scancode_buf[40];
+   int scancode_count;
int flags_supported;
int flags_state;
 };
@@ -813,23 +811,14 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
sprintf(acpi_device_class(device), "%s", ACPI_FUJITSU_CLASS);
device->driver_data = priv;
 
-   /* kfifo */
-   spin_lock_init(>fifo_lock);
-   error = kfifo_alloc(>fifo, RINGBUFFERSIZE * sizeof(int),
-   GFP_KERNEL);
-   if (error) {
-   pr_err("kfifo_alloc failed\n");
-   goto err_stop;
-   }
-
error = acpi_fujitsu_laptop_input_setup(device);
if (error)
-   goto err_free_fifo;
+   return error;
 
error = acpi_bus_update_power(device->handle, );
if (error) {
pr_err("Error reading power state\n");
-   goto err_free_fifo;
+   return error;
}
 
pr_info("ACPI: %s [%s] (%s)\n",
@@ -877,62 +866,47 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
 
error = acpi_fujitsu_laptop_leds_register(device);
if (error)
-   goto err_free_fifo;
+   return error;
 
error = fujitsu_laptop_platform_add(device);
if (error)
-   goto err_free_fifo;
+   return error;
 
return 0;
-
-err_free_fifo:
-   kfifo_free(>fifo);
-err_stop:
-   return error;
 }
 
 static int acpi_fujitsu_laptop_remove(struct acpi_device *device)
 {
-   struct fujitsu_laptop *priv = acpi_driver_data(device);
-
fujitsu_laptop_platform_remove(device);
 
-   kfifo_free(>fifo);
-
return 0;
 }
 
 static void acpi_fujitsu_laptop_press(struct acpi_device *device, int scancode)
 {
struct fujitsu_laptop *priv = acpi_driver_data(device);
-   int status;
 
-   status = kfifo_in_locked(>fifo, (unsigned char *),
-sizeof(scancode), >fifo_lock);
-   if (status != sizeof(scancode)) {
+   if (priv->scancode_count == ARRAY_SIZE(priv->scancode_buf)) {
vdbg_printk(FUJLAPTOP_DBG_WARN,
"Could not push scancode [0x%x]\n", scancode);
return;
}
+   priv->scancode_buf[priv->scancode_count++] = scancode;
sparse_keymap_report_event(priv->input, scancode, 1, false);
vdbg_printk(FUJLAPTOP_DBG_TRACE,
-   "Push scancode into ringbuffer [0x%x]\n", scancode);
+   "Push scancode into buffer [0x%x]\n", scancode);
 }
 
 static void acpi_fujitsu_laptop_release(struct acpi_device *device)
 {
struct fujitsu_laptop *priv = acpi_driver_data(device);
-   int scancode, status;
-
-   while (true) {
-   status = kfifo_out_locked(>fifo,
- (unsigned char *),
- sizeof(scancode), >fifo_lock);
-   if (status != sizeof(scancode))
- 

[PATCH 1/7] platform/x86: fujitsu-laptop: do not use kfifo for storing hotkey scancodes

2017-06-15 Thread Michał Kępień
All ACPI device notify callbacks are invoked using acpi_os_execute(),
which causes the supplied callback to be queued to a static workqueue
which always executes on CPU 0.  This means that there is no possibility
for any ACPI device notify callback to be concurrently executed on
multiple CPUs, which in the case of fujitsu-laptop means that using a
locked kfifo for handling hotkeys is redundant: as hotkey scancodes are
only pushed and popped from within acpi_fujitsu_laptop_notify(), no risk
of concurrent pushing and popping exists.

Instead of a kfifo, use a fixed-size integer table for storing pushed
scancodes and an associated variable holding the number of scancodes
currently stored in that table.  Change debug messages so that they no
longer contain the term "ringbuffer".

In order to simplify implementation, hotkey input device behavior is
slightly changed (from FIFO to LIFO).  The order of release events
generated by the input device should not matter from end user
perspective as upon releasing any hotkey the firmware only produces a
single event which means "all hotkeys were released".

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 54 +--
 1 file changed, 14 insertions(+), 40 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 1c6fdd952c75..54cb7ae541d4 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -58,7 +58,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -110,7 +109,6 @@
 #define KEY5_CODE  0x420
 
 #define MAX_HOTKEY_RINGBUFFER_SIZE 100
-#define RINGBUFFERSIZE 40
 
 /* Debugging */
 #define FUJLAPTOP_DBG_ERROR  0x0001
@@ -146,8 +144,8 @@ struct fujitsu_laptop {
struct input_dev *input;
char phys[32];
struct platform_device *pf_device;
-   struct kfifo fifo;
-   spinlock_t fifo_lock;
+   int scancode_buf[40];
+   int scancode_count;
int flags_supported;
int flags_state;
 };
@@ -813,23 +811,14 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
sprintf(acpi_device_class(device), "%s", ACPI_FUJITSU_CLASS);
device->driver_data = priv;
 
-   /* kfifo */
-   spin_lock_init(>fifo_lock);
-   error = kfifo_alloc(>fifo, RINGBUFFERSIZE * sizeof(int),
-   GFP_KERNEL);
-   if (error) {
-   pr_err("kfifo_alloc failed\n");
-   goto err_stop;
-   }
-
error = acpi_fujitsu_laptop_input_setup(device);
if (error)
-   goto err_free_fifo;
+   return error;
 
error = acpi_bus_update_power(device->handle, );
if (error) {
pr_err("Error reading power state\n");
-   goto err_free_fifo;
+   return error;
}
 
pr_info("ACPI: %s [%s] (%s)\n",
@@ -877,62 +866,47 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
 
error = acpi_fujitsu_laptop_leds_register(device);
if (error)
-   goto err_free_fifo;
+   return error;
 
error = fujitsu_laptop_platform_add(device);
if (error)
-   goto err_free_fifo;
+   return error;
 
return 0;
-
-err_free_fifo:
-   kfifo_free(>fifo);
-err_stop:
-   return error;
 }
 
 static int acpi_fujitsu_laptop_remove(struct acpi_device *device)
 {
-   struct fujitsu_laptop *priv = acpi_driver_data(device);
-
fujitsu_laptop_platform_remove(device);
 
-   kfifo_free(>fifo);
-
return 0;
 }
 
 static void acpi_fujitsu_laptop_press(struct acpi_device *device, int scancode)
 {
struct fujitsu_laptop *priv = acpi_driver_data(device);
-   int status;
 
-   status = kfifo_in_locked(>fifo, (unsigned char *),
-sizeof(scancode), >fifo_lock);
-   if (status != sizeof(scancode)) {
+   if (priv->scancode_count == ARRAY_SIZE(priv->scancode_buf)) {
vdbg_printk(FUJLAPTOP_DBG_WARN,
"Could not push scancode [0x%x]\n", scancode);
return;
}
+   priv->scancode_buf[priv->scancode_count++] = scancode;
sparse_keymap_report_event(priv->input, scancode, 1, false);
vdbg_printk(FUJLAPTOP_DBG_TRACE,
-   "Push scancode into ringbuffer [0x%x]\n", scancode);
+   "Push scancode into buffer [0x%x]\n", scancode);
 }
 
 static void acpi_fujitsu_laptop_release(struct acpi_device *device)
 {
struct fujitsu_laptop *priv = acpi_driver_data(device);
-   int scancode, status;
-
-   while (true) {
-   status = kfifo_out_locked(>fifo,
- (unsigned char *),
- sizeof(scancode), >fifo_lock);
-   if (status != sizeof(scancode))
-   return;

[PATCH 2/7] platform/x86: fujitsu-laptop: remove redundant safety checks

2017-06-15 Thread Michał Kępień
Do not check whether the pointer passed to ACPI add callbacks is NULL as
it is earlier dereferenced anyway in the bus-level probe callback,
acpi_device_probe().

Do not check the value of acpi_disabled in fujitsu_init(), because it is
already done by acpi_bus_register_driver(), which is the first function
called by fujitsu_init().

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 54cb7ae541d4..c64d5305ff37 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -406,9 +406,6 @@ static int acpi_fujitsu_bl_add(struct acpi_device *device)
if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
return -ENODEV;
 
-   if (!device)
-   return -EINVAL;
-
priv = devm_kzalloc(>dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
@@ -796,9 +793,6 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
int error;
int i;
 
-   if (!device)
-   return -EINVAL;
-
priv = devm_kzalloc(>dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
@@ -993,9 +987,6 @@ static int __init fujitsu_init(void)
 {
int ret;
 
-   if (acpi_disabled)
-   return -ENODEV;
-
ret = acpi_bus_register_driver(_fujitsu_bl_driver);
if (ret)
return ret;
-- 
2.13.1



[PATCH 2/7] platform/x86: fujitsu-laptop: remove redundant safety checks

2017-06-15 Thread Michał Kępień
Do not check whether the pointer passed to ACPI add callbacks is NULL as
it is earlier dereferenced anyway in the bus-level probe callback,
acpi_device_probe().

Do not check the value of acpi_disabled in fujitsu_init(), because it is
already done by acpi_bus_register_driver(), which is the first function
called by fujitsu_init().

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 54cb7ae541d4..c64d5305ff37 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -406,9 +406,6 @@ static int acpi_fujitsu_bl_add(struct acpi_device *device)
if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
return -ENODEV;
 
-   if (!device)
-   return -EINVAL;
-
priv = devm_kzalloc(>dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
@@ -796,9 +793,6 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
int error;
int i;
 
-   if (!device)
-   return -EINVAL;
-
priv = devm_kzalloc(>dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
@@ -993,9 +987,6 @@ static int __init fujitsu_init(void)
 {
int ret;
 
-   if (acpi_disabled)
-   return -ENODEV;
-
ret = acpi_bus_register_driver(_fujitsu_bl_driver);
if (ret)
return ret;
-- 
2.13.1



[PATCH 6/7] platform/x86: fujitsu-laptop: do not evaluate ACPI _INI methods

2017-06-15 Thread Michał Kępień
acpi_ns_initialize_devices(), which is called during system-wide ACPI
initialization, already detects and calls all _INI methods belonging to
objects present in ACPI tables.  There is no need to call these methods
again every time the module is loaded because they only initialize
status flags and hotkey-related variables; status flags are effectively
constants, hotkey-related variables may be assigned non-zero values
before acpi_fujitsu_laptop_add() is called, but that does not really
matter as we drain the scancodes queued in the firmware's ring buffer
before doing anything else.

Remove sections of code which invoke and check evaluation status of the
_INI methods belonging to the ACPI devices handled by the driver.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 16 
 1 file changed, 16 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index b9f3ede4d567..0861be36305d 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -421,14 +421,6 @@ static int acpi_fujitsu_bl_add(struct acpi_device *device)
pr_info("ACPI: %s [%s]\n",
acpi_device_name(device), acpi_device_bid(device));
 
-   if (acpi_has_method(device->handle, METHOD_NAME__INI)) {
-   vdbg_printk(FUJLAPTOP_DBG_INFO, "Invoking _INI\n");
-   if (ACPI_FAILURE
-   (acpi_evaluate_object
-(device->handle, METHOD_NAME__INI, NULL, NULL)))
-   pr_err("_INI Method failed\n");
-   }
-
if (get_max_brightness(device) <= 0)
priv->max_brightness = FUJITSU_LCD_N_LEVELS;
get_lcd_level(device);
@@ -801,14 +793,6 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
pr_info("ACPI: %s [%s]\n",
acpi_device_name(device), acpi_device_bid(device));
 
-   if (acpi_has_method(device->handle, METHOD_NAME__INI)) {
-   vdbg_printk(FUJLAPTOP_DBG_INFO, "Invoking _INI\n");
-   if (ACPI_FAILURE
-   (acpi_evaluate_object
-(device->handle, METHOD_NAME__INI, NULL, NULL)))
-   pr_err("_INI Method failed\n");
-   }
-
i = 0;
while (call_fext_func(device, FUNC_BUTTONS, 0x1, 0x0, 0x0) != 0
&& (i++) < MAX_HOTKEY_RINGBUFFER_SIZE)
-- 
2.13.1



[PATCH 7/7] platform/x86: fujitsu-laptop: rework debugging

2017-06-15 Thread Michał Kępień
Using a dedicated Kconfig option for enabling debugging means the user
may be forced to recompile their kernel in order to gather debugging
information, which is inconvenient.  Replace custom debugging
infrastructure with standard logging functions, taking advantage of
dynamic debug.  Replace a pr_info() call inside an ACPI callback with an
acpi_handle_info() call.

The following mapping was used:

  - FUJLAPTOP_DBG_ERROR -> acpi_handle_err()
  - FUJLAPTOP_DBG_WARN  -> acpi_handle_info() / dev_info()
  - FUJLAPTOP_DBG_INFO  -> acpi_handle_debug()
  - FUJLAPTOP_DBG_TRACE -> acpi_handle_debug() / dev_dbg()

This means that some events which used to only be logged when the user
explicitly requested it will now be logged by default:

  - ACPI method evaluation errors,
  - unknown ACPI notification codes,
  - unknown hotkey scancodes.

The first type of events should happen rarely, if ever at all.  The rest
is interesting from driver development perspective as their presence in
the logs will mean the driver is unaware of certain events, handling of
which should be implemented.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/Kconfig  | 10 -
 drivers/platform/x86/fujitsu-laptop.c | 78 +--
 2 files changed, 28 insertions(+), 60 deletions(-)

diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 49a1d012f025..980419081ed7 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -195,16 +195,6 @@ config FUJITSU_LAPTOP
 
  If you have a Fujitsu laptop, say Y or M here.
 
-config FUJITSU_LAPTOP_DEBUG
-   bool "Verbose debug mode for Fujitsu Laptop Extras"
-   depends on FUJITSU_LAPTOP
-   default n
-   ---help---
- Enables extra debug output from the fujitsu extras driver, at the
- expense of a slight increase in driver size.
-
- If you are not sure, say N here.
-
 config FUJITSU_TABLET
tristate "Fujitsu Tablet Extras"
depends on ACPI
diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 0861be36305d..d50872d96e63 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -110,22 +110,6 @@
 
 #define MAX_HOTKEY_RINGBUFFER_SIZE 100
 
-/* Debugging */
-#define FUJLAPTOP_DBG_ERROR  0x0001
-#define FUJLAPTOP_DBG_WARN   0x0002
-#define FUJLAPTOP_DBG_INFO   0x0004
-#define FUJLAPTOP_DBG_TRACE  0x0008
-
-#ifdef CONFIG_FUJITSU_LAPTOP_DEBUG
-#define vdbg_printk(a_dbg_level, format, arg...) \
-   do { if (dbg_level & a_dbg_level) \
-   printk(KERN_DEBUG pr_fmt("%s: " format), __func__, ## arg); \
-   } while (0)
-#else
-#define vdbg_printk(a_dbg_level, format, arg...) \
-   do { } while (0)
-#endif
-
 /* Device controlling the backlight and associated keys */
 struct fujitsu_bl {
struct input_dev *input;
@@ -152,10 +136,6 @@ struct fujitsu_laptop {
 
 static struct acpi_device *fext;
 
-#ifdef CONFIG_FUJITSU_LAPTOP_DEBUG
-static u32 dbg_level = 0x03;
-#endif
-
 /* Fujitsu ACPI interface function */
 
 static int call_fext_func(struct acpi_device *device,
@@ -174,12 +154,13 @@ static int call_fext_func(struct acpi_device *device,
status = acpi_evaluate_integer(device->handle, "FUNC", _list,
   );
if (ACPI_FAILURE(status)) {
-   vdbg_printk(FUJLAPTOP_DBG_ERROR, "Failed to evaluate FUNC\n");
+   acpi_handle_err(device->handle, "Failed to evaluate FUNC\n");
return -ENODEV;
}
 
-   vdbg_printk(FUJLAPTOP_DBG_TRACE, "FUNC 0x%x (args 0x%x, 0x%x, 0x%x) 
returned 0x%x\n",
-   func, op, feature, state, (int)value);
+   acpi_handle_debug(device->handle,
+ "FUNC 0x%x (args 0x%x, 0x%x, 0x%x) returned 0x%x\n",
+ func, op, feature, state, (int)value);
return value;
 }
 
@@ -206,16 +187,16 @@ static int set_lcd_level(struct acpi_device *device, int 
level)
break;
}
 
-   vdbg_printk(FUJLAPTOP_DBG_TRACE, "set lcd level via %s [%d]\n",
-   method, level);
+   acpi_handle_debug(device->handle, "set lcd level via %s [%d]\n", method,
+ level);
 
if (level < 0 || level >= priv->max_brightness)
return -EINVAL;
 
status = acpi_execute_simple_method(device->handle, method, level);
if (ACPI_FAILURE(status)) {
-   vdbg_printk(FUJLAPTOP_DBG_ERROR, "Failed to evaluate %s\n",
-   method);
+   acpi_handle_err(device->handle, "Failed to evaluate %s\n",
+   method);
return -ENODEV;
}
 
@@ -230,7 +211,7 @@ static int get_lcd_level(struct acpi_device *device)
unsigned long long state = 0;
acpi_status status = AE_OK;
 
-   

[PATCH 6/7] platform/x86: fujitsu-laptop: do not evaluate ACPI _INI methods

2017-06-15 Thread Michał Kępień
acpi_ns_initialize_devices(), which is called during system-wide ACPI
initialization, already detects and calls all _INI methods belonging to
objects present in ACPI tables.  There is no need to call these methods
again every time the module is loaded because they only initialize
status flags and hotkey-related variables; status flags are effectively
constants, hotkey-related variables may be assigned non-zero values
before acpi_fujitsu_laptop_add() is called, but that does not really
matter as we drain the scancodes queued in the firmware's ring buffer
before doing anything else.

Remove sections of code which invoke and check evaluation status of the
_INI methods belonging to the ACPI devices handled by the driver.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 16 
 1 file changed, 16 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index b9f3ede4d567..0861be36305d 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -421,14 +421,6 @@ static int acpi_fujitsu_bl_add(struct acpi_device *device)
pr_info("ACPI: %s [%s]\n",
acpi_device_name(device), acpi_device_bid(device));
 
-   if (acpi_has_method(device->handle, METHOD_NAME__INI)) {
-   vdbg_printk(FUJLAPTOP_DBG_INFO, "Invoking _INI\n");
-   if (ACPI_FAILURE
-   (acpi_evaluate_object
-(device->handle, METHOD_NAME__INI, NULL, NULL)))
-   pr_err("_INI Method failed\n");
-   }
-
if (get_max_brightness(device) <= 0)
priv->max_brightness = FUJITSU_LCD_N_LEVELS;
get_lcd_level(device);
@@ -801,14 +793,6 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
pr_info("ACPI: %s [%s]\n",
acpi_device_name(device), acpi_device_bid(device));
 
-   if (acpi_has_method(device->handle, METHOD_NAME__INI)) {
-   vdbg_printk(FUJLAPTOP_DBG_INFO, "Invoking _INI\n");
-   if (ACPI_FAILURE
-   (acpi_evaluate_object
-(device->handle, METHOD_NAME__INI, NULL, NULL)))
-   pr_err("_INI Method failed\n");
-   }
-
i = 0;
while (call_fext_func(device, FUNC_BUTTONS, 0x1, 0x0, 0x0) != 0
&& (i++) < MAX_HOTKEY_RINGBUFFER_SIZE)
-- 
2.13.1



[PATCH 7/7] platform/x86: fujitsu-laptop: rework debugging

2017-06-15 Thread Michał Kępień
Using a dedicated Kconfig option for enabling debugging means the user
may be forced to recompile their kernel in order to gather debugging
information, which is inconvenient.  Replace custom debugging
infrastructure with standard logging functions, taking advantage of
dynamic debug.  Replace a pr_info() call inside an ACPI callback with an
acpi_handle_info() call.

The following mapping was used:

  - FUJLAPTOP_DBG_ERROR -> acpi_handle_err()
  - FUJLAPTOP_DBG_WARN  -> acpi_handle_info() / dev_info()
  - FUJLAPTOP_DBG_INFO  -> acpi_handle_debug()
  - FUJLAPTOP_DBG_TRACE -> acpi_handle_debug() / dev_dbg()

This means that some events which used to only be logged when the user
explicitly requested it will now be logged by default:

  - ACPI method evaluation errors,
  - unknown ACPI notification codes,
  - unknown hotkey scancodes.

The first type of events should happen rarely, if ever at all.  The rest
is interesting from driver development perspective as their presence in
the logs will mean the driver is unaware of certain events, handling of
which should be implemented.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/Kconfig  | 10 -
 drivers/platform/x86/fujitsu-laptop.c | 78 +--
 2 files changed, 28 insertions(+), 60 deletions(-)

diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 49a1d012f025..980419081ed7 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -195,16 +195,6 @@ config FUJITSU_LAPTOP
 
  If you have a Fujitsu laptop, say Y or M here.
 
-config FUJITSU_LAPTOP_DEBUG
-   bool "Verbose debug mode for Fujitsu Laptop Extras"
-   depends on FUJITSU_LAPTOP
-   default n
-   ---help---
- Enables extra debug output from the fujitsu extras driver, at the
- expense of a slight increase in driver size.
-
- If you are not sure, say N here.
-
 config FUJITSU_TABLET
tristate "Fujitsu Tablet Extras"
depends on ACPI
diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index 0861be36305d..d50872d96e63 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -110,22 +110,6 @@
 
 #define MAX_HOTKEY_RINGBUFFER_SIZE 100
 
-/* Debugging */
-#define FUJLAPTOP_DBG_ERROR  0x0001
-#define FUJLAPTOP_DBG_WARN   0x0002
-#define FUJLAPTOP_DBG_INFO   0x0004
-#define FUJLAPTOP_DBG_TRACE  0x0008
-
-#ifdef CONFIG_FUJITSU_LAPTOP_DEBUG
-#define vdbg_printk(a_dbg_level, format, arg...) \
-   do { if (dbg_level & a_dbg_level) \
-   printk(KERN_DEBUG pr_fmt("%s: " format), __func__, ## arg); \
-   } while (0)
-#else
-#define vdbg_printk(a_dbg_level, format, arg...) \
-   do { } while (0)
-#endif
-
 /* Device controlling the backlight and associated keys */
 struct fujitsu_bl {
struct input_dev *input;
@@ -152,10 +136,6 @@ struct fujitsu_laptop {
 
 static struct acpi_device *fext;
 
-#ifdef CONFIG_FUJITSU_LAPTOP_DEBUG
-static u32 dbg_level = 0x03;
-#endif
-
 /* Fujitsu ACPI interface function */
 
 static int call_fext_func(struct acpi_device *device,
@@ -174,12 +154,13 @@ static int call_fext_func(struct acpi_device *device,
status = acpi_evaluate_integer(device->handle, "FUNC", _list,
   );
if (ACPI_FAILURE(status)) {
-   vdbg_printk(FUJLAPTOP_DBG_ERROR, "Failed to evaluate FUNC\n");
+   acpi_handle_err(device->handle, "Failed to evaluate FUNC\n");
return -ENODEV;
}
 
-   vdbg_printk(FUJLAPTOP_DBG_TRACE, "FUNC 0x%x (args 0x%x, 0x%x, 0x%x) 
returned 0x%x\n",
-   func, op, feature, state, (int)value);
+   acpi_handle_debug(device->handle,
+ "FUNC 0x%x (args 0x%x, 0x%x, 0x%x) returned 0x%x\n",
+ func, op, feature, state, (int)value);
return value;
 }
 
@@ -206,16 +187,16 @@ static int set_lcd_level(struct acpi_device *device, int 
level)
break;
}
 
-   vdbg_printk(FUJLAPTOP_DBG_TRACE, "set lcd level via %s [%d]\n",
-   method, level);
+   acpi_handle_debug(device->handle, "set lcd level via %s [%d]\n", method,
+ level);
 
if (level < 0 || level >= priv->max_brightness)
return -EINVAL;
 
status = acpi_execute_simple_method(device->handle, method, level);
if (ACPI_FAILURE(status)) {
-   vdbg_printk(FUJLAPTOP_DBG_ERROR, "Failed to evaluate %s\n",
-   method);
+   acpi_handle_err(device->handle, "Failed to evaluate %s\n",
+   method);
return -ENODEV;
}
 
@@ -230,7 +211,7 @@ static int get_lcd_level(struct acpi_device *device)
unsigned long long state = 0;
acpi_status status = AE_OK;
 
-   vdbg_printk(FUJLAPTOP_DBG_TRACE, "get lcd 

[PATCH 3/7] platform/x86: fujitsu-laptop: use strcpy to set ACPI device names and classes

2017-06-15 Thread Michał Kępień
No formatting is needed when setting ACPI device name and class, so
switch to using strcpy() for this purpose.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index c64d5305ff37..06d04500dac0 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -411,8 +411,8 @@ static int acpi_fujitsu_bl_add(struct acpi_device *device)
return -ENOMEM;
 
fujitsu_bl = priv;
-   sprintf(acpi_device_name(device), "%s", ACPI_FUJITSU_BL_DEVICE_NAME);
-   sprintf(acpi_device_class(device), "%s", ACPI_FUJITSU_CLASS);
+   strcpy(acpi_device_name(device), ACPI_FUJITSU_BL_DEVICE_NAME);
+   strcpy(acpi_device_class(device), ACPI_FUJITSU_CLASS);
device->driver_data = priv;
 
error = acpi_fujitsu_bl_input_setup(device);
@@ -800,9 +800,8 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
WARN_ONCE(fext, "More than one FUJ02E3 ACPI device was found.  Driver 
may not work as intended.");
fext = device;
 
-   sprintf(acpi_device_name(device), "%s",
-   ACPI_FUJITSU_LAPTOP_DEVICE_NAME);
-   sprintf(acpi_device_class(device), "%s", ACPI_FUJITSU_CLASS);
+   strcpy(acpi_device_name(device), ACPI_FUJITSU_LAPTOP_DEVICE_NAME);
+   strcpy(acpi_device_class(device), ACPI_FUJITSU_CLASS);
device->driver_data = priv;
 
error = acpi_fujitsu_laptop_input_setup(device);
-- 
2.13.1



[PATCH 3/7] platform/x86: fujitsu-laptop: use strcpy to set ACPI device names and classes

2017-06-15 Thread Michał Kępień
No formatting is needed when setting ACPI device name and class, so
switch to using strcpy() for this purpose.

Signed-off-by: Michał Kępień 
---
 drivers/platform/x86/fujitsu-laptop.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/platform/x86/fujitsu-laptop.c 
b/drivers/platform/x86/fujitsu-laptop.c
index c64d5305ff37..06d04500dac0 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -411,8 +411,8 @@ static int acpi_fujitsu_bl_add(struct acpi_device *device)
return -ENOMEM;
 
fujitsu_bl = priv;
-   sprintf(acpi_device_name(device), "%s", ACPI_FUJITSU_BL_DEVICE_NAME);
-   sprintf(acpi_device_class(device), "%s", ACPI_FUJITSU_CLASS);
+   strcpy(acpi_device_name(device), ACPI_FUJITSU_BL_DEVICE_NAME);
+   strcpy(acpi_device_class(device), ACPI_FUJITSU_CLASS);
device->driver_data = priv;
 
error = acpi_fujitsu_bl_input_setup(device);
@@ -800,9 +800,8 @@ static int acpi_fujitsu_laptop_add(struct acpi_device 
*device)
WARN_ONCE(fext, "More than one FUJ02E3 ACPI device was found.  Driver 
may not work as intended.");
fext = device;
 
-   sprintf(acpi_device_name(device), "%s",
-   ACPI_FUJITSU_LAPTOP_DEVICE_NAME);
-   sprintf(acpi_device_class(device), "%s", ACPI_FUJITSU_CLASS);
+   strcpy(acpi_device_name(device), ACPI_FUJITSU_LAPTOP_DEVICE_NAME);
+   strcpy(acpi_device_class(device), ACPI_FUJITSU_CLASS);
device->driver_data = priv;
 
error = acpi_fujitsu_laptop_input_setup(device);
-- 
2.13.1



Re: xgetbv nondeterminism

2017-06-15 Thread H.J. Lu
On Thu, Jun 15, 2017 at 8:05 PM, Andy Lutomirski  wrote:
> On Thu, Jun 15, 2017 at 7:17 PM, H.J. Lu  wrote:
>> On Thu, Jun 15, 2017 at 4:28 PM, Andy Lutomirski  wrote:
>>> On Thu, Jun 15, 2017 at 4:11 PM, H.J. Lu  wrote:
 On Thu, Jun 15, 2017 at 3:45 PM, Andy Lutomirski  wrote:
> On Thu, Jun 15, 2017 at 3:40 PM, H.J. Lu  wrote:
>> On Thu, Jun 15, 2017 at 3:18 PM, Andy Lutomirski  wrote:
>>> On Thu, Jun 15, 2017 at 7:33 AM, Dave Hansen  
>>> wrote:
 On 06/14/2017 10:18 PM, Andy Lutomirski wrote:
> Dave, why is XINUSE exposed at all to userspace?

 You need it for XSAVEOPT when it is using the init optimization to be
 able to tell which state was written and which state in the XSAVE 
 buffer
 is potentially stale with respect to what's in the registers.  I guess
 you can just use XSAVE instead of XSAVEOPT, though.

 As you pointed out, if you are using XSAVEC's compaction features by
 leaving bits unset in the requested feature bitmap registers, you have
 no idea how much data XSAVEC will write, unless you read XINUSE with
 XGETBV.  But, you can get around *that* by just presizing the XSAVE
 buffer to be big.
>>>
>>> I imagine that, if you're going to save, do something quick, and
>>> restore, you'd be better off allocating a big buffer rather than
>>> trying to find the smallest buffer you can get away with by reading
>>> XINUSE.  Also, what happens if XINUSE nondeterministically changes out
>>> from under you before you do XSAVEC?  I assume you can avoid this
>>> becoming a problem by using RFBM carefully.
>>>

 So, I guess that leaves its use to just figuring out how much XSAVEOPT
 (and friends) are going to write.

> To be fair, glibc uses this new XGETBV feature, but I suspect its
> usage is rather dubious.  Shouldn't it just do XSAVEC directly rather
> than rolling its own code?

 A quick grep through my glibc source only shows XGETBV(0) used which
 reads XCR0.  I don't see any XGETBV(1) which reads XINUSE.  Did I miss 
 it.
>>>
>>> Take a look at sysdeps/x86_64/dl-trampoline.h in a new enough version.
>>
>> I wrote a test to compare latency against different approaches.   This
>> is on Skylake:
>>
>> [hjl@gnu-skl-1 glibc-test]$ make
>> ./test
>> move: 47212
>> fxsave  : 719440
>> xsave   : 925146
>> xsavec  : 811036
>> xsave_state_size: 1088
>> xsave_state_comp_size: 896
>>
>> load/store is about 17X faster than xsavec.
>>
>> I put my hjl/pr21265/xsavec branch at
>>
>> https://sourceware.org/git/?p=glibc.git;a=summary
>>
>> It uses xsave/xsave/xsavec in _dl_runtime_resolve.
>
> What is this used for?  Is it just to avoid clobbering argument regs
> when resolving a symbol that uses an ifunc, or is there more to it?

 It is used for lazy binding the first time when an external function is 
 called.

>>>
>>> Maybe I'm just being dense, but why?  What does ld.so need to do to
>>> resolve a symbol and update the GOT that requires using extended
>>> state?
>>
>> Since the first 8 vector registers are used to pass function parameters
>> and ld.so uses vector registers, _dl_runtime_resolve needs to preserve
>> the first 8 vector registers when transferring control to ld.so.
>>
>
> Wouldn't it be faster and more future-proof to recompile the relevant
> parts of ld.so to avoid using extended state?
>

Are you suggesting not to use vector in ld.so?  We used to do that
several years ago, which leads to some subtle bugs, like

https://sourceware.org/bugzilla/show_bug.cgi?id=15128

Also x86-64 was the only target which used FOREIGN_CALL macros
in ld.so,  FOREIGN_CALL macros were the cause of race condition
in ld.so:

https://sourceware.org/bugzilla/show_bug.cgi?id=11214



-- 
H.J.


Re: xgetbv nondeterminism

2017-06-15 Thread H.J. Lu
On Thu, Jun 15, 2017 at 8:05 PM, Andy Lutomirski  wrote:
> On Thu, Jun 15, 2017 at 7:17 PM, H.J. Lu  wrote:
>> On Thu, Jun 15, 2017 at 4:28 PM, Andy Lutomirski  wrote:
>>> On Thu, Jun 15, 2017 at 4:11 PM, H.J. Lu  wrote:
 On Thu, Jun 15, 2017 at 3:45 PM, Andy Lutomirski  wrote:
> On Thu, Jun 15, 2017 at 3:40 PM, H.J. Lu  wrote:
>> On Thu, Jun 15, 2017 at 3:18 PM, Andy Lutomirski  wrote:
>>> On Thu, Jun 15, 2017 at 7:33 AM, Dave Hansen  
>>> wrote:
 On 06/14/2017 10:18 PM, Andy Lutomirski wrote:
> Dave, why is XINUSE exposed at all to userspace?

 You need it for XSAVEOPT when it is using the init optimization to be
 able to tell which state was written and which state in the XSAVE 
 buffer
 is potentially stale with respect to what's in the registers.  I guess
 you can just use XSAVE instead of XSAVEOPT, though.

 As you pointed out, if you are using XSAVEC's compaction features by
 leaving bits unset in the requested feature bitmap registers, you have
 no idea how much data XSAVEC will write, unless you read XINUSE with
 XGETBV.  But, you can get around *that* by just presizing the XSAVE
 buffer to be big.
>>>
>>> I imagine that, if you're going to save, do something quick, and
>>> restore, you'd be better off allocating a big buffer rather than
>>> trying to find the smallest buffer you can get away with by reading
>>> XINUSE.  Also, what happens if XINUSE nondeterministically changes out
>>> from under you before you do XSAVEC?  I assume you can avoid this
>>> becoming a problem by using RFBM carefully.
>>>

 So, I guess that leaves its use to just figuring out how much XSAVEOPT
 (and friends) are going to write.

> To be fair, glibc uses this new XGETBV feature, but I suspect its
> usage is rather dubious.  Shouldn't it just do XSAVEC directly rather
> than rolling its own code?

 A quick grep through my glibc source only shows XGETBV(0) used which
 reads XCR0.  I don't see any XGETBV(1) which reads XINUSE.  Did I miss 
 it.
>>>
>>> Take a look at sysdeps/x86_64/dl-trampoline.h in a new enough version.
>>
>> I wrote a test to compare latency against different approaches.   This
>> is on Skylake:
>>
>> [hjl@gnu-skl-1 glibc-test]$ make
>> ./test
>> move: 47212
>> fxsave  : 719440
>> xsave   : 925146
>> xsavec  : 811036
>> xsave_state_size: 1088
>> xsave_state_comp_size: 896
>>
>> load/store is about 17X faster than xsavec.
>>
>> I put my hjl/pr21265/xsavec branch at
>>
>> https://sourceware.org/git/?p=glibc.git;a=summary
>>
>> It uses xsave/xsave/xsavec in _dl_runtime_resolve.
>
> What is this used for?  Is it just to avoid clobbering argument regs
> when resolving a symbol that uses an ifunc, or is there more to it?

 It is used for lazy binding the first time when an external function is 
 called.

>>>
>>> Maybe I'm just being dense, but why?  What does ld.so need to do to
>>> resolve a symbol and update the GOT that requires using extended
>>> state?
>>
>> Since the first 8 vector registers are used to pass function parameters
>> and ld.so uses vector registers, _dl_runtime_resolve needs to preserve
>> the first 8 vector registers when transferring control to ld.so.
>>
>
> Wouldn't it be faster and more future-proof to recompile the relevant
> parts of ld.so to avoid using extended state?
>

Are you suggesting not to use vector in ld.so?  We used to do that
several years ago, which leads to some subtle bugs, like

https://sourceware.org/bugzilla/show_bug.cgi?id=15128

Also x86-64 was the only target which used FOREIGN_CALL macros
in ld.so,  FOREIGN_CALL macros were the cause of race condition
in ld.so:

https://sourceware.org/bugzilla/show_bug.cgi?id=11214



-- 
H.J.


Re: LTS testing with latest kselftests - some failures

2017-06-15 Thread Sumit Semwal
Hi Alexander,

On 16 June 2017 at 04:35, Alexander Alemayhu  wrote:
> On Thu, Jun 15, 2017 at 11:26:53PM +0530, Sumit Semwal wrote:
>>
>> 4. bpf tests: These seem to have build failures in mainline as well -
>> I also tried to build kselftest-next, but a simple 'make -C
>> tools/testing/selftests/bpf' seems to error out. Are there any special
>> instructions to build these? [I tried x86_64, arm64 cross-compile on x86_64]
>>
> Do you have the full failure output? If you haven't already you might
> also want to run 'make headers_install' in the top level directory.

make headers_install was missing, but it still didn't improve the
build - here's the pastebin: https://paste.debian.net/971652/

>
> --
> Mit freundlichen Grüßen
>
> Alexander Alemayhu

Best,
Sumit.


Re: LTS testing with latest kselftests - some failures

2017-06-15 Thread Sumit Semwal
Hi Alexander,

On 16 June 2017 at 04:35, Alexander Alemayhu  wrote:
> On Thu, Jun 15, 2017 at 11:26:53PM +0530, Sumit Semwal wrote:
>>
>> 4. bpf tests: These seem to have build failures in mainline as well -
>> I also tried to build kselftest-next, but a simple 'make -C
>> tools/testing/selftests/bpf' seems to error out. Are there any special
>> instructions to build these? [I tried x86_64, arm64 cross-compile on x86_64]
>>
> Do you have the full failure output? If you haven't already you might
> also want to run 'make headers_install' in the top level directory.

make headers_install was missing, but it still didn't improve the
build - here's the pastebin: https://paste.debian.net/971652/

>
> --
> Mit freundlichen Grüßen
>
> Alexander Alemayhu

Best,
Sumit.


Re: [PATCH v2] perf: libdw support for powerpc [ping]

2017-06-15 Thread Ravi Bangoria
Works like a charm with Milian's patch.

Acked-by: Ravi Bangoria 

Note:
I still see very minor differences between libunwind and libdw. Also, second 
last
function gets repeated two times in every callchain but it can be fixed later 
on.
Otherwise all looks good!

Thanks,
-Ravi

On Thursday 15 June 2017 04:46 PM, Mark Wielaard wrote:
> On Thu, 2017-06-15 at 10:46 +0200, Milian Wolff wrote:
>> Just a quick question: Have you guys applied my recent patch:
>>
>> commit 5ea0416f51cc93436bbe497c62ab49fd9cb245b6
>> Author: Milian Wolff 
>> Date:   Thu Jun 1 23:00:21 2017 +0200
>>
>> perf report: Include partial stacks unwound with libdw
>> 
>> So far the whole stack was thrown away when any error occurred before
>> the maximum stack depth was unwound. This is actually a very common
>> scenario though. The stacks that got unwound so far are still
>> interesting. This removes a large chunk of differences when comparing
>> perf script output for libunwind and libdw perf unwinding.
>>
>> If not, then this could explain the issue you are seeing.
> Thanks! No, I didn't have that patch (*) yet. It makes a huge
> difference. With that, Paolo's patch and the elfutils libdw powerpc64
> fallback unwinder patch, it looks like I get user stack traces for
> everything now on ppc64le.
>
> Cheers,
>
> Mark
>
> (*) It just this one-liner, but what a difference that makes:
>
> --- a/tools/perf/util/unwind-libdw.c
> +++ b/tools/perf/util/unwind-libdw.c
> @@ -224,7 +224,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
>  
> err = dwfl_getthread_frames(ui->dwfl, thread->tid, frame_callback, 
> ui);
>  
> -   if (err && !ui->max_stack)
> +   if (err && ui->max_stack != max_stack)
> err = 0;
>  
> /*
>



Re: [PATCH v2] perf: libdw support for powerpc [ping]

2017-06-15 Thread Ravi Bangoria
Works like a charm with Milian's patch.

Acked-by: Ravi Bangoria 

Note:
I still see very minor differences between libunwind and libdw. Also, second 
last
function gets repeated two times in every callchain but it can be fixed later 
on.
Otherwise all looks good!

Thanks,
-Ravi

On Thursday 15 June 2017 04:46 PM, Mark Wielaard wrote:
> On Thu, 2017-06-15 at 10:46 +0200, Milian Wolff wrote:
>> Just a quick question: Have you guys applied my recent patch:
>>
>> commit 5ea0416f51cc93436bbe497c62ab49fd9cb245b6
>> Author: Milian Wolff 
>> Date:   Thu Jun 1 23:00:21 2017 +0200
>>
>> perf report: Include partial stacks unwound with libdw
>> 
>> So far the whole stack was thrown away when any error occurred before
>> the maximum stack depth was unwound. This is actually a very common
>> scenario though. The stacks that got unwound so far are still
>> interesting. This removes a large chunk of differences when comparing
>> perf script output for libunwind and libdw perf unwinding.
>>
>> If not, then this could explain the issue you are seeing.
> Thanks! No, I didn't have that patch (*) yet. It makes a huge
> difference. With that, Paolo's patch and the elfutils libdw powerpc64
> fallback unwinder patch, it looks like I get user stack traces for
> everything now on ppc64le.
>
> Cheers,
>
> Mark
>
> (*) It just this one-liner, but what a difference that makes:
>
> --- a/tools/perf/util/unwind-libdw.c
> +++ b/tools/perf/util/unwind-libdw.c
> @@ -224,7 +224,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
>  
> err = dwfl_getthread_frames(ui->dwfl, thread->tid, frame_callback, 
> ui);
>  
> -   if (err && !ui->max_stack)
> +   if (err && ui->max_stack != max_stack)
> err = 0;
>  
> /*
>



Re: [PATCH 11/14] mm, memory_hotplug: do not associate hotadded memory to zones until online

2017-06-15 Thread Wei Yang
On Mon, May 15, 2017 at 10:58:24AM +0200, Michal Hocko wrote:
>From: Michal Hocko 
>
>The current memory hotplug implementation relies on having all the
>struct pages associate with a zone/node during the physical hotplug phase
>(arch_add_memory->__add_pages->__add_section->__add_zone). In the vast
>majority of cases this means that they are added to ZONE_NORMAL. This
>has been so since 9d99aaa31f59 ("[PATCH] x86_64: Support memory hotadd
>without sparsemem") and it wasn't a big deal back then because movable
>onlining didn't exist yet.
>
>Much later memory hotplug wanted to (ab)use ZONE_MOVABLE for movable
>onlining 511c2aba8f07 ("mm, memory-hotplug: dynamic configure movable
>memory and portion memory") and then things got more complicated. Rather
>than reconsidering the zone association which was no longer needed
>(because the memory hotplug already depended on SPARSEMEM) a convoluted
>semantic of zone shifting has been developed. Only the currently last
>memblock or the one adjacent to the zone_movable can be onlined movable.
>This essentially means that the online type changes as the new memblocks
>are added.
>
>Let's simulate memory hot online manually
>$ echo 0x1 > /sys/devices/system/memory/probe
>$ grep . /sys/devices/system/memory/memory32/valid_zones
>Normal Movable
>
>$ echo $((0x1+(128<<20))) > /sys/devices/system/memory/probe
>$ grep . /sys/devices/system/memory/memory3?/valid_zones
>/sys/devices/system/memory/memory32/valid_zones:Normal
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>
>$ echo $((0x1+2*(128<<20))) > /sys/devices/system/memory/probe
>$ grep . /sys/devices/system/memory/memory3?/valid_zones
>/sys/devices/system/memory/memory32/valid_zones:Normal
>/sys/devices/system/memory/memory33/valid_zones:Normal
>/sys/devices/system/memory/memory34/valid_zones:Normal Movable
>
>$ echo online_movable > /sys/devices/system/memory/memory34/state
>$ grep . /sys/devices/system/memory/memory3?/valid_zones
>/sys/devices/system/memory/memory32/valid_zones:Normal
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>/sys/devices/system/memory/memory34/valid_zones:Movable Normal
>
>This is an awkward semantic because an udev event is sent as soon as the
>block is onlined and an udev handler might want to online it based on
>some policy (e.g. association with a node) but it will inherently race
>with new blocks showing up.
>
>This patch changes the physical online phase to not associate pages
>with any zone at all. All the pages are just marked reserved and wait
>for the onlining phase to be associated with the zone as per the online
>request. There are only two requirements
>   - existing ZONE_NORMAL and ZONE_MOVABLE cannot overlap
>   - ZONE_NORMAL precedes ZONE_MOVABLE in physical addresses
>the later on is not an inherent requirement and can be changed in the
>future. It preserves the current behavior and made the code slightly
>simpler. This is subject to change in future.
>
>This means that the same physical online steps as above will lead to the
>following state:
>Normal Movable
>
>/sys/devices/system/memory/memory32/valid_zones:Normal Movable
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>
>/sys/devices/system/memory/memory32/valid_zones:Normal Movable
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>/sys/devices/system/memory/memory34/valid_zones:Normal Movable
>
>/sys/devices/system/memory/memory32/valid_zones:Normal Movable
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>/sys/devices/system/memory/memory34/valid_zones:Movable
>
>Implementation:
>The current move_pfn_range is reimplemented to check the above
>requirements (allow_online_pfn_range) and then updates the respective
>zone (move_pfn_range_to_zone), the pgdat and links all the pages in the
>pfn range with the zone/node. __add_pages is updated to not require the
>zone and only initializes sections in the range. This allowed to
>simplify the arch_add_memory code (s390 could get rid of quite some
>of code).
>
>devm_memremap_pages is the only user of arch_add_memory which relies
>on the zone association because it only hooks into the memory hotplug
>only half way. It uses it to associate the new memory with ZONE_DEVICE
>but doesn't allow it to be {on,off}lined via sysfs. This means that this
>particular code path has to call move_pfn_range_to_zone explicitly.
>
>The original zone shifting code is kept in place and will be removed in
>the follow up patch for an easier review.
>
>Please note that this patch also changes the original behavior when
>offlining a memory block adjacent to another zone (Normal vs. Movable)
>used to allow to change its movable type. This will be handled later.
>
>Changes since v1
>- we have to associate the page with the node early (in __add_section),
>  because pfn_to_node depends on struct page containing this
>  information - based on testing by Reza Arbab
>- 

Re: [PATCH 11/14] mm, memory_hotplug: do not associate hotadded memory to zones until online

2017-06-15 Thread Wei Yang
On Mon, May 15, 2017 at 10:58:24AM +0200, Michal Hocko wrote:
>From: Michal Hocko 
>
>The current memory hotplug implementation relies on having all the
>struct pages associate with a zone/node during the physical hotplug phase
>(arch_add_memory->__add_pages->__add_section->__add_zone). In the vast
>majority of cases this means that they are added to ZONE_NORMAL. This
>has been so since 9d99aaa31f59 ("[PATCH] x86_64: Support memory hotadd
>without sparsemem") and it wasn't a big deal back then because movable
>onlining didn't exist yet.
>
>Much later memory hotplug wanted to (ab)use ZONE_MOVABLE for movable
>onlining 511c2aba8f07 ("mm, memory-hotplug: dynamic configure movable
>memory and portion memory") and then things got more complicated. Rather
>than reconsidering the zone association which was no longer needed
>(because the memory hotplug already depended on SPARSEMEM) a convoluted
>semantic of zone shifting has been developed. Only the currently last
>memblock or the one adjacent to the zone_movable can be onlined movable.
>This essentially means that the online type changes as the new memblocks
>are added.
>
>Let's simulate memory hot online manually
>$ echo 0x1 > /sys/devices/system/memory/probe
>$ grep . /sys/devices/system/memory/memory32/valid_zones
>Normal Movable
>
>$ echo $((0x1+(128<<20))) > /sys/devices/system/memory/probe
>$ grep . /sys/devices/system/memory/memory3?/valid_zones
>/sys/devices/system/memory/memory32/valid_zones:Normal
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>
>$ echo $((0x1+2*(128<<20))) > /sys/devices/system/memory/probe
>$ grep . /sys/devices/system/memory/memory3?/valid_zones
>/sys/devices/system/memory/memory32/valid_zones:Normal
>/sys/devices/system/memory/memory33/valid_zones:Normal
>/sys/devices/system/memory/memory34/valid_zones:Normal Movable
>
>$ echo online_movable > /sys/devices/system/memory/memory34/state
>$ grep . /sys/devices/system/memory/memory3?/valid_zones
>/sys/devices/system/memory/memory32/valid_zones:Normal
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>/sys/devices/system/memory/memory34/valid_zones:Movable Normal
>
>This is an awkward semantic because an udev event is sent as soon as the
>block is onlined and an udev handler might want to online it based on
>some policy (e.g. association with a node) but it will inherently race
>with new blocks showing up.
>
>This patch changes the physical online phase to not associate pages
>with any zone at all. All the pages are just marked reserved and wait
>for the onlining phase to be associated with the zone as per the online
>request. There are only two requirements
>   - existing ZONE_NORMAL and ZONE_MOVABLE cannot overlap
>   - ZONE_NORMAL precedes ZONE_MOVABLE in physical addresses
>the later on is not an inherent requirement and can be changed in the
>future. It preserves the current behavior and made the code slightly
>simpler. This is subject to change in future.
>
>This means that the same physical online steps as above will lead to the
>following state:
>Normal Movable
>
>/sys/devices/system/memory/memory32/valid_zones:Normal Movable
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>
>/sys/devices/system/memory/memory32/valid_zones:Normal Movable
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>/sys/devices/system/memory/memory34/valid_zones:Normal Movable
>
>/sys/devices/system/memory/memory32/valid_zones:Normal Movable
>/sys/devices/system/memory/memory33/valid_zones:Normal Movable
>/sys/devices/system/memory/memory34/valid_zones:Movable
>
>Implementation:
>The current move_pfn_range is reimplemented to check the above
>requirements (allow_online_pfn_range) and then updates the respective
>zone (move_pfn_range_to_zone), the pgdat and links all the pages in the
>pfn range with the zone/node. __add_pages is updated to not require the
>zone and only initializes sections in the range. This allowed to
>simplify the arch_add_memory code (s390 could get rid of quite some
>of code).
>
>devm_memremap_pages is the only user of arch_add_memory which relies
>on the zone association because it only hooks into the memory hotplug
>only half way. It uses it to associate the new memory with ZONE_DEVICE
>but doesn't allow it to be {on,off}lined via sysfs. This means that this
>particular code path has to call move_pfn_range_to_zone explicitly.
>
>The original zone shifting code is kept in place and will be removed in
>the follow up patch for an easier review.
>
>Please note that this patch also changes the original behavior when
>offlining a memory block adjacent to another zone (Normal vs. Movable)
>used to allow to change its movable type. This will be handled later.
>
>Changes since v1
>- we have to associate the page with the node early (in __add_section),
>  because pfn_to_node depends on struct page containing this
>  information - based on testing by Reza Arbab
>- resize_{zone,pgdat}_range has to check 

[git pull] drm fixes for 4.12-rc6

2017-06-15 Thread Dave Airlie
Hi Linus,

This is the main fixes pull for 4.12-rc6, all pretty normal for this
stage, nothing really stands out. The mxsfb one is probably the
largest and it's for a black screen boot problem.

Dave.

The following changes since commit 32c1431eea4881a6b17bd7c639315010aeefa452:

  Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~airlied/linux tags/drm-fixes-for-v4.12-rc6

for you to fetch changes up to 7119dbdf7c52042acb1b02f116fa3257e97659ea:

  Merge tag 'drm-intel-fixes-2017-06-15' of
git://anongit.freedesktop.org/git/drm-intel into drm-fixes (2017-06-16
10:01:52 +1000)


amd, i915, mgag200, msxfb, tegra fixes


Christian König (1):
  drm/radeon: fix "force the UVD DPB into VRAM as well"

Christophe JAILLET (1):
  gpu: host1x: Fix error handling

Dave Airlie (3):
  Merge branch 'drm-fixes-4.12' of
git://people.freedesktop.org/~agd5f/linux into drm-fixes
  Merge tag 'drm-misc-fixes-2017-06-15' of
git://anongit.freedesktop.org/git/drm-misc into drm-fixes
  Merge tag 'drm-intel-fixes-2017-06-15' of
git://anongit.freedesktop.org/git/drm-intel into drm-fixes

Dmitry Osipenko (2):
  drm/tegra: Fix lockup on a use of staging API
  drm/tegra: Correct idr_alloc() minimum id

Fabio Estevam (1):
  drm: mxsfb_crtc: Reset the eLCDIF controller

Laurent Pinchart (1):
  drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO

Mario Kleiner (2):
  drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
  drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.

Mathieu Larouche (1):
  drm/mgag200: Fix to always set HiPri for G200e4 V2

Ville Syrjälä (2):
  drm/i915: Fix scaling check for 90/270 degree plane rotation
  drm/i915: Fix SKL+ watermarks for 90/270 rotation

Zhenyu Wang (1):
  drm/i915: Fix GVT-g PVINFO version compatibility check

 drivers/gpu/drm/amd/amdgpu/dce_v10_0.c  |  7 --
 drivers/gpu/drm/amd/amdgpu/dce_v11_0.c  |  7 --
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c   |  7 --
 drivers/gpu/drm/amd/amdgpu/dce_v8_0.c   |  7 --
 drivers/gpu/drm/bridge/synopsys/Kconfig |  1 +
 drivers/gpu/drm/i915/i915_pvinfo.h  |  8 ++-
 drivers/gpu/drm/i915/i915_vgpu.c| 10 
 drivers/gpu/drm/i915/intel_display.c| 14 ++-
 drivers/gpu/drm/i915/intel_pm.c | 36 ++--
 drivers/gpu/drm/mgag200/mgag200_mode.c  |  9 ++-
 drivers/gpu/drm/mxsfb/mxsfb_crtc.c  | 42 +
 drivers/gpu/drm/radeon/cik.c|  7 --
 drivers/gpu/drm/radeon/evergreen.c  |  7 --
 drivers/gpu/drm/radeon/radeon_uvd.c |  2 +-
 drivers/gpu/drm/radeon/si.c |  7 --
 drivers/gpu/drm/tegra/drm.c | 22 -
 drivers/gpu/host1x/dev.c|  2 +-


[git pull] drm fixes for 4.12-rc6

2017-06-15 Thread Dave Airlie
Hi Linus,

This is the main fixes pull for 4.12-rc6, all pretty normal for this
stage, nothing really stands out. The mxsfb one is probably the
largest and it's for a black screen boot problem.

Dave.

The following changes since commit 32c1431eea4881a6b17bd7c639315010aeefa452:

  Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~airlied/linux tags/drm-fixes-for-v4.12-rc6

for you to fetch changes up to 7119dbdf7c52042acb1b02f116fa3257e97659ea:

  Merge tag 'drm-intel-fixes-2017-06-15' of
git://anongit.freedesktop.org/git/drm-intel into drm-fixes (2017-06-16
10:01:52 +1000)


amd, i915, mgag200, msxfb, tegra fixes


Christian König (1):
  drm/radeon: fix "force the UVD DPB into VRAM as well"

Christophe JAILLET (1):
  gpu: host1x: Fix error handling

Dave Airlie (3):
  Merge branch 'drm-fixes-4.12' of
git://people.freedesktop.org/~agd5f/linux into drm-fixes
  Merge tag 'drm-misc-fixes-2017-06-15' of
git://anongit.freedesktop.org/git/drm-misc into drm-fixes
  Merge tag 'drm-intel-fixes-2017-06-15' of
git://anongit.freedesktop.org/git/drm-intel into drm-fixes

Dmitry Osipenko (2):
  drm/tegra: Fix lockup on a use of staging API
  drm/tegra: Correct idr_alloc() minimum id

Fabio Estevam (1):
  drm: mxsfb_crtc: Reset the eLCDIF controller

Laurent Pinchart (1):
  drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO

Mario Kleiner (2):
  drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
  drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.

Mathieu Larouche (1):
  drm/mgag200: Fix to always set HiPri for G200e4 V2

Ville Syrjälä (2):
  drm/i915: Fix scaling check for 90/270 degree plane rotation
  drm/i915: Fix SKL+ watermarks for 90/270 rotation

Zhenyu Wang (1):
  drm/i915: Fix GVT-g PVINFO version compatibility check

 drivers/gpu/drm/amd/amdgpu/dce_v10_0.c  |  7 --
 drivers/gpu/drm/amd/amdgpu/dce_v11_0.c  |  7 --
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c   |  7 --
 drivers/gpu/drm/amd/amdgpu/dce_v8_0.c   |  7 --
 drivers/gpu/drm/bridge/synopsys/Kconfig |  1 +
 drivers/gpu/drm/i915/i915_pvinfo.h  |  8 ++-
 drivers/gpu/drm/i915/i915_vgpu.c| 10 
 drivers/gpu/drm/i915/intel_display.c| 14 ++-
 drivers/gpu/drm/i915/intel_pm.c | 36 ++--
 drivers/gpu/drm/mgag200/mgag200_mode.c  |  9 ++-
 drivers/gpu/drm/mxsfb/mxsfb_crtc.c  | 42 +
 drivers/gpu/drm/radeon/cik.c|  7 --
 drivers/gpu/drm/radeon/evergreen.c  |  7 --
 drivers/gpu/drm/radeon/radeon_uvd.c |  2 +-
 drivers/gpu/drm/radeon/si.c |  7 --
 drivers/gpu/drm/tegra/drm.c | 22 -
 drivers/gpu/host1x/dev.c|  2 +-


Re: [patch] mm, oom: prevent additional oom kills before memory is freed

2017-06-15 Thread Tetsuo Handa
Tetsuo Handa wrote:
> and clarify in your patch that there is no possibility
> of waiting for direct/indirect memory allocation inside free_pgtables(),
> in addition to fixing the bug above.

Oops, this part was wrong, for __oom_reap_task_mm() will give up after
waiting for one second because down_read_trylock(>mmap_sem) continues
failing due to down_write(>mmap_sem) by exit_mmap().
# This is after all moving the location of "give up by timeout", isn't it? ;-)

Thus, clarify in your patch that there is no possibility of waiting for
direct/indirect memory allocation outside down_write()/up_write() (e.g.
i_mmap_lock_write() inside unmap_vmas(, vma, 0, -1) just before
down_write()).


Re: [patch] mm, oom: prevent additional oom kills before memory is freed

2017-06-15 Thread Tetsuo Handa
Tetsuo Handa wrote:
> and clarify in your patch that there is no possibility
> of waiting for direct/indirect memory allocation inside free_pgtables(),
> in addition to fixing the bug above.

Oops, this part was wrong, for __oom_reap_task_mm() will give up after
waiting for one second because down_read_trylock(>mmap_sem) continues
failing due to down_write(>mmap_sem) by exit_mmap().
# This is after all moving the location of "give up by timeout", isn't it? ;-)

Thus, clarify in your patch that there is no possibility of waiting for
direct/indirect memory allocation outside down_write()/up_write() (e.g.
i_mmap_lock_write() inside unmap_vmas(, vma, 0, -1) just before
down_write()).


Re: [PATCH v6 7/8] arm64: dts: hikey960: add device node for pmic and regulators

2017-06-15 Thread Guodong Xu
Thanks, Wei.

On Fri, Jun 16, 2017 at 3:55 AM, Wei Xu  wrote:
> Hi Guodong,
>
> On 2017/6/8 23:08, Guodong Xu wrote:
>> From: Wang Xiaoyin 
>>
>> add device node for hi6421 pmic core and hi6421v530
>> voltage regulator,include LDO(1,3,9,11,15,16)
>>
>> Signed-off-by: Wang Xiaoyin 
>> Signed-off-by: Guodong Xu 
>> Acked-by: Arnd Bergmann 
>> ---
>>  arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts | 46 
>> +++
>>  1 file changed, 46 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts 
>> b/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts
>> index ca448f0..e579333 100644
>> --- a/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts
>> +++ b/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts
>> @@ -97,6 +97,52 @@
>>   default-state = "off";
>>   };
>>   };
>> +
>> + pmic: pmic@fff34000 {
>> + compatible = "hisilicon,hi6421v530-pmic";
>> + reg = <0x0 0xfff34000 0x0 0x1000>;
>> + interrupt-controller;
>> + #interrupt-cells = <2>;
>> +
>> + regulators {
>> + ldo3: LDO3 { /* HDMI */
>> + regulator-name = "VOUT3_1V85";
>> + regulator-min-microvolt = <180>;
>> + regulator-max-microvolt = <220>;
>> + regulator-enable-ramp-delay = <120>;
>> + };
>> +
>> + ldo9: LDO9 { /* SDCARD I/O */
>> + regulator-name = "VOUT9_1V8_2V95";
>> + regulator-min-microvolt = <175>;
>> + regulator-max-microvolt = <330>;
>> + regulator-enable-ramp-delay = <240>;
>> + };
>> +
>> + ldo11: LDO11 { /* Low Speed Connector */
>> + regulator-name = "VOUT11_1V8_2V95";
>> + regulator-min-microvolt = <175>;
>> + regulator-max-microvolt = <330>;
>> + regulator-enable-ramp-delay = <240>;
>> + };
>> +
>> + ldo15: LDO15 { /* UFS VCC */
>> + regulator-name = "VOUT15_3V0";
>> + regulator-min-microvolt = <175>;
>> + regulator-max-microvolt = <300>;
>> + regulator-boot-on;
>> + regulator-always-on;
>> + regulator-enable-ramp-delay = <120>;
>> + };
>> +
>> + ldo16: LDO16 { /* SD VDD */
>> + regulator-name = "VOUT16_2V95";
>> + regulator-min-microvolt = <175>;
>> + regulator-max-microvolt = <300>;
>> + regulator-enable-ramp-delay = <360>;
>> + };
>> + };
>> + };
>>  };
>>
>>   {
>>
>
> Since you put the dts part into another patch set[1] and the driver part has 
> been applied.
> I will pick up [1].
> Thanks!
>
> [1]: https://lkml.org/lkml/2017/6/14/1052
>
> BR,
> Wei
>


Re: [PATCH v6 7/8] arm64: dts: hikey960: add device node for pmic and regulators

2017-06-15 Thread Guodong Xu
Thanks, Wei.

On Fri, Jun 16, 2017 at 3:55 AM, Wei Xu  wrote:
> Hi Guodong,
>
> On 2017/6/8 23:08, Guodong Xu wrote:
>> From: Wang Xiaoyin 
>>
>> add device node for hi6421 pmic core and hi6421v530
>> voltage regulator,include LDO(1,3,9,11,15,16)
>>
>> Signed-off-by: Wang Xiaoyin 
>> Signed-off-by: Guodong Xu 
>> Acked-by: Arnd Bergmann 
>> ---
>>  arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts | 46 
>> +++
>>  1 file changed, 46 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts 
>> b/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts
>> index ca448f0..e579333 100644
>> --- a/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts
>> +++ b/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts
>> @@ -97,6 +97,52 @@
>>   default-state = "off";
>>   };
>>   };
>> +
>> + pmic: pmic@fff34000 {
>> + compatible = "hisilicon,hi6421v530-pmic";
>> + reg = <0x0 0xfff34000 0x0 0x1000>;
>> + interrupt-controller;
>> + #interrupt-cells = <2>;
>> +
>> + regulators {
>> + ldo3: LDO3 { /* HDMI */
>> + regulator-name = "VOUT3_1V85";
>> + regulator-min-microvolt = <180>;
>> + regulator-max-microvolt = <220>;
>> + regulator-enable-ramp-delay = <120>;
>> + };
>> +
>> + ldo9: LDO9 { /* SDCARD I/O */
>> + regulator-name = "VOUT9_1V8_2V95";
>> + regulator-min-microvolt = <175>;
>> + regulator-max-microvolt = <330>;
>> + regulator-enable-ramp-delay = <240>;
>> + };
>> +
>> + ldo11: LDO11 { /* Low Speed Connector */
>> + regulator-name = "VOUT11_1V8_2V95";
>> + regulator-min-microvolt = <175>;
>> + regulator-max-microvolt = <330>;
>> + regulator-enable-ramp-delay = <240>;
>> + };
>> +
>> + ldo15: LDO15 { /* UFS VCC */
>> + regulator-name = "VOUT15_3V0";
>> + regulator-min-microvolt = <175>;
>> + regulator-max-microvolt = <300>;
>> + regulator-boot-on;
>> + regulator-always-on;
>> + regulator-enable-ramp-delay = <120>;
>> + };
>> +
>> + ldo16: LDO16 { /* SD VDD */
>> + regulator-name = "VOUT16_2V95";
>> + regulator-min-microvolt = <175>;
>> + regulator-max-microvolt = <300>;
>> + regulator-enable-ramp-delay = <360>;
>> + };
>> + };
>> + };
>>  };
>>
>>   {
>>
>
> Since you put the dts part into another patch set[1] and the driver part has 
> been applied.
> I will pick up [1].
> Thanks!
>
> [1]: https://lkml.org/lkml/2017/6/14/1052
>
> BR,
> Wei
>


Re: [Oops][next-20170614][] powerpc boot fails with WARNING: CPU: 12 PID: 0 at mm/memblock.c

2017-06-15 Thread Stephen Rothwell
Hi all,

On Fri, 16 Jun 2017 11:13:35 +1000 Stephen Rothwell  
wrote:
>
> On Fri, 16 Jun 2017 10:57:22 +1000 Michael Ellerman  
> wrote:
> >
> > "Rowand, Frank"  writes:
> >   
> > > On Thursday, June 15, 2017 2:25 AM, Abdul Haleem 
> > > [mailto:abdha...@linux.vnet.ibm.com]  wrote:
> > >>
> > >> On Thu, 2017-06-15 at 11:30 +0530, Abdul Haleem wrote:
> > >>>
> > >>> linux-next fails to boot on powerpc Bare-metal with these warnings.
> > >>>
> > >>> machine booted fine on next-20170613
> > >>
> > >> Thanks Michael, Yes it is (75fe04e59 of: remove *phandle properties from
> > >> expanded device tree)
> > >>
> > >> Frank, would you please take a look at the trace.
> > >>
> > >> Thanks
> > >
> > > < snip >
> > >
> > > My patch series 'of: remove *phandle properties from expanded device tree'
> > > in -next seems to have broken boot for a significant number of powerpc
> > > systems.  I am actively working on understanding and fixing the problem.  
> > >   
> > 
> > Thanks.
> > 
> > At least for me reverting that patch on top of linux-next gets things
> > booting again.
> > 
> > Stephen can you revert that patch in linux-next today?  
> 
> OK.

Actually, Rob removed it from his tree before I merged it.

-- 
Cheers,
Stephen Rothwell


Re: [Oops][next-20170614][] powerpc boot fails with WARNING: CPU: 12 PID: 0 at mm/memblock.c

2017-06-15 Thread Stephen Rothwell
Hi all,

On Fri, 16 Jun 2017 11:13:35 +1000 Stephen Rothwell  
wrote:
>
> On Fri, 16 Jun 2017 10:57:22 +1000 Michael Ellerman  
> wrote:
> >
> > "Rowand, Frank"  writes:
> >   
> > > On Thursday, June 15, 2017 2:25 AM, Abdul Haleem 
> > > [mailto:abdha...@linux.vnet.ibm.com]  wrote:
> > >>
> > >> On Thu, 2017-06-15 at 11:30 +0530, Abdul Haleem wrote:
> > >>>
> > >>> linux-next fails to boot on powerpc Bare-metal with these warnings.
> > >>>
> > >>> machine booted fine on next-20170613
> > >>
> > >> Thanks Michael, Yes it is (75fe04e59 of: remove *phandle properties from
> > >> expanded device tree)
> > >>
> > >> Frank, would you please take a look at the trace.
> > >>
> > >> Thanks
> > >
> > > < snip >
> > >
> > > My patch series 'of: remove *phandle properties from expanded device tree'
> > > in -next seems to have broken boot for a significant number of powerpc
> > > systems.  I am actively working on understanding and fixing the problem.  
> > >   
> > 
> > Thanks.
> > 
> > At least for me reverting that patch on top of linux-next gets things
> > booting again.
> > 
> > Stephen can you revert that patch in linux-next today?  
> 
> OK.

Actually, Rob removed it from his tree before I merged it.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL

2017-06-15 Thread Ian Kent
On Fri, 2017-06-16 at 12:13 +1000, NeilBrown wrote:
> On Thu, Jun 15 2017, Andrew Morton wrote:
> 
> > On Wed, 07 Jun 2017 12:08:38 +1000 NeilBrown  wrote:
> > 
> > > 
> > > If a positive status is passed with the AUTOFS_DEV_IOCTL_FAIL
> > > ioctl, autofs4_d_automount() will return
> > >    ERR_PTR(status)
> > > with that status to follow_automount(), which will then
> > > dereference an invalid pointer.
> > > 
> > > So treat a positive status the same as zero, and map
> > > to ENOENT.
> > > 
> > > See comment in systemd src/core/automount.c::automount_send_ready().
> > > 
> > > ...
> > > 
> > > --- a/fs/autofs4/dev-ioctl.c
> > > +++ b/fs/autofs4/dev-ioctl.c
> > > @@ -344,7 +344,7 @@ static int autofs_dev_ioctl_fail(struct file *fp,
> > >   int status;
> > >  
> > >   token = (autofs_wqt_t) param->fail.token;
> > > - status = param->fail.status ? param->fail.status : -ENOENT;
> > > + status = param->fail.status < 0 ? param->fail.status : -ENOENT;
> > >   return autofs4_wait_release(sbi, token, status);
> > >  }
> > 
> > Sounds serious.  Was the absence of a cc:stable deliberate?
> 
> You need CAP_SYS_ADMIN to  get the ioctl even looked at.  Doesn't that
> mean the bug can only be triggered by a process that could easily do
> worse?

Think so, yes.

> 
> Or do containers allow admins to give out CAP_SYS_ADMIN to untrusted
> people??  I haven't been keeping up.

Maybe, with docker root can start a container with --privileged to give the
container admin capabilities. It may be the case that capabilities can be used
now I don't know.

> 
> Given how simple the patch is, it probably makes sense to add a
> cc:stable, just in case.

IMHO it needs to be applied to stable as well.

> 
> Thanks,
> NeilBrown


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Stephen Rothwell
Hi Kees,

On Thu, 15 Jun 2017 20:20:47 -0700 Kees Cook  wrote:
>
> I'm so confused -- isn't this in next? All the build tests I did were
> against yesterday's -next which includes this from what I can see...

It is in next, but gets merged after the kspp tree ... so, this is when
inter-tree dependencies are a pain - I can merge the kspp tree later,
but then you have to remember which trees Linus must merge before you
send your pull request.  That's why we like to have all trees be
effectively stand alone (as much as possible).

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL

2017-06-15 Thread Ian Kent
On Fri, 2017-06-16 at 12:13 +1000, NeilBrown wrote:
> On Thu, Jun 15 2017, Andrew Morton wrote:
> 
> > On Wed, 07 Jun 2017 12:08:38 +1000 NeilBrown  wrote:
> > 
> > > 
> > > If a positive status is passed with the AUTOFS_DEV_IOCTL_FAIL
> > > ioctl, autofs4_d_automount() will return
> > >    ERR_PTR(status)
> > > with that status to follow_automount(), which will then
> > > dereference an invalid pointer.
> > > 
> > > So treat a positive status the same as zero, and map
> > > to ENOENT.
> > > 
> > > See comment in systemd src/core/automount.c::automount_send_ready().
> > > 
> > > ...
> > > 
> > > --- a/fs/autofs4/dev-ioctl.c
> > > +++ b/fs/autofs4/dev-ioctl.c
> > > @@ -344,7 +344,7 @@ static int autofs_dev_ioctl_fail(struct file *fp,
> > >   int status;
> > >  
> > >   token = (autofs_wqt_t) param->fail.token;
> > > - status = param->fail.status ? param->fail.status : -ENOENT;
> > > + status = param->fail.status < 0 ? param->fail.status : -ENOENT;
> > >   return autofs4_wait_release(sbi, token, status);
> > >  }
> > 
> > Sounds serious.  Was the absence of a cc:stable deliberate?
> 
> You need CAP_SYS_ADMIN to  get the ioctl even looked at.  Doesn't that
> mean the bug can only be triggered by a process that could easily do
> worse?

Think so, yes.

> 
> Or do containers allow admins to give out CAP_SYS_ADMIN to untrusted
> people??  I haven't been keeping up.

Maybe, with docker root can start a container with --privileged to give the
container admin capabilities. It may be the case that capabilities can be used
now I don't know.

> 
> Given how simple the patch is, it probably makes sense to add a
> cc:stable, just in case.

IMHO it needs to be applied to stable as well.

> 
> Thanks,
> NeilBrown


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Stephen Rothwell
Hi Kees,

On Thu, 15 Jun 2017 20:20:47 -0700 Kees Cook  wrote:
>
> I'm so confused -- isn't this in next? All the build tests I did were
> against yesterday's -next which includes this from what I can see...

It is in next, but gets merged after the kspp tree ... so, this is when
inter-tree dependencies are a pain - I can merge the kspp tree later,
but then you have to remember which trees Linus must merge before you
send your pull request.  That's why we like to have all trees be
effectively stand alone (as much as possible).

-- 
Cheers,
Stephen Rothwell


RE: [PATCH net-next v2 0/2] r8152: adjust runtime suspend/resume

2017-06-15 Thread Hayes Wang
David Miller [mailto:da...@davemloft.net]
> Sent: Wednesday, June 14, 2017 1:02 AM
> > v2:
> > For #1, replace GFP_KERNEL with GFP_NOIO for usb_submit_urb().
> >
> > v1:
> > Improve the flow about runtime suspend/resume and make the code
> > easy to read.
> 
> Series applied.

Excuse me. I don't see these patches in net-next repository. Where could I find 
them?

Best Regards,
Hayes



RE: [PATCH net-next v2 0/2] r8152: adjust runtime suspend/resume

2017-06-15 Thread Hayes Wang
David Miller [mailto:da...@davemloft.net]
> Sent: Wednesday, June 14, 2017 1:02 AM
> > v2:
> > For #1, replace GFP_KERNEL with GFP_NOIO for usb_submit_urb().
> >
> > v1:
> > Improve the flow about runtime suspend/resume and make the code
> > easy to read.
> 
> Series applied.

Excuse me. I don't see these patches in net-next repository. Where could I find 
them?

Best Regards,
Hayes



linux-next: manual merge of the tip tree with the arm64 tree

2017-06-15 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  drivers/acpi/apei/ghes.c

between commit:

  d0189b2eef2e ("acpi: apei: handle SEA notification type for ARMv8")

from the arm64 tree and commit:

  7bf130e4a065 ("ACPI/APEI: Handle GSIV and GPIO notification types")

from the tip tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/acpi/apei/ghes.c
index dfdb33f09f0a,d2c8a9286fa8..
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@@ -810,59 -718,10 +810,59 @@@ static int ghes_notify_hed(struct notif
return ret;
  }
  
- static struct notifier_block ghes_notifier_sci = {
-   .notifier_call = ghes_notify_sci,
+ static struct notifier_block ghes_notifier_hed = {
+   .notifier_call = ghes_notify_hed,
  };
  
 +#ifdef CONFIG_ACPI_APEI_SEA
 +static LIST_HEAD(ghes_sea);
 +
 +/*
 + * Return 0 only if one of the SEA error sources successfully reported an 
error
 + * record sent from the firmware.
 + */
 +int ghes_notify_sea(void)
 +{
 +  struct ghes *ghes;
 +  int ret = -ENOENT;
 +
 +  rcu_read_lock();
 +  list_for_each_entry_rcu(ghes, _sea, list) {
 +  if (!ghes_proc(ghes))
 +  ret = 0;
 +  }
 +  rcu_read_unlock();
 +  return ret;
 +}
 +
 +static void ghes_sea_add(struct ghes *ghes)
 +{
 +  mutex_lock(_list_mutex);
 +  list_add_rcu(>list, _sea);
 +  mutex_unlock(_list_mutex);
 +}
 +
 +static void ghes_sea_remove(struct ghes *ghes)
 +{
 +  mutex_lock(_list_mutex);
 +  list_del_rcu(>list);
 +  mutex_unlock(_list_mutex);
 +  synchronize_rcu();
 +}
 +#else /* CONFIG_ACPI_APEI_SEA */
 +static inline void ghes_sea_add(struct ghes *ghes)
 +{
 +  pr_err(GHES_PFX "ID: %d, trying to add SEA notification which is not 
supported\n",
 + ghes->generic->header.source_id);
 +}
 +
 +static inline void ghes_sea_remove(struct ghes *ghes)
 +{
 +  pr_err(GHES_PFX "ID: %d, trying to remove SEA notification which is not 
supported\n",
 + ghes->generic->header.source_id);
 +}
 +#endif /* CONFIG_ACPI_APEI_SEA */
 +
  #ifdef CONFIG_HAVE_ACPI_APEI_NMI
  /*
   * printk is not safe in NMI context.  So in NMI handler, we allocate
@@@ -1096,15 -966,10 +1096,18 @@@ static int ghes_probe(struct platform_d
case ACPI_HEST_NOTIFY_POLLED:
case ACPI_HEST_NOTIFY_EXTERNAL:
case ACPI_HEST_NOTIFY_SCI:
+   case ACPI_HEST_NOTIFY_GSIV:
+   case ACPI_HEST_NOTIFY_GPIO:
break;
 +  case ACPI_HEST_NOTIFY_SEA:
 +  if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA)) {
 +  pr_warn(GHES_PFX "Generic hardware error source: %d 
notified via SEA is not supported\n",
 +  generic->header.source_id);
 +  rc = -ENOTSUPP;
 +  goto err;
 +  }
 +  break;
+ 
case ACPI_HEST_NOTIFY_NMI:
if (!IS_ENABLED(CONFIG_HAVE_ACPI_APEI_NMI)) {
pr_warn(GHES_PFX "Generic hardware error source: %d 
notified via NMI interrupt is not supported!\n",
@@@ -1162,16 -1027,17 +1165,20 @@@
goto err_edac_unreg;
}
break;
+ 
case ACPI_HEST_NOTIFY_SCI:
+   case ACPI_HEST_NOTIFY_GSIV:
+   case ACPI_HEST_NOTIFY_GPIO:
mutex_lock(_list_mutex);
-   if (list_empty(_sci))
-   register_acpi_hed_notifier(_notifier_sci);
-   list_add_rcu(>list, _sci);
+   if (list_empty(_hed))
+   register_acpi_hed_notifier(_notifier_hed);
+   list_add_rcu(>list, _hed);
mutex_unlock(_list_mutex);
break;
 +  case ACPI_HEST_NOTIFY_SEA:
 +  ghes_sea_add(ghes);
 +  break;
+ 
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_add(ghes);
break;
@@@ -1218,9 -1084,7 +1228,10 @@@ static int ghes_remove(struct platform_
mutex_unlock(_list_mutex);
synchronize_rcu();
break;
 +  case ACPI_HEST_NOTIFY_SEA:
 +  ghes_sea_remove(ghes);
 +  break;
+ 
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_remove(ghes);
break;


linux-next: manual merge of the tip tree with the arm64 tree

2017-06-15 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  drivers/acpi/apei/ghes.c

between commit:

  d0189b2eef2e ("acpi: apei: handle SEA notification type for ARMv8")

from the arm64 tree and commit:

  7bf130e4a065 ("ACPI/APEI: Handle GSIV and GPIO notification types")

from the tip tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/acpi/apei/ghes.c
index dfdb33f09f0a,d2c8a9286fa8..
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@@ -810,59 -718,10 +810,59 @@@ static int ghes_notify_hed(struct notif
return ret;
  }
  
- static struct notifier_block ghes_notifier_sci = {
-   .notifier_call = ghes_notify_sci,
+ static struct notifier_block ghes_notifier_hed = {
+   .notifier_call = ghes_notify_hed,
  };
  
 +#ifdef CONFIG_ACPI_APEI_SEA
 +static LIST_HEAD(ghes_sea);
 +
 +/*
 + * Return 0 only if one of the SEA error sources successfully reported an 
error
 + * record sent from the firmware.
 + */
 +int ghes_notify_sea(void)
 +{
 +  struct ghes *ghes;
 +  int ret = -ENOENT;
 +
 +  rcu_read_lock();
 +  list_for_each_entry_rcu(ghes, _sea, list) {
 +  if (!ghes_proc(ghes))
 +  ret = 0;
 +  }
 +  rcu_read_unlock();
 +  return ret;
 +}
 +
 +static void ghes_sea_add(struct ghes *ghes)
 +{
 +  mutex_lock(_list_mutex);
 +  list_add_rcu(>list, _sea);
 +  mutex_unlock(_list_mutex);
 +}
 +
 +static void ghes_sea_remove(struct ghes *ghes)
 +{
 +  mutex_lock(_list_mutex);
 +  list_del_rcu(>list);
 +  mutex_unlock(_list_mutex);
 +  synchronize_rcu();
 +}
 +#else /* CONFIG_ACPI_APEI_SEA */
 +static inline void ghes_sea_add(struct ghes *ghes)
 +{
 +  pr_err(GHES_PFX "ID: %d, trying to add SEA notification which is not 
supported\n",
 + ghes->generic->header.source_id);
 +}
 +
 +static inline void ghes_sea_remove(struct ghes *ghes)
 +{
 +  pr_err(GHES_PFX "ID: %d, trying to remove SEA notification which is not 
supported\n",
 + ghes->generic->header.source_id);
 +}
 +#endif /* CONFIG_ACPI_APEI_SEA */
 +
  #ifdef CONFIG_HAVE_ACPI_APEI_NMI
  /*
   * printk is not safe in NMI context.  So in NMI handler, we allocate
@@@ -1096,15 -966,10 +1096,18 @@@ static int ghes_probe(struct platform_d
case ACPI_HEST_NOTIFY_POLLED:
case ACPI_HEST_NOTIFY_EXTERNAL:
case ACPI_HEST_NOTIFY_SCI:
+   case ACPI_HEST_NOTIFY_GSIV:
+   case ACPI_HEST_NOTIFY_GPIO:
break;
 +  case ACPI_HEST_NOTIFY_SEA:
 +  if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA)) {
 +  pr_warn(GHES_PFX "Generic hardware error source: %d 
notified via SEA is not supported\n",
 +  generic->header.source_id);
 +  rc = -ENOTSUPP;
 +  goto err;
 +  }
 +  break;
+ 
case ACPI_HEST_NOTIFY_NMI:
if (!IS_ENABLED(CONFIG_HAVE_ACPI_APEI_NMI)) {
pr_warn(GHES_PFX "Generic hardware error source: %d 
notified via NMI interrupt is not supported!\n",
@@@ -1162,16 -1027,17 +1165,20 @@@
goto err_edac_unreg;
}
break;
+ 
case ACPI_HEST_NOTIFY_SCI:
+   case ACPI_HEST_NOTIFY_GSIV:
+   case ACPI_HEST_NOTIFY_GPIO:
mutex_lock(_list_mutex);
-   if (list_empty(_sci))
-   register_acpi_hed_notifier(_notifier_sci);
-   list_add_rcu(>list, _sci);
+   if (list_empty(_hed))
+   register_acpi_hed_notifier(_notifier_hed);
+   list_add_rcu(>list, _hed);
mutex_unlock(_list_mutex);
break;
 +  case ACPI_HEST_NOTIFY_SEA:
 +  ghes_sea_add(ghes);
 +  break;
+ 
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_add(ghes);
break;
@@@ -1218,9 -1084,7 +1228,10 @@@ static int ghes_remove(struct platform_
mutex_unlock(_list_mutex);
synchronize_rcu();
break;
 +  case ACPI_HEST_NOTIFY_SEA:
 +  ghes_sea_remove(ghes);
 +  break;
+ 
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_remove(ghes);
break;


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Kees Cook
On Thu, Jun 15, 2017 at 7:51 PM, Daniel Micay  wrote:
> On Fri, 2017-06-16 at 11:30 +1000, Stephen Rothwell wrote:
>> Hi Kees,
>>
>> After merging the kspp tree, today's linux-next build (x86_64
>> allmodconfig) failed like this:
>>
>> In file included from include/linux/bitmap.h:8:0,
>>  from include/linux/cpumask.h:11,
>>  from arch/x86/include/asm/cpumask.h:4,
>>  from arch/x86/include/asm/msr.h:10,
>>  from arch/x86/include/asm/processor.h:20,
>>  from arch/x86/include/asm/cpufeature.h:4,
>>  from arch/x86/include/asm/thread_info.h:52,
>>  from include/linux/thread_info.h:37,
>>  from arch/x86/include/asm/preempt.h:6,
>>  from include/linux/preempt.h:80,
>>  from include/linux/spinlock.h:50,
>>  from include/linux/mmzone.h:7,
>>  from include/linux/gfp.h:5,
>>  from include/linux/slab.h:14,
>>  from drivers/scsi/csiostor/csio_lnode.c:37:
>> In function 'memcpy',
>> inlined from 'csio_append_attrib' at
>> drivers/scsi/csiostor/csio_lnode.c:248:2,
>> inlined from 'csio_ln_fdmi_dprt_cbfn' at
>> drivers/scsi/csiostor/csio_lnode.c:471:2:
>> include/linux/string.h:309:4: error: call to '__read_overflow2'
>> declared with attribute error: detected read beyond size of object
>> passed as 2nd parameter
>> __read_overflow2();
>> ^
>> In function 'memcpy',
>> inlined from 'csio_append_attrib' at
>> drivers/scsi/csiostor/csio_lnode.c:248:2,
>> inlined from 'csio_ln_fdmi_rhba_cbfn' at
>> drivers/scsi/csiostor/csio_lnode.c:337:2:
>> include/linux/string.h:309:4: error: call to '__read_overflow2'
>> declared with attribute error: detected read beyond size of object
>> passed as 2nd parameter
>> __read_overflow2();
>> ^
>>
>> Caused by commit
>>
>>   b90d6eba50d7 ("include/linux/string.h: add the option of fortified
>> string.h functions")
>>
>> I have reverted that commit for today.
>
> That's this one: https://lkml.org/lkml/2017/5/9/613, which is in
> https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/ in the
> 4.13/scsi-queue and for-next branches. I think that's why Kees didn't
> include it but I get he needs to add that.

I'm so confused -- isn't this in next? All the build tests I did were
against yesterday's -next which includes this from what I can see...

-Kees

-- 
Kees Cook
Pixel Security


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Kees Cook
On Thu, Jun 15, 2017 at 7:51 PM, Daniel Micay  wrote:
> On Fri, 2017-06-16 at 11:30 +1000, Stephen Rothwell wrote:
>> Hi Kees,
>>
>> After merging the kspp tree, today's linux-next build (x86_64
>> allmodconfig) failed like this:
>>
>> In file included from include/linux/bitmap.h:8:0,
>>  from include/linux/cpumask.h:11,
>>  from arch/x86/include/asm/cpumask.h:4,
>>  from arch/x86/include/asm/msr.h:10,
>>  from arch/x86/include/asm/processor.h:20,
>>  from arch/x86/include/asm/cpufeature.h:4,
>>  from arch/x86/include/asm/thread_info.h:52,
>>  from include/linux/thread_info.h:37,
>>  from arch/x86/include/asm/preempt.h:6,
>>  from include/linux/preempt.h:80,
>>  from include/linux/spinlock.h:50,
>>  from include/linux/mmzone.h:7,
>>  from include/linux/gfp.h:5,
>>  from include/linux/slab.h:14,
>>  from drivers/scsi/csiostor/csio_lnode.c:37:
>> In function 'memcpy',
>> inlined from 'csio_append_attrib' at
>> drivers/scsi/csiostor/csio_lnode.c:248:2,
>> inlined from 'csio_ln_fdmi_dprt_cbfn' at
>> drivers/scsi/csiostor/csio_lnode.c:471:2:
>> include/linux/string.h:309:4: error: call to '__read_overflow2'
>> declared with attribute error: detected read beyond size of object
>> passed as 2nd parameter
>> __read_overflow2();
>> ^
>> In function 'memcpy',
>> inlined from 'csio_append_attrib' at
>> drivers/scsi/csiostor/csio_lnode.c:248:2,
>> inlined from 'csio_ln_fdmi_rhba_cbfn' at
>> drivers/scsi/csiostor/csio_lnode.c:337:2:
>> include/linux/string.h:309:4: error: call to '__read_overflow2'
>> declared with attribute error: detected read beyond size of object
>> passed as 2nd parameter
>> __read_overflow2();
>> ^
>>
>> Caused by commit
>>
>>   b90d6eba50d7 ("include/linux/string.h: add the option of fortified
>> string.h functions")
>>
>> I have reverted that commit for today.
>
> That's this one: https://lkml.org/lkml/2017/5/9/613, which is in
> https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/ in the
> 4.13/scsi-queue and for-next branches. I think that's why Kees didn't
> include it but I get he needs to add that.

I'm so confused -- isn't this in next? All the build tests I did were
against yesterday's -next which includes this from what I can see...

-Kees

-- 
Kees Cook
Pixel Security


Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-06-15 Thread Michael S. Tsirkin
On Thu, Jun 15, 2017 at 04:10:17PM +0800, Wei Wang wrote:
> On 06/14/2017 01:56 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 09, 2017 at 06:41:38PM +0800, Wei Wang wrote:
> > > Add a new feature, VIRTIO_BALLOON_F_PAGE_CHUNKS, which enables
> > > the transfer of the ballooned (i.e. inflated/deflated) pages in
> > > chunks to the host.
> > so now these chunks are just s/g list entry.
> > So let's rename this VIRTIO_BALLOON_F_SG with a comment:
> > * Use standard virtio s/g instead of PFN lists *
> 
> Actually, it's not using the standard s/g list in the implementation,
> because:
> using the standard s/g will need kmalloc() the indirect table on
> demand (i.e. when virtqueue_add() converts s/g to indirect table);
> 
> The implementation directly pre-allocates an indirect desc table,
> and uses a entry (i.e. vring_desc) to describe a chunk. This
> avoids the overhead of kmalloc() the indirect table.

It's a separate API but the host/guest interface is standard.

> 
> > > +/*
> > > + * Callulates how many pfns can a page_bmap record. A bit corresponds to 
> > > a
> > > + * page of PAGE_SIZE.
> > > + */
> > > +#define VIRTIO_BALLOON_PFNS_PER_PAGE_BMAP \
> > > + (VIRTIO_BALLOON_PAGE_BMAP_SIZE * BITS_PER_BYTE)
> > > +
> > > +/* The number of page_bmap to allocate by default. */
> > > +#define VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM 1
> > It's not by default, it's at probe time, right?
> It is the number of page bitmap being kept throughout the whole
> lifecycle of the driver. The page bmap will be temporarily extended
> due to insufficiency during a ballooning process, but when that
> ballooning finishes, the extended part will be freed.
> > > +/* The maximum number of page_bmap that can be allocated. */
> > Not really, this is the size of the array we use to keep them.
> 
> This is the max number of the page bmap that can be
> extended temporarily.

That's just a confusing way to say the same.

> > > +#define VIRTIO_BALLOON_PAGE_BMAP_MAX_NUM 32
> > > +
> > So you still have a home-grown bitmap. I'd like to know why
> > isn't xbitmap suggested for this purpose by Matthew Wilcox
> > appropriate. Please add a comment explaining the requirements
> > from the data structure.
> 
> I didn't find his xbitmap being upstreamed, did you?

It's from dax tree - Matthew?

> > > +/*
> > > + * QEMU virtio implementation requires the desc table size less than
> > > + * VIRTQUEUE_MAX_SIZE, so minus 1 here.
> > I think it doesn't, the issue is probably that you add a header
> > as a separate s/g. In any case see below.
> > 
> > > + */
> > > +#define VIRTIO_BALLOON_MAX_PAGE_CHUNKS (VIRTQUEUE_MAX_SIZE - 1)
> > This is wrong, virtio spec says s/g size should not exceed VQ size.
> > If you want to support huge VQ sizes, you can add a fallback to
> > smaller sizes until it fits in 1 page.
> 
> Probably no need for huge VQ size, 1024 queue size should be
> enough. And we can have 1024 descriptors in the indirect
> table, so the above size doesn't exceed the vq size, right?

You need to look at vq size, you shouldn't assume it's > 1024.


> 
> > +static unsigned int extend_page_bmap_size(struct virtio_balloon *vb,
> > + unsigned long pfn_num)
> > what's this API doing?  Pls add comments. this seems to assume
> > it will only be called once.
> OK, I will add some comments here. This is the function to extend
> the number of page bitmap when the original 1 page bmap is
> not sufficient during a ballooning process. As mentioned above,
> at the end of this ballooning process, the extended part will be freed.
> 
> > it would be better to avoid making
> > this assumption, just look at what has been allocated
> > and extend it.
> Actually it's not an assumption. The rule here is that we always keep
> "1" page bmap. "1" is defined by the
> VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM. So when freeing, it also
> references VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM (not assuming
> any number)

When allocating, why don't you check what's allocated already?
why assume VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM was allocated?
Then calling extend_page_bmap_size many times would be idempotent.

> > +}
> > +
> > +/* Add a chunk to the buffer. */
> > +static void add_one_chunk(struct virtio_balloon *vb, struct virtqueue *vq,
> > + u64 base_addr, u32 size)
> > +{
> > +   unsigned int *num = >balloon_page_chunk.chunk_num;
> > +   struct vring_desc *desc = >balloon_page_chunk.desc_table[*num];
> > +
> > +   desc->addr = cpu_to_virtio64(vb->vdev, base_addr);
> > +   desc->len = cpu_to_virtio32(vb->vdev, size);
> > +   *num += 1;
> > +   if (*num == VIRTIO_BALLOON_MAX_PAGE_CHUNKS)
> > +   send_page_chunks(vb, vq);
> > +}
> > +
> > Poking at virtio internals like this is not nice. Pls move to virtio
> > code.  Also, pages must be read descriptors as host might modify them.
> > 
> > This also lacks viommu support but this is not mandatory as
> > that is borken atm anyway. I'll send a patch to at least fail 

Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-06-15 Thread Michael S. Tsirkin
On Thu, Jun 15, 2017 at 04:10:17PM +0800, Wei Wang wrote:
> On 06/14/2017 01:56 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 09, 2017 at 06:41:38PM +0800, Wei Wang wrote:
> > > Add a new feature, VIRTIO_BALLOON_F_PAGE_CHUNKS, which enables
> > > the transfer of the ballooned (i.e. inflated/deflated) pages in
> > > chunks to the host.
> > so now these chunks are just s/g list entry.
> > So let's rename this VIRTIO_BALLOON_F_SG with a comment:
> > * Use standard virtio s/g instead of PFN lists *
> 
> Actually, it's not using the standard s/g list in the implementation,
> because:
> using the standard s/g will need kmalloc() the indirect table on
> demand (i.e. when virtqueue_add() converts s/g to indirect table);
> 
> The implementation directly pre-allocates an indirect desc table,
> and uses a entry (i.e. vring_desc) to describe a chunk. This
> avoids the overhead of kmalloc() the indirect table.

It's a separate API but the host/guest interface is standard.

> 
> > > +/*
> > > + * Callulates how many pfns can a page_bmap record. A bit corresponds to 
> > > a
> > > + * page of PAGE_SIZE.
> > > + */
> > > +#define VIRTIO_BALLOON_PFNS_PER_PAGE_BMAP \
> > > + (VIRTIO_BALLOON_PAGE_BMAP_SIZE * BITS_PER_BYTE)
> > > +
> > > +/* The number of page_bmap to allocate by default. */
> > > +#define VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM 1
> > It's not by default, it's at probe time, right?
> It is the number of page bitmap being kept throughout the whole
> lifecycle of the driver. The page bmap will be temporarily extended
> due to insufficiency during a ballooning process, but when that
> ballooning finishes, the extended part will be freed.
> > > +/* The maximum number of page_bmap that can be allocated. */
> > Not really, this is the size of the array we use to keep them.
> 
> This is the max number of the page bmap that can be
> extended temporarily.

That's just a confusing way to say the same.

> > > +#define VIRTIO_BALLOON_PAGE_BMAP_MAX_NUM 32
> > > +
> > So you still have a home-grown bitmap. I'd like to know why
> > isn't xbitmap suggested for this purpose by Matthew Wilcox
> > appropriate. Please add a comment explaining the requirements
> > from the data structure.
> 
> I didn't find his xbitmap being upstreamed, did you?

It's from dax tree - Matthew?

> > > +/*
> > > + * QEMU virtio implementation requires the desc table size less than
> > > + * VIRTQUEUE_MAX_SIZE, so minus 1 here.
> > I think it doesn't, the issue is probably that you add a header
> > as a separate s/g. In any case see below.
> > 
> > > + */
> > > +#define VIRTIO_BALLOON_MAX_PAGE_CHUNKS (VIRTQUEUE_MAX_SIZE - 1)
> > This is wrong, virtio spec says s/g size should not exceed VQ size.
> > If you want to support huge VQ sizes, you can add a fallback to
> > smaller sizes until it fits in 1 page.
> 
> Probably no need for huge VQ size, 1024 queue size should be
> enough. And we can have 1024 descriptors in the indirect
> table, so the above size doesn't exceed the vq size, right?

You need to look at vq size, you shouldn't assume it's > 1024.


> 
> > +static unsigned int extend_page_bmap_size(struct virtio_balloon *vb,
> > + unsigned long pfn_num)
> > what's this API doing?  Pls add comments. this seems to assume
> > it will only be called once.
> OK, I will add some comments here. This is the function to extend
> the number of page bitmap when the original 1 page bmap is
> not sufficient during a ballooning process. As mentioned above,
> at the end of this ballooning process, the extended part will be freed.
> 
> > it would be better to avoid making
> > this assumption, just look at what has been allocated
> > and extend it.
> Actually it's not an assumption. The rule here is that we always keep
> "1" page bmap. "1" is defined by the
> VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM. So when freeing, it also
> references VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM (not assuming
> any number)

When allocating, why don't you check what's allocated already?
why assume VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM was allocated?
Then calling extend_page_bmap_size many times would be idempotent.

> > +}
> > +
> > +/* Add a chunk to the buffer. */
> > +static void add_one_chunk(struct virtio_balloon *vb, struct virtqueue *vq,
> > + u64 base_addr, u32 size)
> > +{
> > +   unsigned int *num = >balloon_page_chunk.chunk_num;
> > +   struct vring_desc *desc = >balloon_page_chunk.desc_table[*num];
> > +
> > +   desc->addr = cpu_to_virtio64(vb->vdev, base_addr);
> > +   desc->len = cpu_to_virtio32(vb->vdev, size);
> > +   *num += 1;
> > +   if (*num == VIRTIO_BALLOON_MAX_PAGE_CHUNKS)
> > +   send_page_chunks(vb, vq);
> > +}
> > +
> > Poking at virtio internals like this is not nice. Pls move to virtio
> > code.  Also, pages must be read descriptors as host might modify them.
> > 
> > This also lacks viommu support but this is not mandatory as
> > that is borken atm anyway. I'll send a patch to at least fail 

Re: [PATCH v3 1/3] rtmutex: update rt-mutex-design

2017-06-15 Thread Alex Shi
Hi Steven & Sebastian,

If there are no more comments, could you like to give reviewed-by? :)

Regards
Alex

On 05/25/2017 01:26 PM, Alex Shi wrote:
> The rt-mutex-design documents didn't gotten meaningful update from its
> first version. Even after owner's pending bit was removed in commit 
> 8161239a8bcc
> ("rtmutex: Simplify PI algorithm and make highest prio task get lock")
> and priority list 'plist' changed to rbtree. And Peter Zijlstra did some
> clean up and fix for deadline task changes on tip tree.
> 
> So update it to latest code and make it meaningful.
> 
> Signed-off-by: Alex Shi 
> Cc: Steven Rostedt 
> Cc: Sebastian Siewior 
> Cc: Mathieu Poirier 
> Cc: Juri Lelli 
> Cc: Thomas Gleixner 
> To: linux-...@vger.kernel.org
> To: linux-kernel@vger.kernel.org
> To: Jonathan Corbet 
> To: Ingo Molnar 
> To: Peter Zijlstra 
> ---
>  Documentation/locking/rt-mutex-design.txt | 418 
> +++---
>  1 file changed, 97 insertions(+), 321 deletions(-)
> 
> diff --git a/Documentation/locking/rt-mutex-design.txt 
> b/Documentation/locking/rt-mutex-design.txt
> index 8666070..1a0da32 100644
> --- a/Documentation/locking/rt-mutex-design.txt
> +++ b/Documentation/locking/rt-mutex-design.txt
> @@ -97,9 +97,9 @@ waiter   - A waiter is a struct that is stored on the stack 
> of a blocked
> a process being blocked on the mutex, it is fine to allocate
> the waiter on the process's stack (local variable).  This
> structure holds a pointer to the task, as well as the mutex that
> -   the task is blocked on.  It also has the plist node structures to
> -   place the task in the waiter_list of a mutex as well as the
> -   pi_list of a mutex owner task (described below).
> +   the task is blocked on.  It also has a rbtree node structures to
> +   place the task in waiters rbtree of a mutex as well as the
> +   pi_waiters rbtree of a mutex owner task (described below).
>  
> waiter is sometimes used in reference to the task that is waiting
> on a mutex. This is the same as waiter->task.
> @@ -179,53 +179,35 @@ again.
>   |
> F->L5-+
>  
> -
> -Plist
> --
> -
> -Before I go further and talk about how the PI chain is stored through lists
> -on both mutexes and processes, I'll explain the plist.  This is similar to
> -the struct list_head functionality that is already in the kernel.
> -The implementation of plist is out of scope for this document, but it is
> -very important to understand what it does.
> -
> -There are a few differences between plist and list, the most important one
> -being that plist is a priority sorted linked list.  This means that the
> -priorities of the plist are sorted, such that it takes O(1) to retrieve the
> -highest priority item in the list.  Obviously this is useful to store 
> processes
> -based on their priorities.
> -
> -Another difference, which is important for implementation, is that, unlike
> -list, the head of the list is a different element than the nodes of a list.
> -So the head of the list is declared as struct plist_head and nodes that will
> -be added to the list are declared as struct plist_node.
> -
> +If process G has the highest priority in the chain, then all the tasks up
> +the chain (A and B in this example), must have their priorities increased
> +to that of G.
>  
>  Mutex Waiter List
>  -
>  
>  Every mutex keeps track of all the waiters that are blocked on itself. The 
> mutex
> -has a plist to store these waiters by priority.  This list is protected by
> +has a rbtree to store these waiters by priority.  This tree is protected by
>  a spin lock that is located in the struct of the mutex. This lock is called
> -wait_lock.  Since the modification of the waiter list is never done in
> +wait_lock.  Since the modification of the waiter tree is never done in
>  interrupt context, the wait_lock can be taken without disabling interrupts.
>  
>  
> -Task PI List
> +Task PI Tree
>  
>  
> -To keep track of the PI chains, each process has its own PI list.  This is
> -a list of all top waiters of the mutexes that are owned by the process.
> -Note that this list only holds the top waiters and not all waiters that are
> +To keep track of the PI chains, each process has its own PI rbtree.  This is
> +a tree of all top waiters of the mutexes that are owned by the process.
> +Note that this tree only holds the top waiters and not all waiters that are
>  blocked on mutexes owned by the process.
>  
> -The top of the task's PI list is always the highest priority task that
> +The top of the task's PI tree is always the highest priority task that
>  is waiting on a mutex that is owned by the task.  So 

Re: [PATCH v3 1/3] rtmutex: update rt-mutex-design

2017-06-15 Thread Alex Shi
Hi Steven & Sebastian,

If there are no more comments, could you like to give reviewed-by? :)

Regards
Alex

On 05/25/2017 01:26 PM, Alex Shi wrote:
> The rt-mutex-design documents didn't gotten meaningful update from its
> first version. Even after owner's pending bit was removed in commit 
> 8161239a8bcc
> ("rtmutex: Simplify PI algorithm and make highest prio task get lock")
> and priority list 'plist' changed to rbtree. And Peter Zijlstra did some
> clean up and fix for deadline task changes on tip tree.
> 
> So update it to latest code and make it meaningful.
> 
> Signed-off-by: Alex Shi 
> Cc: Steven Rostedt 
> Cc: Sebastian Siewior 
> Cc: Mathieu Poirier 
> Cc: Juri Lelli 
> Cc: Thomas Gleixner 
> To: linux-...@vger.kernel.org
> To: linux-kernel@vger.kernel.org
> To: Jonathan Corbet 
> To: Ingo Molnar 
> To: Peter Zijlstra 
> ---
>  Documentation/locking/rt-mutex-design.txt | 418 
> +++---
>  1 file changed, 97 insertions(+), 321 deletions(-)
> 
> diff --git a/Documentation/locking/rt-mutex-design.txt 
> b/Documentation/locking/rt-mutex-design.txt
> index 8666070..1a0da32 100644
> --- a/Documentation/locking/rt-mutex-design.txt
> +++ b/Documentation/locking/rt-mutex-design.txt
> @@ -97,9 +97,9 @@ waiter   - A waiter is a struct that is stored on the stack 
> of a blocked
> a process being blocked on the mutex, it is fine to allocate
> the waiter on the process's stack (local variable).  This
> structure holds a pointer to the task, as well as the mutex that
> -   the task is blocked on.  It also has the plist node structures to
> -   place the task in the waiter_list of a mutex as well as the
> -   pi_list of a mutex owner task (described below).
> +   the task is blocked on.  It also has a rbtree node structures to
> +   place the task in waiters rbtree of a mutex as well as the
> +   pi_waiters rbtree of a mutex owner task (described below).
>  
> waiter is sometimes used in reference to the task that is waiting
> on a mutex. This is the same as waiter->task.
> @@ -179,53 +179,35 @@ again.
>   |
> F->L5-+
>  
> -
> -Plist
> --
> -
> -Before I go further and talk about how the PI chain is stored through lists
> -on both mutexes and processes, I'll explain the plist.  This is similar to
> -the struct list_head functionality that is already in the kernel.
> -The implementation of plist is out of scope for this document, but it is
> -very important to understand what it does.
> -
> -There are a few differences between plist and list, the most important one
> -being that plist is a priority sorted linked list.  This means that the
> -priorities of the plist are sorted, such that it takes O(1) to retrieve the
> -highest priority item in the list.  Obviously this is useful to store 
> processes
> -based on their priorities.
> -
> -Another difference, which is important for implementation, is that, unlike
> -list, the head of the list is a different element than the nodes of a list.
> -So the head of the list is declared as struct plist_head and nodes that will
> -be added to the list are declared as struct plist_node.
> -
> +If process G has the highest priority in the chain, then all the tasks up
> +the chain (A and B in this example), must have their priorities increased
> +to that of G.
>  
>  Mutex Waiter List
>  -
>  
>  Every mutex keeps track of all the waiters that are blocked on itself. The 
> mutex
> -has a plist to store these waiters by priority.  This list is protected by
> +has a rbtree to store these waiters by priority.  This tree is protected by
>  a spin lock that is located in the struct of the mutex. This lock is called
> -wait_lock.  Since the modification of the waiter list is never done in
> +wait_lock.  Since the modification of the waiter tree is never done in
>  interrupt context, the wait_lock can be taken without disabling interrupts.
>  
>  
> -Task PI List
> +Task PI Tree
>  
>  
> -To keep track of the PI chains, each process has its own PI list.  This is
> -a list of all top waiters of the mutexes that are owned by the process.
> -Note that this list only holds the top waiters and not all waiters that are
> +To keep track of the PI chains, each process has its own PI rbtree.  This is
> +a tree of all top waiters of the mutexes that are owned by the process.
> +Note that this tree only holds the top waiters and not all waiters that are
>  blocked on mutexes owned by the process.
>  
> -The top of the task's PI list is always the highest priority task that
> +The top of the task's PI tree is always the highest priority task that
>  is waiting on a mutex that is owned by the task.  So if the task has
>  inherited a priority, it will always be the priority of the task that is
> -at the top of this list.
> +at the top of this tree.
>  
> -This list is stored in the task 

Re: [RFC PATCH] cpu_pm/rt: replace rt rwlock with raw spinlock

2017-06-15 Thread Alex Shi
It's a serious bug which cause arm/arm64 rt boot failed.
Anyone like to give a glance?

Thanks
Alex

On 06/14/2017 09:22 PM, Alex Shi wrote:
> This is a quick fix for a bug as 'scheduling while atomic' or
> 'scheduling from the idle thread' on arm/arm64.
> 
> On arm/arm64, rwlock cpu_pm_notifier_lock in cpu_pm cause a potential
> schedule after irq disable in idle call chain:
> 
> cpu_startup_entry
>   cpu_idle_loop
> local_irq_disable()
> cpuidle_idle_call
>   call_cpuidle
> cpuidle_enter
>   cpuidle_enter_state
> ->enter :arm_enter_idle_state
> cpu_pm_enter/exit
>   CPU_PM_CPU_IDLE_ENTER
>   read_lock(_pm_notifier_lock); <-- sleep in idle
>__rt_spin_lock();
>   schedule();
> 
> The kernel panic is here:
> [4.609601] BUG: scheduling while atomic: swapper/1/0/0x0002
> [4.609608] [] arm_enter_idle_state+0x18/0x70
> [4.609614] Modules linked in:
> [4.609615] [] cpuidle_enter_state+0xf0/0x218
> [4.609620] [] cpuidle_enter+0x18/0x20
> [4.609626] Preemption disabled at:
> [4.609627] [] call_cpuidle+0x24/0x40
> [4.609635] [] schedule_preempt_disabled+0x1c/0x28
> [4.609639] [] cpu_startup_entry+0x154/0x1f8
> [4.609645] [] secondary_start_kernel+0x15c/0x1a0
> 
> Daniel Lezcano said this notification is needed on arm/arm64 platforms.
> I also tried use local_lock_irq to replace local_irq_disable, but my 2
> boards just die without any output. So maybe it's only quick way to
> make rt kernel work on arm/arm64.
> 
> Since this is quick fix, instead of split out the raw rwlock, to use
> raw_spin_lock is simple and don't cost much.
> 
> Signed-off-by: Alex Shi 
> Cc: Sebastian Andrzej Siewior 
> Cc: Thomas Gleixner 
> Cc: Anders Roxell 
> Cc: Daniel Lezcano 
> Cc: linux-rt-users 
> ---
>  kernel/cpu_pm.c | 26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/kernel/cpu_pm.c b/kernel/cpu_pm.c
> index 009cc9a..8ffa13e3 100644
> --- a/kernel/cpu_pm.c
> +++ b/kernel/cpu_pm.c
> @@ -22,7 +22,7 @@
>  #include 
>  #include 
>  
> -static DEFINE_RWLOCK(cpu_pm_notifier_lock);
> +static DEFINE_RAW_SPINLOCK(cpu_pm_notifier_lock);
>  static RAW_NOTIFIER_HEAD(cpu_pm_notifier_chain);
>  
>  static int cpu_pm_notify(enum cpu_pm_event event, int nr_to_call, int 
> *nr_calls)
> @@ -50,9 +50,9 @@ int cpu_pm_register_notifier(struct notifier_block *nb)
>   unsigned long flags;
>   int ret;
>  
> - write_lock_irqsave(_pm_notifier_lock, flags);
> + raw_spin_lock_irqsave(_pm_notifier_lock, flags);
>   ret = raw_notifier_chain_register(_pm_notifier_chain, nb);
> - write_unlock_irqrestore(_pm_notifier_lock, flags);
> + raw_spin_unlock_irqrestore(_pm_notifier_lock, flags);
>  
>   return ret;
>  }
> @@ -72,9 +72,9 @@ int cpu_pm_unregister_notifier(struct notifier_block *nb)
>   unsigned long flags;
>   int ret;
>  
> - write_lock_irqsave(_pm_notifier_lock, flags);
> + raw_spin_lock_irqsave(_pm_notifier_lock, flags);
>   ret = raw_notifier_chain_unregister(_pm_notifier_chain, nb);
> - write_unlock_irqrestore(_pm_notifier_lock, flags);
> + raw_spin_unlock_irqrestore(_pm_notifier_lock, flags);
>  
>   return ret;
>  }
> @@ -100,7 +100,7 @@ int cpu_pm_enter(void)
>   int nr_calls;
>   int ret = 0;
>  
> - read_lock(_pm_notifier_lock);
> + raw_spin_lock(_pm_notifier_lock);
>   ret = cpu_pm_notify(CPU_PM_ENTER, -1, _calls);
>   if (ret)
>   /*
> @@ -108,7 +108,7 @@ int cpu_pm_enter(void)
>* PM entry who are notified earlier to prepare for it.
>*/
>   cpu_pm_notify(CPU_PM_ENTER_FAILED, nr_calls - 1, NULL);
> - read_unlock(_pm_notifier_lock);
> + raw_spin_unlock(_pm_notifier_lock);
>  
>   return ret;
>  }
> @@ -130,9 +130,9 @@ int cpu_pm_exit(void)
>  {
>   int ret;
>  
> - read_lock(_pm_notifier_lock);
> + raw_spin_lock(_pm_notifier_lock);
>   ret = cpu_pm_notify(CPU_PM_EXIT, -1, NULL);
> - read_unlock(_pm_notifier_lock);
> + raw_spin_unlock(_pm_notifier_lock);
>  
>   return ret;
>  }
> @@ -159,7 +159,7 @@ int cpu_cluster_pm_enter(void)
>   int nr_calls;
>   int ret = 0;
>  
> - read_lock(_pm_notifier_lock);
> + raw_spin_lock(_pm_notifier_lock);
>   ret = cpu_pm_notify(CPU_CLUSTER_PM_ENTER, -1, _calls);
>   if (ret)
>   /*
> @@ -167,7 +167,7 @@ int cpu_cluster_pm_enter(void)
>* PM entry who are notified earlier to prepare for it.
>*/
>   cpu_pm_notify(CPU_CLUSTER_PM_ENTER_FAILED, nr_calls - 1, NULL);
> - read_unlock(_pm_notifier_lock);
> + raw_spin_unlock(_pm_notifier_lock);
>  
>   return ret;
>  }
> @@ -192,9 +192,9 @@ 

Re: [RFC PATCH] cpu_pm/rt: replace rt rwlock with raw spinlock

2017-06-15 Thread Alex Shi
It's a serious bug which cause arm/arm64 rt boot failed.
Anyone like to give a glance?

Thanks
Alex

On 06/14/2017 09:22 PM, Alex Shi wrote:
> This is a quick fix for a bug as 'scheduling while atomic' or
> 'scheduling from the idle thread' on arm/arm64.
> 
> On arm/arm64, rwlock cpu_pm_notifier_lock in cpu_pm cause a potential
> schedule after irq disable in idle call chain:
> 
> cpu_startup_entry
>   cpu_idle_loop
> local_irq_disable()
> cpuidle_idle_call
>   call_cpuidle
> cpuidle_enter
>   cpuidle_enter_state
> ->enter :arm_enter_idle_state
> cpu_pm_enter/exit
>   CPU_PM_CPU_IDLE_ENTER
>   read_lock(_pm_notifier_lock); <-- sleep in idle
>__rt_spin_lock();
>   schedule();
> 
> The kernel panic is here:
> [4.609601] BUG: scheduling while atomic: swapper/1/0/0x0002
> [4.609608] [] arm_enter_idle_state+0x18/0x70
> [4.609614] Modules linked in:
> [4.609615] [] cpuidle_enter_state+0xf0/0x218
> [4.609620] [] cpuidle_enter+0x18/0x20
> [4.609626] Preemption disabled at:
> [4.609627] [] call_cpuidle+0x24/0x40
> [4.609635] [] schedule_preempt_disabled+0x1c/0x28
> [4.609639] [] cpu_startup_entry+0x154/0x1f8
> [4.609645] [] secondary_start_kernel+0x15c/0x1a0
> 
> Daniel Lezcano said this notification is needed on arm/arm64 platforms.
> I also tried use local_lock_irq to replace local_irq_disable, but my 2
> boards just die without any output. So maybe it's only quick way to
> make rt kernel work on arm/arm64.
> 
> Since this is quick fix, instead of split out the raw rwlock, to use
> raw_spin_lock is simple and don't cost much.
> 
> Signed-off-by: Alex Shi 
> Cc: Sebastian Andrzej Siewior 
> Cc: Thomas Gleixner 
> Cc: Anders Roxell 
> Cc: Daniel Lezcano 
> Cc: linux-rt-users 
> ---
>  kernel/cpu_pm.c | 26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/kernel/cpu_pm.c b/kernel/cpu_pm.c
> index 009cc9a..8ffa13e3 100644
> --- a/kernel/cpu_pm.c
> +++ b/kernel/cpu_pm.c
> @@ -22,7 +22,7 @@
>  #include 
>  #include 
>  
> -static DEFINE_RWLOCK(cpu_pm_notifier_lock);
> +static DEFINE_RAW_SPINLOCK(cpu_pm_notifier_lock);
>  static RAW_NOTIFIER_HEAD(cpu_pm_notifier_chain);
>  
>  static int cpu_pm_notify(enum cpu_pm_event event, int nr_to_call, int 
> *nr_calls)
> @@ -50,9 +50,9 @@ int cpu_pm_register_notifier(struct notifier_block *nb)
>   unsigned long flags;
>   int ret;
>  
> - write_lock_irqsave(_pm_notifier_lock, flags);
> + raw_spin_lock_irqsave(_pm_notifier_lock, flags);
>   ret = raw_notifier_chain_register(_pm_notifier_chain, nb);
> - write_unlock_irqrestore(_pm_notifier_lock, flags);
> + raw_spin_unlock_irqrestore(_pm_notifier_lock, flags);
>  
>   return ret;
>  }
> @@ -72,9 +72,9 @@ int cpu_pm_unregister_notifier(struct notifier_block *nb)
>   unsigned long flags;
>   int ret;
>  
> - write_lock_irqsave(_pm_notifier_lock, flags);
> + raw_spin_lock_irqsave(_pm_notifier_lock, flags);
>   ret = raw_notifier_chain_unregister(_pm_notifier_chain, nb);
> - write_unlock_irqrestore(_pm_notifier_lock, flags);
> + raw_spin_unlock_irqrestore(_pm_notifier_lock, flags);
>  
>   return ret;
>  }
> @@ -100,7 +100,7 @@ int cpu_pm_enter(void)
>   int nr_calls;
>   int ret = 0;
>  
> - read_lock(_pm_notifier_lock);
> + raw_spin_lock(_pm_notifier_lock);
>   ret = cpu_pm_notify(CPU_PM_ENTER, -1, _calls);
>   if (ret)
>   /*
> @@ -108,7 +108,7 @@ int cpu_pm_enter(void)
>* PM entry who are notified earlier to prepare for it.
>*/
>   cpu_pm_notify(CPU_PM_ENTER_FAILED, nr_calls - 1, NULL);
> - read_unlock(_pm_notifier_lock);
> + raw_spin_unlock(_pm_notifier_lock);
>  
>   return ret;
>  }
> @@ -130,9 +130,9 @@ int cpu_pm_exit(void)
>  {
>   int ret;
>  
> - read_lock(_pm_notifier_lock);
> + raw_spin_lock(_pm_notifier_lock);
>   ret = cpu_pm_notify(CPU_PM_EXIT, -1, NULL);
> - read_unlock(_pm_notifier_lock);
> + raw_spin_unlock(_pm_notifier_lock);
>  
>   return ret;
>  }
> @@ -159,7 +159,7 @@ int cpu_cluster_pm_enter(void)
>   int nr_calls;
>   int ret = 0;
>  
> - read_lock(_pm_notifier_lock);
> + raw_spin_lock(_pm_notifier_lock);
>   ret = cpu_pm_notify(CPU_CLUSTER_PM_ENTER, -1, _calls);
>   if (ret)
>   /*
> @@ -167,7 +167,7 @@ int cpu_cluster_pm_enter(void)
>* PM entry who are notified earlier to prepare for it.
>*/
>   cpu_pm_notify(CPU_CLUSTER_PM_ENTER_FAILED, nr_calls - 1, NULL);
> - read_unlock(_pm_notifier_lock);
> + raw_spin_unlock(_pm_notifier_lock);
>  
>   return ret;
>  }
> @@ -192,9 +192,9 @@ int cpu_cluster_pm_exit(void)
>  {
>   int ret;
>  
> - read_lock(_pm_notifier_lock);
> + raw_spin_lock(_pm_notifier_lock);
>   ret = 

Re: [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up.

2017-06-15 Thread Paul E. McKenney
On Fri, Jun 16, 2017 at 09:07:57AM +0800, Boqun Feng wrote:
> On Thu, Jun 15, 2017 at 10:56:29AM -0700, Paul E. McKenney wrote:
> [...]
> > > > 
> > > > FWLIW, I agree.  There was a smb_mb() in RT-linux's equivalent of
> > > > swait_activate().
> > > > 
> > > > https://www.spinics.net/lists/linux-rt-users/msg10340.html
> > > > 
> > > > If the barrier goes in swait_active() then we don't have to require all
> > > > of the callers of swait_active and swake_up to issue the barrier
> > > > instead.  Handling this in swait_active is likely to be less error
> > > > prone.  Though, we could also do something like wq_has_sleeper() and use
> > > > that preferentially in swake_up and its variants.
> > > > 
> > > 
> > > I think it makes more sense that we delete the swait_active() in
> > > swake_up()? Because we seems to encourage users to do the quick check on
> > > wait queue on their own, so why do the check again in swake_up()?
> > > Besides, wake_up() doesn't call waitqueue_activie() outside the lock
> > > critical section either.
> > > 
> > > So how about the patch below(Testing is in progress)? Peter?
> > 
> > It is quite possible that a problem I am seeing is caused by this, but
> > there are reasons to believe otherwise.  And in any case, the problem is
> > quite rare, taking tens or perhaps even hundreds of hours of rcutorture
> > to reproduce.
> > 
> > So, would you be willing to create a dedicated swait torture test to check
> > this out?  The usual approach would be to create a circle of kthreads,
> > with each waiting on the previous kthread and waking up the next one.
> > Each kthread, after being awakened, checks a variable that its waker
> > sets just before the wakeup.  Have another kthread check for hangs.
> > 
> > Possibly introduce timeouts and random delays to stir things up a bit.
> > 
> > But maybe such a test already exists.  Does anyone know of one?  I don't
> > see anything obvious.
> > 
> 
> Your waketorture patchset[1] seems to be something similar, at least a
> good start ;-)

Glad I could help!  ;-)

> As we don't know which kind of scenario will trigger the problem easily,
> I will play around with different ones, and hopefully we can find a way.

Makes sense, please let me know how it goes!

Thanx, Paul

> Regards,
> Boqun
> 
> [1]: https://marc.info/?l=linux-kernel=146602969518960
> 
> > Interested?
> > 
> > Thanx, Paul
> > 
> [...]




Re: [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up.

2017-06-15 Thread Paul E. McKenney
On Fri, Jun 16, 2017 at 09:07:57AM +0800, Boqun Feng wrote:
> On Thu, Jun 15, 2017 at 10:56:29AM -0700, Paul E. McKenney wrote:
> [...]
> > > > 
> > > > FWLIW, I agree.  There was a smb_mb() in RT-linux's equivalent of
> > > > swait_activate().
> > > > 
> > > > https://www.spinics.net/lists/linux-rt-users/msg10340.html
> > > > 
> > > > If the barrier goes in swait_active() then we don't have to require all
> > > > of the callers of swait_active and swake_up to issue the barrier
> > > > instead.  Handling this in swait_active is likely to be less error
> > > > prone.  Though, we could also do something like wq_has_sleeper() and use
> > > > that preferentially in swake_up and its variants.
> > > > 
> > > 
> > > I think it makes more sense that we delete the swait_active() in
> > > swake_up()? Because we seems to encourage users to do the quick check on
> > > wait queue on their own, so why do the check again in swake_up()?
> > > Besides, wake_up() doesn't call waitqueue_activie() outside the lock
> > > critical section either.
> > > 
> > > So how about the patch below(Testing is in progress)? Peter?
> > 
> > It is quite possible that a problem I am seeing is caused by this, but
> > there are reasons to believe otherwise.  And in any case, the problem is
> > quite rare, taking tens or perhaps even hundreds of hours of rcutorture
> > to reproduce.
> > 
> > So, would you be willing to create a dedicated swait torture test to check
> > this out?  The usual approach would be to create a circle of kthreads,
> > with each waiting on the previous kthread and waking up the next one.
> > Each kthread, after being awakened, checks a variable that its waker
> > sets just before the wakeup.  Have another kthread check for hangs.
> > 
> > Possibly introduce timeouts and random delays to stir things up a bit.
> > 
> > But maybe such a test already exists.  Does anyone know of one?  I don't
> > see anything obvious.
> > 
> 
> Your waketorture patchset[1] seems to be something similar, at least a
> good start ;-)

Glad I could help!  ;-)

> As we don't know which kind of scenario will trigger the problem easily,
> I will play around with different ones, and hopefully we can find a way.

Makes sense, please let me know how it goes!

Thanx, Paul

> Regards,
> Boqun
> 
> [1]: https://marc.info/?l=linux-kernel=146602969518960
> 
> > Interested?
> > 
> > Thanx, Paul
> > 
> [...]




Re: [PATCH] usb: host: ehci: workaround PME bug on AMD EHCI controller

2017-06-15 Thread Kai-Heng Feng
On Thu, Jun 15, 2017 at 10:12 PM, Alan Stern  wrote:
> On Thu, 15 Jun 2017, Kai-Heng Feng wrote:
>
>> On Wed, Jun 14, 2017 at 1:28 AM, Bjorn Helgaas  wrote:
>> >
>> > The lspci output [1] shows:
>> >
>> >   00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI 
>> > Controller (rev 39) (prog-if 20 [EHCI])
>> > Capabilities: [c0] Power Management version 2
>> >   Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
>> > PME(D0+,D1+,D2+,D3hot+,D3cold+)
>> >   Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-
>> >   Bridge: PM- B3+
>> >
>> > The device claims it can assert PME# from D3hot.  If it can't, that
>> > sounds like a hardware defect that should be addressed with a quirk.
>> > Ideally we would also have a pointer to the AMD hardware erratum.
>>
>> Looks like it's pretty similar to "23 USB Wake on Connect/Disconnect
>> with Low Speed Devices" in [1].
>> It points to a workaround in appendix A2 from [2].
>> However it says this bug only effects Low Speed devices, but it
>> actually also happens on High Speed devices.
>>
>> [1] https://support.amd.com/TechDocs/46837.pdf
>> [2] https://support.amd.com/TechDocs/42413.pdf
>
> Those documents refer to a hardware bug with a workaround in the BIOS.
> Have you checked to see if your BIOS is up to date?

Yes, it's up to date.

>
> Alan Stern
>


Re: [PATCH] usb: host: ehci: workaround PME bug on AMD EHCI controller

2017-06-15 Thread Kai-Heng Feng
On Thu, Jun 15, 2017 at 10:12 PM, Alan Stern  wrote:
> On Thu, 15 Jun 2017, Kai-Heng Feng wrote:
>
>> On Wed, Jun 14, 2017 at 1:28 AM, Bjorn Helgaas  wrote:
>> >
>> > The lspci output [1] shows:
>> >
>> >   00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI 
>> > Controller (rev 39) (prog-if 20 [EHCI])
>> > Capabilities: [c0] Power Management version 2
>> >   Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
>> > PME(D0+,D1+,D2+,D3hot+,D3cold+)
>> >   Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-
>> >   Bridge: PM- B3+
>> >
>> > The device claims it can assert PME# from D3hot.  If it can't, that
>> > sounds like a hardware defect that should be addressed with a quirk.
>> > Ideally we would also have a pointer to the AMD hardware erratum.
>>
>> Looks like it's pretty similar to "23 USB Wake on Connect/Disconnect
>> with Low Speed Devices" in [1].
>> It points to a workaround in appendix A2 from [2].
>> However it says this bug only effects Low Speed devices, but it
>> actually also happens on High Speed devices.
>>
>> [1] https://support.amd.com/TechDocs/46837.pdf
>> [2] https://support.amd.com/TechDocs/42413.pdf
>
> Those documents refer to a hardware bug with a workaround in the BIOS.
> Have you checked to see if your BIOS is up to date?

Yes, it's up to date.

>
> Alan Stern
>


Re: xgetbv nondeterminism

2017-06-15 Thread Andy Lutomirski
On Thu, Jun 15, 2017 at 7:17 PM, H.J. Lu  wrote:
> On Thu, Jun 15, 2017 at 4:28 PM, Andy Lutomirski  wrote:
>> On Thu, Jun 15, 2017 at 4:11 PM, H.J. Lu  wrote:
>>> On Thu, Jun 15, 2017 at 3:45 PM, Andy Lutomirski  wrote:
 On Thu, Jun 15, 2017 at 3:40 PM, H.J. Lu  wrote:
> On Thu, Jun 15, 2017 at 3:18 PM, Andy Lutomirski  wrote:
>> On Thu, Jun 15, 2017 at 7:33 AM, Dave Hansen  
>> wrote:
>>> On 06/14/2017 10:18 PM, Andy Lutomirski wrote:
 Dave, why is XINUSE exposed at all to userspace?
>>>
>>> You need it for XSAVEOPT when it is using the init optimization to be
>>> able to tell which state was written and which state in the XSAVE buffer
>>> is potentially stale with respect to what's in the registers.  I guess
>>> you can just use XSAVE instead of XSAVEOPT, though.
>>>
>>> As you pointed out, if you are using XSAVEC's compaction features by
>>> leaving bits unset in the requested feature bitmap registers, you have
>>> no idea how much data XSAVEC will write, unless you read XINUSE with
>>> XGETBV.  But, you can get around *that* by just presizing the XSAVE
>>> buffer to be big.
>>
>> I imagine that, if you're going to save, do something quick, and
>> restore, you'd be better off allocating a big buffer rather than
>> trying to find the smallest buffer you can get away with by reading
>> XINUSE.  Also, what happens if XINUSE nondeterministically changes out
>> from under you before you do XSAVEC?  I assume you can avoid this
>> becoming a problem by using RFBM carefully.
>>
>>>
>>> So, I guess that leaves its use to just figuring out how much XSAVEOPT
>>> (and friends) are going to write.
>>>
 To be fair, glibc uses this new XGETBV feature, but I suspect its
 usage is rather dubious.  Shouldn't it just do XSAVEC directly rather
 than rolling its own code?
>>>
>>> A quick grep through my glibc source only shows XGETBV(0) used which
>>> reads XCR0.  I don't see any XGETBV(1) which reads XINUSE.  Did I miss 
>>> it.
>>
>> Take a look at sysdeps/x86_64/dl-trampoline.h in a new enough version.
>
> I wrote a test to compare latency against different approaches.   This
> is on Skylake:
>
> [hjl@gnu-skl-1 glibc-test]$ make
> ./test
> move: 47212
> fxsave  : 719440
> xsave   : 925146
> xsavec  : 811036
> xsave_state_size: 1088
> xsave_state_comp_size: 896
>
> load/store is about 17X faster than xsavec.
>
> I put my hjl/pr21265/xsavec branch at
>
> https://sourceware.org/git/?p=glibc.git;a=summary
>
> It uses xsave/xsave/xsavec in _dl_runtime_resolve.

 What is this used for?  Is it just to avoid clobbering argument regs
 when resolving a symbol that uses an ifunc, or is there more to it?
>>>
>>> It is used for lazy binding the first time when an external function is 
>>> called.
>>>
>>
>> Maybe I'm just being dense, but why?  What does ld.so need to do to
>> resolve a symbol and update the GOT that requires using extended
>> state?
>
> Since the first 8 vector registers are used to pass function parameters
> and ld.so uses vector registers, _dl_runtime_resolve needs to preserve
> the first 8 vector registers when transferring control to ld.so.
>

Wouldn't it be faster and more future-proof to recompile the relevant
parts of ld.so to avoid using extended state?

--Andy


Re: xgetbv nondeterminism

2017-06-15 Thread Andy Lutomirski
On Thu, Jun 15, 2017 at 7:17 PM, H.J. Lu  wrote:
> On Thu, Jun 15, 2017 at 4:28 PM, Andy Lutomirski  wrote:
>> On Thu, Jun 15, 2017 at 4:11 PM, H.J. Lu  wrote:
>>> On Thu, Jun 15, 2017 at 3:45 PM, Andy Lutomirski  wrote:
 On Thu, Jun 15, 2017 at 3:40 PM, H.J. Lu  wrote:
> On Thu, Jun 15, 2017 at 3:18 PM, Andy Lutomirski  wrote:
>> On Thu, Jun 15, 2017 at 7:33 AM, Dave Hansen  
>> wrote:
>>> On 06/14/2017 10:18 PM, Andy Lutomirski wrote:
 Dave, why is XINUSE exposed at all to userspace?
>>>
>>> You need it for XSAVEOPT when it is using the init optimization to be
>>> able to tell which state was written and which state in the XSAVE buffer
>>> is potentially stale with respect to what's in the registers.  I guess
>>> you can just use XSAVE instead of XSAVEOPT, though.
>>>
>>> As you pointed out, if you are using XSAVEC's compaction features by
>>> leaving bits unset in the requested feature bitmap registers, you have
>>> no idea how much data XSAVEC will write, unless you read XINUSE with
>>> XGETBV.  But, you can get around *that* by just presizing the XSAVE
>>> buffer to be big.
>>
>> I imagine that, if you're going to save, do something quick, and
>> restore, you'd be better off allocating a big buffer rather than
>> trying to find the smallest buffer you can get away with by reading
>> XINUSE.  Also, what happens if XINUSE nondeterministically changes out
>> from under you before you do XSAVEC?  I assume you can avoid this
>> becoming a problem by using RFBM carefully.
>>
>>>
>>> So, I guess that leaves its use to just figuring out how much XSAVEOPT
>>> (and friends) are going to write.
>>>
 To be fair, glibc uses this new XGETBV feature, but I suspect its
 usage is rather dubious.  Shouldn't it just do XSAVEC directly rather
 than rolling its own code?
>>>
>>> A quick grep through my glibc source only shows XGETBV(0) used which
>>> reads XCR0.  I don't see any XGETBV(1) which reads XINUSE.  Did I miss 
>>> it.
>>
>> Take a look at sysdeps/x86_64/dl-trampoline.h in a new enough version.
>
> I wrote a test to compare latency against different approaches.   This
> is on Skylake:
>
> [hjl@gnu-skl-1 glibc-test]$ make
> ./test
> move: 47212
> fxsave  : 719440
> xsave   : 925146
> xsavec  : 811036
> xsave_state_size: 1088
> xsave_state_comp_size: 896
>
> load/store is about 17X faster than xsavec.
>
> I put my hjl/pr21265/xsavec branch at
>
> https://sourceware.org/git/?p=glibc.git;a=summary
>
> It uses xsave/xsave/xsavec in _dl_runtime_resolve.

 What is this used for?  Is it just to avoid clobbering argument regs
 when resolving a symbol that uses an ifunc, or is there more to it?
>>>
>>> It is used for lazy binding the first time when an external function is 
>>> called.
>>>
>>
>> Maybe I'm just being dense, but why?  What does ld.so need to do to
>> resolve a symbol and update the GOT that requires using extended
>> state?
>
> Since the first 8 vector registers are used to pass function parameters
> and ld.so uses vector registers, _dl_runtime_resolve needs to preserve
> the first 8 vector registers when transferring control to ld.so.
>

Wouldn't it be faster and more future-proof to recompile the relevant
parts of ld.so to avoid using extended state?

--Andy


Re: [PATCHv2 3/3] mm: Use updated pmdp_invalidate() inteface to track dirty/accessed bits

2017-06-15 Thread Minchan Kim
Hello,

On Thu, Jun 15, 2017 at 05:52:24PM +0300, Kirill A. Shutemov wrote:
> This patch uses modifed pmdp_invalidate(), that return previous value of pmd,
> to transfer dirty and accessed bits.
> 
> Signed-off-by: Kirill A. Shutemov 
> ---
>  fs/proc/task_mmu.c |  8 
>  mm/huge_memory.c   | 29 -
>  2 files changed, 16 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index f0c8b33d99b1..f2fc1ef5bba2 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -906,13 +906,13 @@ static inline void clear_soft_dirty(struct 
> vm_area_struct *vma,
>  static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>   unsigned long addr, pmd_t *pmdp)
>  {
> - pmd_t pmd = *pmdp;
> + pmd_t old, pmd = *pmdp;
>  
>   /* See comment in change_huge_pmd() */
> - pmdp_invalidate(vma, addr, pmdp);
> - if (pmd_dirty(*pmdp))
> + old = pmdp_invalidate(vma, addr, pmdp);
> + if (pmd_dirty(old))
>   pmd = pmd_mkdirty(pmd);
> - if (pmd_young(*pmdp))
> + if (pmd_young(old))
>   pmd = pmd_mkyoung(pmd);
>  
>   pmd = pmd_wrprotect(pmd);
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a84909cf20d3..0433e73531bf 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1777,17 +1777,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t 
> *pmd,
>* pmdp_invalidate() is required to make sure we don't miss
>* dirty/young flags set by hardware.
>*/
> - entry = *pmd;
> - pmdp_invalidate(vma, addr, pmd);
> -
> - /*
> -  * Recover dirty/young flags.  It relies on pmdp_invalidate to not
> -  * corrupt them.
> -  */
> - if (pmd_dirty(*pmd))
> - entry = pmd_mkdirty(entry);
> - if (pmd_young(*pmd))
> - entry = pmd_mkyoung(entry);
> + entry = pmdp_invalidate(vma, addr, pmd);
>  
>   entry = pmd_modify(entry, newprot);
>   if (preserve_write)
> @@ -1927,8 +1917,8 @@ static void __split_huge_pmd_locked(struct 
> vm_area_struct *vma, pmd_t *pmd,
>   struct mm_struct *mm = vma->vm_mm;
>   struct page *page;
>   pgtable_t pgtable;
> - pmd_t _pmd;
> - bool young, write, dirty, soft_dirty;
> + pmd_t old, _pmd;
> + bool young, write, soft_dirty;
>   unsigned long addr;
>   int i;
>  
> @@ -1965,7 +1955,6 @@ static void __split_huge_pmd_locked(struct 
> vm_area_struct *vma, pmd_t *pmd,
>   page_ref_add(page, HPAGE_PMD_NR - 1);
>   write = pmd_write(*pmd);
>   young = pmd_young(*pmd);
> - dirty = pmd_dirty(*pmd);
>   soft_dirty = pmd_soft_dirty(*pmd);
>  
>   pmdp_huge_split_prepare(vma, haddr, pmd);
> @@ -1995,8 +1984,6 @@ static void __split_huge_pmd_locked(struct 
> vm_area_struct *vma, pmd_t *pmd,
>   if (soft_dirty)
>   entry = pte_mksoft_dirty(entry);
>   }
> - if (dirty)
> - SetPageDirty(page + i);
>   pte = pte_offset_map(&_pmd, addr);
>   BUG_ON(!pte_none(*pte));
>   set_pte_at(mm, addr, pte, entry);
> @@ -2045,7 +2032,15 @@ static void __split_huge_pmd_locked(struct 
> vm_area_struct *vma, pmd_t *pmd,
>* and finally we write the non-huge version of the pmd entry with
>* pmd_populate.
>*/
> - pmdp_invalidate(vma, haddr, pmd);
> + old = pmdp_invalidate(vma, haddr, pmd);
> +
> + /*
> +  * Transfer dirty bit using value returned by pmd_invalidate() to be
> +  * sure we don't race with CPU that can set the bit under us.
> +  */
> + if (pmd_dirty(old))
> + SetPageDirty(page);
> +

When I see this, without this patch, MADV_FREE has been broken because
it can lose dirty bit by early checking. Right?
If so, isn't it a candidate for -stable?


Re: [PATCHv2 3/3] mm: Use updated pmdp_invalidate() inteface to track dirty/accessed bits

2017-06-15 Thread Minchan Kim
Hello,

On Thu, Jun 15, 2017 at 05:52:24PM +0300, Kirill A. Shutemov wrote:
> This patch uses modifed pmdp_invalidate(), that return previous value of pmd,
> to transfer dirty and accessed bits.
> 
> Signed-off-by: Kirill A. Shutemov 
> ---
>  fs/proc/task_mmu.c |  8 
>  mm/huge_memory.c   | 29 -
>  2 files changed, 16 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index f0c8b33d99b1..f2fc1ef5bba2 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -906,13 +906,13 @@ static inline void clear_soft_dirty(struct 
> vm_area_struct *vma,
>  static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>   unsigned long addr, pmd_t *pmdp)
>  {
> - pmd_t pmd = *pmdp;
> + pmd_t old, pmd = *pmdp;
>  
>   /* See comment in change_huge_pmd() */
> - pmdp_invalidate(vma, addr, pmdp);
> - if (pmd_dirty(*pmdp))
> + old = pmdp_invalidate(vma, addr, pmdp);
> + if (pmd_dirty(old))
>   pmd = pmd_mkdirty(pmd);
> - if (pmd_young(*pmdp))
> + if (pmd_young(old))
>   pmd = pmd_mkyoung(pmd);
>  
>   pmd = pmd_wrprotect(pmd);
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a84909cf20d3..0433e73531bf 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1777,17 +1777,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t 
> *pmd,
>* pmdp_invalidate() is required to make sure we don't miss
>* dirty/young flags set by hardware.
>*/
> - entry = *pmd;
> - pmdp_invalidate(vma, addr, pmd);
> -
> - /*
> -  * Recover dirty/young flags.  It relies on pmdp_invalidate to not
> -  * corrupt them.
> -  */
> - if (pmd_dirty(*pmd))
> - entry = pmd_mkdirty(entry);
> - if (pmd_young(*pmd))
> - entry = pmd_mkyoung(entry);
> + entry = pmdp_invalidate(vma, addr, pmd);
>  
>   entry = pmd_modify(entry, newprot);
>   if (preserve_write)
> @@ -1927,8 +1917,8 @@ static void __split_huge_pmd_locked(struct 
> vm_area_struct *vma, pmd_t *pmd,
>   struct mm_struct *mm = vma->vm_mm;
>   struct page *page;
>   pgtable_t pgtable;
> - pmd_t _pmd;
> - bool young, write, dirty, soft_dirty;
> + pmd_t old, _pmd;
> + bool young, write, soft_dirty;
>   unsigned long addr;
>   int i;
>  
> @@ -1965,7 +1955,6 @@ static void __split_huge_pmd_locked(struct 
> vm_area_struct *vma, pmd_t *pmd,
>   page_ref_add(page, HPAGE_PMD_NR - 1);
>   write = pmd_write(*pmd);
>   young = pmd_young(*pmd);
> - dirty = pmd_dirty(*pmd);
>   soft_dirty = pmd_soft_dirty(*pmd);
>  
>   pmdp_huge_split_prepare(vma, haddr, pmd);
> @@ -1995,8 +1984,6 @@ static void __split_huge_pmd_locked(struct 
> vm_area_struct *vma, pmd_t *pmd,
>   if (soft_dirty)
>   entry = pte_mksoft_dirty(entry);
>   }
> - if (dirty)
> - SetPageDirty(page + i);
>   pte = pte_offset_map(&_pmd, addr);
>   BUG_ON(!pte_none(*pte));
>   set_pte_at(mm, addr, pte, entry);
> @@ -2045,7 +2032,15 @@ static void __split_huge_pmd_locked(struct 
> vm_area_struct *vma, pmd_t *pmd,
>* and finally we write the non-huge version of the pmd entry with
>* pmd_populate.
>*/
> - pmdp_invalidate(vma, haddr, pmd);
> + old = pmdp_invalidate(vma, haddr, pmd);
> +
> + /*
> +  * Transfer dirty bit using value returned by pmd_invalidate() to be
> +  * sure we don't race with CPU that can set the bit under us.
> +  */
> + if (pmd_dirty(old))
> + SetPageDirty(page);
> +

When I see this, without this patch, MADV_FREE has been broken because
it can lose dirty bit by early checking. Right?
If so, isn't it a candidate for -stable?


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Daniel Micay
> https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/ in the
> 4.13/scsi-queue and for-next branches. I think that's why Kees didn't
> include it but I get he needs to add that.

s/get/guess/

Or is that repo supposed to get pulled into next?


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Daniel Micay
> https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/ in the
> 4.13/scsi-queue and for-next branches. I think that's why Kees didn't
> include it but I get he needs to add that.

s/get/guess/

Or is that repo supposed to get pulled into next?


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Daniel Micay
On Fri, 2017-06-16 at 11:30 +1000, Stephen Rothwell wrote:
> Hi Kees,
> 
> After merging the kspp tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> In file included from include/linux/bitmap.h:8:0,
>  from include/linux/cpumask.h:11,
>  from arch/x86/include/asm/cpumask.h:4,
>  from arch/x86/include/asm/msr.h:10,
>  from arch/x86/include/asm/processor.h:20,
>  from arch/x86/include/asm/cpufeature.h:4,
>  from arch/x86/include/asm/thread_info.h:52,
>  from include/linux/thread_info.h:37,
>  from arch/x86/include/asm/preempt.h:6,
>  from include/linux/preempt.h:80,
>  from include/linux/spinlock.h:50,
>  from include/linux/mmzone.h:7,
>  from include/linux/gfp.h:5,
>  from include/linux/slab.h:14,
>  from drivers/scsi/csiostor/csio_lnode.c:37:
> In function 'memcpy',
> inlined from 'csio_append_attrib' at
> drivers/scsi/csiostor/csio_lnode.c:248:2,
> inlined from 'csio_ln_fdmi_dprt_cbfn' at
> drivers/scsi/csiostor/csio_lnode.c:471:2:
> include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object
> passed as 2nd parameter
> __read_overflow2();
> ^
> In function 'memcpy',
> inlined from 'csio_append_attrib' at
> drivers/scsi/csiostor/csio_lnode.c:248:2,
> inlined from 'csio_ln_fdmi_rhba_cbfn' at
> drivers/scsi/csiostor/csio_lnode.c:337:2:
> include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object
> passed as 2nd parameter
> __read_overflow2();
> ^
> 
> Caused by commit
> 
>   b90d6eba50d7 ("include/linux/string.h: add the option of fortified
> string.h functions")
> 
> I have reverted that commit for today.

That's this one: https://lkml.org/lkml/2017/5/9/613, which is in
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/ in the
4.13/scsi-queue and for-next branches. I think that's why Kees didn't
include it but I get he needs to add that.


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Daniel Micay
On Fri, 2017-06-16 at 11:30 +1000, Stephen Rothwell wrote:
> Hi Kees,
> 
> After merging the kspp tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> In file included from include/linux/bitmap.h:8:0,
>  from include/linux/cpumask.h:11,
>  from arch/x86/include/asm/cpumask.h:4,
>  from arch/x86/include/asm/msr.h:10,
>  from arch/x86/include/asm/processor.h:20,
>  from arch/x86/include/asm/cpufeature.h:4,
>  from arch/x86/include/asm/thread_info.h:52,
>  from include/linux/thread_info.h:37,
>  from arch/x86/include/asm/preempt.h:6,
>  from include/linux/preempt.h:80,
>  from include/linux/spinlock.h:50,
>  from include/linux/mmzone.h:7,
>  from include/linux/gfp.h:5,
>  from include/linux/slab.h:14,
>  from drivers/scsi/csiostor/csio_lnode.c:37:
> In function 'memcpy',
> inlined from 'csio_append_attrib' at
> drivers/scsi/csiostor/csio_lnode.c:248:2,
> inlined from 'csio_ln_fdmi_dprt_cbfn' at
> drivers/scsi/csiostor/csio_lnode.c:471:2:
> include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object
> passed as 2nd parameter
> __read_overflow2();
> ^
> In function 'memcpy',
> inlined from 'csio_append_attrib' at
> drivers/scsi/csiostor/csio_lnode.c:248:2,
> inlined from 'csio_ln_fdmi_rhba_cbfn' at
> drivers/scsi/csiostor/csio_lnode.c:337:2:
> include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object
> passed as 2nd parameter
> __read_overflow2();
> ^
> 
> Caused by commit
> 
>   b90d6eba50d7 ("include/linux/string.h: add the option of fortified
> string.h functions")
> 
> I have reverted that commit for today.

That's this one: https://lkml.org/lkml/2017/5/9/613, which is in
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/ in the
4.13/scsi-queue and for-next branches. I think that's why Kees didn't
include it but I get he needs to add that.


Re: [PATCH] ppc64/perf: Fix oops when kthread execs user process

2017-06-15 Thread Ravi Bangoria
Thanks Naveen,

On Thursday 15 June 2017 08:57 PM, Naveen N. Rao wrote:
> Hmm... are you sure it's the same issue? The above commit only went into 
> v4.7, which means we weren't able to use --call-graph=dwarf till v4.7.

Yes sorry. It's from v4.7 onwards.

-Ravi



Re: [PATCH] ppc64/perf: Fix oops when kthread execs user process

2017-06-15 Thread Ravi Bangoria
Thanks Naveen,

On Thursday 15 June 2017 08:57 PM, Naveen N. Rao wrote:
> Hmm... are you sure it's the same issue? The above commit only went into 
> v4.7, which means we weren't able to use --call-graph=dwarf till v4.7.

Yes sorry. It's from v4.7 onwards.

-Ravi



Re: xgetbv nondeterminism

2017-06-15 Thread H.J. Lu
On Thu, Jun 15, 2017 at 4:37 PM, Dave Hansen  wrote:
> On 06/15/2017 03:18 PM, Andy Lutomirski wrote:
>>> As you pointed out, if you are using XSAVEC's compaction features by
>>> leaving bits unset in the requested feature bitmap registers, you have
>>> no idea how much data XSAVEC will write, unless you read XINUSE with
>>> XGETBV.  But, you can get around *that* by just presizing the XSAVE
>>> buffer to be big.
>> I imagine that, if you're going to save, do something quick, and
>> restore, you'd be better off allocating a big buffer rather than
>> trying to find the smallest buffer you can get away with by reading
>> XINUSE.  Also, what happens if XINUSE nondeterministically changes out
>> from under you before you do XSAVEC?  I assume you can avoid this
>> becoming a problem by using RFBM carefully.
>
> That's a good point.  HJ, does your XGETBV1 code disable signals, btw?
> A signal could theoretically change XINUSE if the handler modified the
> on-stack XSAVE state before sigreturn.

We are preserving the first 8 vector registers used by caller to pass
function parameters.  As long as signal doesn't change XINUSE from
yes to no, we are OK.

> Also, your glibc patch talks a lot about the upper parts of the register
> being zeroed, but that isn't precisely what XGETBV1 does, right?  It
> tells you whether the upper portion of the registers are in the init
> state.  But, the high parts of the registers could be zero, and not in
> the init state, rights?

We are preserving the first 8 vector registers used by caller to pass
function parameters.   We only care the vector bits used to pass
function parameters.   If caller doesn't use the upper bits, we will
always zero the upper bits upon exit.

> I'm missing something, though...  Is the stuff in question here called
> *every* time one of these AVX-using functions is called, or called only
> the first time when the binding is done?

Only the first time.  If you pass "-z now" to linker and set LD_BIND_NOW=1,
_dl_runtime_resolve won't be used.

> It's also bonkers that software has to go to this trouble.  This is
> precisely what XSAVEOPT is supposed to do for us.


-- 
H.J.


Re: xgetbv nondeterminism

2017-06-15 Thread H.J. Lu
On Thu, Jun 15, 2017 at 4:37 PM, Dave Hansen  wrote:
> On 06/15/2017 03:18 PM, Andy Lutomirski wrote:
>>> As you pointed out, if you are using XSAVEC's compaction features by
>>> leaving bits unset in the requested feature bitmap registers, you have
>>> no idea how much data XSAVEC will write, unless you read XINUSE with
>>> XGETBV.  But, you can get around *that* by just presizing the XSAVE
>>> buffer to be big.
>> I imagine that, if you're going to save, do something quick, and
>> restore, you'd be better off allocating a big buffer rather than
>> trying to find the smallest buffer you can get away with by reading
>> XINUSE.  Also, what happens if XINUSE nondeterministically changes out
>> from under you before you do XSAVEC?  I assume you can avoid this
>> becoming a problem by using RFBM carefully.
>
> That's a good point.  HJ, does your XGETBV1 code disable signals, btw?
> A signal could theoretically change XINUSE if the handler modified the
> on-stack XSAVE state before sigreturn.

We are preserving the first 8 vector registers used by caller to pass
function parameters.  As long as signal doesn't change XINUSE from
yes to no, we are OK.

> Also, your glibc patch talks a lot about the upper parts of the register
> being zeroed, but that isn't precisely what XGETBV1 does, right?  It
> tells you whether the upper portion of the registers are in the init
> state.  But, the high parts of the registers could be zero, and not in
> the init state, rights?

We are preserving the first 8 vector registers used by caller to pass
function parameters.   We only care the vector bits used to pass
function parameters.   If caller doesn't use the upper bits, we will
always zero the upper bits upon exit.

> I'm missing something, though...  Is the stuff in question here called
> *every* time one of these AVX-using functions is called, or called only
> the first time when the binding is done?

Only the first time.  If you pass "-z now" to linker and set LD_BIND_NOW=1,
_dl_runtime_resolve won't be used.

> It's also bonkers that software has to go to this trouble.  This is
> precisely what XSAVEOPT is supposed to do for us.


-- 
H.J.


Re: [PATCH v2 1/1] Add Trusted Path Execution as a stackable LSM

2017-06-15 Thread kbuild test robot
Hi Matt,

[auto build test WARNING on security/next]
[also build test WARNING on v4.12-rc5 next-20170615]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Matt-Brown/Add-Trusted-Path-Execution-as-a-stackable-LSM/20170609-115004
base:   
https://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> security/tpe/tpe_lsm.c:45:5: sparse: symbol 'print_tpe_error' was not 
>> declared. Should it be static?
>> security/tpe/tpe_lsm.c:128:5: sparse: symbol 'tpe_mmap_file' was not 
>> declared. Should it be static?
>> security/tpe/tpe_lsm.c:137:5: sparse: symbol 'tpe_file_mprotect' was not 
>> declared. Should it be static?
>> security/tpe/tpe_lsm.c:160:17: sparse: symbol 'tpe_sysctl_path' was not 
>> declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


[RFC PATCH] print_tpe_error() can be static

2017-06-15 Thread kbuild test robot

Signed-off-by: Fengguang Wu 
---
 tpe_lsm.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/security/tpe/tpe_lsm.c b/security/tpe/tpe_lsm.c
index fda811a..77d2498 100644
--- a/security/tpe/tpe_lsm.c
+++ b/security/tpe/tpe_lsm.c
@@ -42,7 +42,7 @@ static int tpe_strict __read_mostly = 
IS_ENABLED(CONFIG_SECURITY_TPE_STRICT);
 static int tpe_restrict_root __read_mostly =
IS_ENABLED(CONFIG_SECURITY_TPE_RESTRICT_ROOT);
 
-int print_tpe_error(struct file *file, char *reason1, char *reason2,
+static int print_tpe_error(struct file *file, char *reason1, char *reason2,
char *method)
 {
char *filepath;
@@ -125,7 +125,7 @@ static int tpe_check(struct file *file, char *method)
return 0;
 }
 
-int tpe_mmap_file(struct file *file, unsigned long reqprot,
+static int tpe_mmap_file(struct file *file, unsigned long reqprot,
unsigned long prot, unsigned long flags)
 {
if (!file || !(prot & PROT_EXEC))
@@ -134,7 +134,7 @@ int tpe_mmap_file(struct file *file, unsigned long reqprot,
return tpe_check(file, "mmap");
 }
 
-int tpe_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
+static int tpe_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
unsigned long prot)
 {
if (!vma->vm_file)
@@ -157,7 +157,7 @@ static struct security_hook_list tpe_hooks[] = {
 };
 
 #ifdef CONFIG_SYSCTL
-struct ctl_path tpe_sysctl_path[] = {
+static struct ctl_path tpe_sysctl_path[] = {
{ .procname = "kernel", },
{ .procname = "tpe", },
{ }


Re: [PATCH v2 1/1] Add Trusted Path Execution as a stackable LSM

2017-06-15 Thread kbuild test robot
Hi Matt,

[auto build test WARNING on security/next]
[also build test WARNING on v4.12-rc5 next-20170615]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Matt-Brown/Add-Trusted-Path-Execution-as-a-stackable-LSM/20170609-115004
base:   
https://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> security/tpe/tpe_lsm.c:45:5: sparse: symbol 'print_tpe_error' was not 
>> declared. Should it be static?
>> security/tpe/tpe_lsm.c:128:5: sparse: symbol 'tpe_mmap_file' was not 
>> declared. Should it be static?
>> security/tpe/tpe_lsm.c:137:5: sparse: symbol 'tpe_file_mprotect' was not 
>> declared. Should it be static?
>> security/tpe/tpe_lsm.c:160:17: sparse: symbol 'tpe_sysctl_path' was not 
>> declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


[RFC PATCH] print_tpe_error() can be static

2017-06-15 Thread kbuild test robot

Signed-off-by: Fengguang Wu 
---
 tpe_lsm.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/security/tpe/tpe_lsm.c b/security/tpe/tpe_lsm.c
index fda811a..77d2498 100644
--- a/security/tpe/tpe_lsm.c
+++ b/security/tpe/tpe_lsm.c
@@ -42,7 +42,7 @@ static int tpe_strict __read_mostly = 
IS_ENABLED(CONFIG_SECURITY_TPE_STRICT);
 static int tpe_restrict_root __read_mostly =
IS_ENABLED(CONFIG_SECURITY_TPE_RESTRICT_ROOT);
 
-int print_tpe_error(struct file *file, char *reason1, char *reason2,
+static int print_tpe_error(struct file *file, char *reason1, char *reason2,
char *method)
 {
char *filepath;
@@ -125,7 +125,7 @@ static int tpe_check(struct file *file, char *method)
return 0;
 }
 
-int tpe_mmap_file(struct file *file, unsigned long reqprot,
+static int tpe_mmap_file(struct file *file, unsigned long reqprot,
unsigned long prot, unsigned long flags)
 {
if (!file || !(prot & PROT_EXEC))
@@ -134,7 +134,7 @@ int tpe_mmap_file(struct file *file, unsigned long reqprot,
return tpe_check(file, "mmap");
 }
 
-int tpe_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
+static int tpe_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
unsigned long prot)
 {
if (!vma->vm_file)
@@ -157,7 +157,7 @@ static struct security_hook_list tpe_hooks[] = {
 };
 
 #ifdef CONFIG_SYSCTL
-struct ctl_path tpe_sysctl_path[] = {
+static struct ctl_path tpe_sysctl_path[] = {
{ .procname = "kernel", },
{ .procname = "tpe", },
{ }


  1   2   3   4   5   6   7   8   9   10   >