date:20160809

Re: [PATCH v5 7/8] PM / devfreq: rockchip: add devfreq driver for rk3399 dmc

2016-08-09 Thread Chanwoo Choi

Hi Lin,

I've added some comments below.

If you address them, feel free to add my Reviewed-by tag in the next version:
Reviewed-by: Chanwoo Choi 

On 2016-08-10 12:26, Lin Huang wrote:
> base on dfi result, we do ddr frequency scaling, register
> dmc driver to devfreq framework, and use simple-ondemand
> policy.
> 
> Signed-off-by: Lin Huang 
> ---
> Changes in v5:
> - improve dmc driver suggest by Chanwoo Choi
> 
> Changes in v4:
> - use arm_smccc_smc() function talk to bl31
> - delete rockchip_dmc.c file and config
> - delete dmc_notify
> - adjust probe order
> 
> Changes in v3:
> - operate dram setting through sip call
> - imporve set rate flow
> 
> Changes in v2:
> - None
> 
> Changes in v1:
> - move dfi controller to event
> - fix set voltage sequence when set rate fail
> - change Kconfig type from tristate to bool
> - move unuse EXPORT_SYMBOL_GPL()
> 
>  drivers/devfreq/Kconfig  |   9 +
>  drivers/devfreq/Makefile |   1 +
>  drivers/devfreq/rk3399_dmc.c | 512 
> +++
>  3 files changed, 522 insertions(+)
>  create mode 100644 drivers/devfreq/rk3399_dmc.c
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index a5be56e..749499d 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -100,6 +100,15 @@ config ARM_TEGRA_DEVFREQ
>   It reads ACTMON counters of memory controllers and adjusts the
>   operating frequencies and voltages with OPP support.
>  
> +config ARM_RK3399_DMC_DEVFREQ
> + tristate "ARM RK3399 DMC DEVFREQ Driver"

depends on ARCH_ROCKCHIP?

> + select PM_OPP
> + select DEVFREQ_GOV_SIMPLE_ONDEMAND

This entry also needs the following select statement in Kconfig:
select DEVFREQ_EVENT_ROCKCHIP_DFI

> + help
> +  This adds the DEVFREQ driver for the RK3399 dmc(Dynamic Memory 
> Controller).

Use capital letters for an abbreviation.

s/dmc -> DMC

> +  It sets the frequency for the memory controller and reads the 
> usage counts
> +  from hardware.
> +
>  source "drivers/devfreq/event/Kconfig"
>  
>  endif # PM_DEVFREQ
> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
> index 09f11d9..70d9549 100644
> --- a/drivers/devfreq/Makefile
> +++ b/drivers/devfreq/Makefile
> @@ -9,6 +9,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE) += governor_passive.o
>  # DEVFREQ Drivers
>  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ) += exynos-bus.o
>  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)  += tegra-devfreq.o
> +obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ) += rk3399_dmc.o

CONFIG_ARM_RK3399_DMC_DEVFREQ should be placed between
EXYNOS_BUS_DEVFREQ and TEGRA_DEVFREQ to keep the entries in alphabetical order.

>  
>  # DEVFREQ Event Drivers
>  obj-$(CONFIG_PM_DEVFREQ_EVENT)   += event/
> diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
> new file mode 100644
> index 000..c1157ba
> --- /dev/null
> +++ b/drivers/devfreq/rk3399_dmc.c
> @@ -0,0 +1,512 @@
> +/*
> + * Copyright (c) 2016, Fuzhou Rockchip Electronics Co., Ltd.
> + * Author: Lin Huang 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +struct dram_timing {
> + unsigned int ddr3_speed_bin;
> + unsigned int pd_idle;
> + unsigned int sr_idle;
> + unsigned int sr_mc_gate_idle;
> + unsigned int srpd_lite_idle;
> + unsigned int standby_idle;
> + unsigned int dram_dll_dis_freq;
> + unsigned int phy_dll_dis_freq;
> + unsigned int ddr3_odt_dis_freq;
> + unsigned int ddr3_drv;
> + unsigned int ddr3_odt;
> + unsigned int phy_ddr3_ca_drv;
> + unsigned int phy_ddr3_dq_drv;
> + unsigned int phy_ddr3_odt;
> + unsigned int lpddr3_odt_dis_freq;
> + unsigned int lpddr3_drv;
> + unsigned int lpddr3_odt;
> + unsigned int phy_lpddr3_ca_drv;
> + unsigned int phy_lpddr3_dq_drv;
> + unsigned int phy_lpddr3_odt;
> + unsigned int lpddr4_odt_dis_freq;
> + unsigned int lpddr4_drv;
> + unsigned int lpddr4_dq_odt;
> + unsigned int lpddr4_ca_odt;
> + unsigned int phy_lpddr4_ca_drv;
> + unsigned int phy_lpddr4_ck_cs_drv;
> + unsigned int phy_lpddr4_dq_drv;
> + unsigned int phy_lpddr4_odt;
> +};
> +
> +struct rk3399_dmcfreq {
> + struct device *dev;
> + struct devfreq *devfreq;
> + struct devfreq_simple_ondemand_data ondemand_data;
> +

Re: [RFC][PATCH 3/4] arm64: dts: hikey: Add hikey support for syscon-reboot-mode

2016-08-09 Thread John Stultz

On Tue, Aug 9, 2016 at 9:34 PM, Bjorn Andersson
 wrote:
> On Mon 08 Aug 16:03 PDT 2016, John Stultz wrote:
>
> [..]
>> diff --git a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts 
>> b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
> [..]
>>   memory@0 {
>>   device_type = "memory";
>>   reg = <0x 0x 0x 0x05e0>,
>> -   <0x 0x05f0 0x 0x00eff000>,
>> +   <0x 0x05f0 0x 0x1000>,
>> +   <0x 0x05f02000 0x 0x00efd000>,
>> <0x 0x06e0 0x 0x0060f000>,
>> <0x 0x0741 0x 0x36bf>;
>>   };
>
> As I said when we talked about this, I don't think you should punch holes
> in the /memory node, but rather add these regions as no-map in a
> /reserved-memory node. But that's an unrelated matter.

Yea. I need to sync w/ Wei and Guodong to see about reworking all of
those to use /reserved-memory, but for now I'd like to stay in sync w/
how they have it setup.

>>
>> + reboot-mode-syscon@5f01000 {
>> + compatible = "syscon", "simple-mfd";
>> + reg = <0x0 0x05f01000 0x0 0x1000>;
>> +
>> + reboot-mode@0 {
>
> Drop the @0

Will do.


>
> Other than that:
> Acked-by: Bjorn Andersson 

Thanks!
-john

Re: [RFC PATCH 2/3] net: macb: Add support for 1588 for Zynq Ultrascale+ MPSoC

2016-08-09 Thread Harini Katakam

Hi Nicolas,

Thanks for your reply

On Tue, Aug 9, 2016 at 10:26 PM, Punnaiah Choudary Kalluri
 wrote:
> Hi Nicolas,
>
> The 1588 implementation in the Cadence GEM IP we have in Zynq Ultrascale+ MPSoC is
> different from the one in Zynq SoC.
>
> In the earlier version, all timestamp values are stored in registers, and
> there is no specific mechanism to distinguish a received Ethernet frame that
> contains timestamp information other than parsing the frame for the PTP
> packet type.
>
> We have a basic implementation for the earlier version in our out-of-tree driver,
> which is going to be deprecated soon. You could also check the driver below for 1588 support.
> https://gitenterprise.xilinx.com/Linux/linux-xlnx/blob/master/drivers/net/ethernet/xilinx/xilinx_emacps.c
>
>
> Regards,
> Punnaiah
>
>> -Original Message-
>> From: Nicolas Ferre [mailto:nicolas.fe...@atmel.com]
>> Sent: Tuesday, August 09, 2016 10:10 PM
>> To: Harini Katakam ; Harini Katakam
>> ; Andrei Pistirica 
>> Cc: da...@davemloft.net; Boris Brezillon > electrons.com>; alexandre.bell...@free-electrons.com;
>> net...@vger.kernel.org; linux-kernel@vger.kernel.org;
>> devicet...@vger.kernel.org; Punnaiah Choudary Kalluri
>> ; Michal Simek ; Anirudha
>> Sarangi 
>> Subject: Re: [RFC PATCH 2/3] net: macb: Add support for 1588 for Zynq
>> Ultrascale+ MPSoC
>>
>> On 21/09/2015 at 19:49, Harini Katakam wrote:
>> > On Fri, Sep 11, 2015 at 1:27 PM, Harini Katakam
>> >  wrote:
>> >> Cadence GEM in Zynq Ultrascale+ MPSoC supports 1588 and provides a
>> >> 102 bit time counter with 48 bits for seconds, 30 bits for nsecs and
>> >> 24 bits for sub-nsecs. The timestamp is made available to the SW through
>> >> registers as well as (more precisely) through upper two words in
>> >> an extended BD.
>> >>
>> >> This patch does the following:
>> >> - Adds MACB_CAPS_TSU in zynqmp_config.
>> >> - Registers to ptp clock framework (after checking for timestamp support
>> in
>> >>   IP and capability in config).
>> >> - TX BD and RX BD control registers are written to populate timestamp in
>> >>   extended BD words.
>> >> - Timer initialization is done by writing time of day to the timer 
>> >> counter.
>> >> - ns increment register is programmed as NS_PER_SEC/TSU_CLK.
>> >>   For a 24 bit subns precision, the subns increment equals
>> >>   remainder of (NS_PER_SEC/TSU_CLK) * (2^24).
>> >>   TSU (Time stamp unit) clock is obtained by the  driver from devicetree.
>> >> - HW time stamp capabilities are advertised via ethtool and macb ioctl is
>> >>   updated accordingly.
>> >> - For all PTP event frames, nanoseconds and the lower 5 bits of seconds
>> are
>> >>   obtained from the BD. This offers a precise timestamp. The upper bits
>> >>   (which don't vary between consecutive packets) are obtained from the
>> >>   TX/RX PTP event/PEER registers. The timestamp obtained thus is
>> updated
>> >>   in skb for upper layers to access.
>> >> - The drivers register functions with ptp to perform time and frequency
>> >>   adjustment.
>> >> - Time adjustment is done by writing to the 1588_ADJUST register.
>> >>   The controller will read the delta in this register and update the timer
>> >>   counter register. Alternatively, for large time offset adjustments,
>> >>   the driver reads the secs and nsecs counter values, adds/subtracts the
>> >>   delta and updates the timer counter. In order to be as precise as
>> possible,
>> >>   nsecs counter is read again if secs has incremented during the counter
>> read.
>> >> - Frequency adjustment is not directly supported by this IP.
>> >>   addend is the initial value of the ns increment, and similarly addendsub.
>> >>   The ppb (parts per billion) provided is used as
>> >>   ns_incr = addend +/- (ppb/rate).
>> >>   Similarly the remainder of the above is used to populate subns
>> increment.
>> >>   In case the ppb requested is negative AND subns adjustment greater
>> than
>> >>   the addendsub, ns_incr is reduced by 1 and subns_incr is adjusted in
>> >>   positive accordingly.
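
The increment calculation described in the list above can be sketched in plain C. This is only an illustration under stated assumptions, not the code from the patch: the struct tsu_incr type and the tsu_calc_incr() helper are hypothetical names, and the sub-nanosecond value is assumed to be the fractional part of NS_PER_SEC/TSU_CLK scaled by 2^24.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL
#define SUBNS_BITS 24 /* sub-ns precision stated in the patch description */

/* Hypothetical container for the two increment values. */
struct tsu_incr {
	uint32_t ns;    /* whole nanoseconds added per TSU clock tick */
	uint32_t subns; /* fractional part, in units of 2^-24 ns */
};

/* Derive the per-tick increments from the TSU clock rate in Hz. */
static struct tsu_incr tsu_calc_incr(uint32_t tsu_clk_hz)
{
	struct tsu_incr incr;
	uint64_t rem;

	assert(tsu_clk_hz != 0);

	incr.ns = (uint32_t)(NS_PER_SEC / tsu_clk_hz);
	rem = NS_PER_SEC % tsu_clk_hz;
	/* Assumed interpretation: scale the remainder by 2^24, then divide. */
	incr.subns = (uint32_t)((rem << SUBNS_BITS) / tsu_clk_hz);
	return incr;
}
```

With a clock rate that divides NS_PER_SEC evenly, the sub-nanosecond increment is zero; otherwise it carries the truncated fraction.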
>> >>
>> >> Signed-off-by: Harini Katakam :
>> >> ---
>> >>  drivers/net/ethernet/cadence/macb.c |  372
>> ++-
>> >>  drivers/net/ethernet/cadence/macb.h |   64 ++
>> >>  2 files changed, 428 insertions(+), 8 deletions(-)
>> >>
>> >> diff --git a/drivers/net/ethernet/cadence/macb.c
>> b/drivers/net/ethernet/cadence/macb.c
>> >> index bb2932c..b531008 100644
>> >> --- a/drivers/net/ethernet/cadence/macb.c
>> >> +++ b/drivers/net/ethernet/cadence/macb.c
>> >> @@ -30,6 +30,8 @@
>> >>  #include 
>> >>  #include 
>>
>> [..]
>>
>> >> +   unsigned int ns_incr;
>> >> +   unsigned int subns_incr;
>> >>  };
>> >>
>> >>  static inline bool macb_is_gem(struct macb *bp)
>> >> --
>> >> 1.7.9.5
>> >
>> > Ping
>> >
>> > Thanks.
>>
>> Harini,
>>
>> I come back to this patch of last year and I'm sorry about being so late
>> answering you.
>>
>> Andrei who is added to the discussion will have some time to deal with
>> this fe

Re: [PATCH v5 7/8] PM / devfreq: rockchip: add devfreq driver for rk3399 dmc

2016-08-09 Thread kbuild test robot

Hi Lin,

[auto build test WARNING on v4.8-rc1]
[also build test WARNING on next-20160809]
[cannot apply to rockchip/for-next devfreq/for-rafael linux/master]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Lin-Huang/rk3399-support-ddr-frequency-scaling/20160810-114433


coccinelle warnings: (new ones prefixed by >>)

>> drivers/devfreq/rk3399_dmc.c:393:2-3: Unneeded semicolon

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation

Re: [PATCH v5 4/8] Documentation: bindings: add dt documentation for dfi controller

2016-08-09 Thread Chanwoo Choi

Hi Lin,

On 2016-08-10 12:26, Lin Huang wrote:
> This patch adds the documentation for rockchip dfi devfreq-event driver.
> 
> Signed-off-by: Lin Huang 
> ---
> Changes in v5:
> -None
> 
> Changes in v4:
> -None
> 
> Changes in v3:
> -None
> 
> Changes in v2:
> -None 
> 
> Changes in v1:
> -None
> 
>  .../bindings/devfreq/event/rockchip-dfi.txt  | 20 
> 
>  1 file changed, 20 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
> 
> diff --git a/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt 
> b/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
> new file mode 100644
> index 000..bf42255
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
> @@ -0,0 +1,20 @@
> +
> +* Rockchip rk3399 DFI device
> +
> +Required properties:
> +- compatible: Must be "rockchip,rk3399-dfi".
> +- reg: physical base address of each DFI and length of memory mapped region
> +- rockchip,pmu: phandle to the syscon managing the "pmu general register 
> files"
> +- clocks: phandles for clock specified in "clock-names" property
> +- clock-names : the name of clock used by the DFI, must be "pclk_ddr_mon";
> +
> +Example:
> + dfi: dfi@0xff63 {
> + reg = <0x00 0xff63 0x00 0x4000>;
> + compatible = "rockchip,rk3399-dfi";

Usually, 'compatible' is the first entry within a device tree node.

> + rockchip,pmu = <&pmugrf>;
> + clocks = <&cru PCLK_DDR_MON>;
> + clock-names = "pclk_ddr_mon";
> + status = "disabled";
> + };
> +
> 

Looks good to me. I would just like you to swap the order
of the 'compatible' and 'reg' properties.

Acked-by: Chanwoo Choi 

Regards,
Chanwoo Choi

[PATCH] PM / devfreq: rockchip: fix semicolon.cocci warnings

2016-08-09 Thread kbuild test robot

drivers/devfreq/rk3399_dmc.c:393:2-3: Unneeded semicolon


 Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Lin Huang 
Signed-off-by: Fengguang Wu 
---

 rk3399_dmc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -390,7 +390,7 @@ static int rk3399_dmcfreq_probe(struct p
if (IS_ERR(data->dmc_clk)) {
dev_err(dev, "Cannot get the clk dmc_clk\n");
return PTR_ERR(data->dmc_clk);
-   };
+   }
 
data->irq = irq;
ret = devm_request_irq(dev, irq, rk3399_dmc_irq, 0,

Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-09 Thread Damien Le Moal


Shaun,

On 8/10/16 12:58, Shaun Tancheff wrote:

On Tue, Aug 9, 2016 at 3:09 AM, Damien Le Moal  wrote:

On Aug 9, 2016, at 15:47, Hannes Reinecke  wrote:


[trim]


Since disk type == 0 for everything that isn't HM, I would prefer the
sysfs 'zoned' file to just report whether the drive is HA or HM.


Okay. So let's put in the 'zoned' attribute the device type:
'host-managed', 'host-aware', or 'device managed'.


I hacked your patches and simply put a "0" or "1" in the sysfs zoned file.
Any drive that has ZBC/ZAC command support gets a "1", "0" for everything
else. This means that drive managed models are not exposed as zoned block
devices. For HM vs HA differentiation, an application can look at the
device type file since it is already present.

We could indeed set the "zoned" file to the device type, but HA drives and
regular drives will both have "0" in it, so no differentiation possible.
The other choice could be the "zoned" bits defined by ZBC, but these
do not define a value for host managed drives, and the drive managed value
being not "0" could be confusing too. So I settled for a simple 0/1 boolean.


This seems good to me.


Another option I forgot is for the "zoned" file to indicate the total 
number of zones of the device, and 0 for a non zoned regular block 
device. That would work as well.


[...]

Done: I hacked Shaun's ioctl code and added finish zone too. The
difference with Shaun's initial code is that the ioctls are propagated down to
the driver (__blkdev_driver_ioctl -> sd_ioctl) so that there is no need for
BIO request definition for the zone operations. So a lot less code added.


The purpose of the BIO flags is not to enable the ioctls so much as
the other way round. Creating BIO op's is to enable issuing ZBC
commands from device mapper targets and file systems without some
heinous ioctl hacks.
Making the resulting block layer interfaces available via ioctls is just a
reasonable way to exercise the code ... or that was my intent.


Yes, I understood your code. However, since (or if) we keep the zone 
information in the RB-tree cache, there is no need for the report zone 
operation BIO interface. Same for reset write pointer by keeping the 
mapping to discard. blk_lookup_zone can be used in kernel as a report 
zone BIO replacement and works as well for the report zone ioctl 
implementation. For reset, there is blkdev_issue_discard in kernel, and 
the reset zone ioctl becomes equivalent to the BLKDISCARD ioctl. These are 
simple. Open, close and finish zone remain. For these, adding the BIO 
interface seemed like overkill. Hence my choice of propagating the ioctl 
to the driver.
This is debatable of course, and adding an in-kernel interface is not 
hard: we can implement blk_open_zone, blk_close_zone and blk_finish_zone 
using __blkdev_driver_ioctl. That looks clean to me.


Overall, my concern with the BIO based interface for the ZBC commands is 
that it adds one flag for each command, which is not really the 
philosophy of the interface and potentially opens the door for more such 
implementations in the future with new standards and new commands coming 
up. Clearly that is not a sustainable path. So I think that a more 
specific interface for these zone operations is a better choice. That is 
consistent with what happens with the tons of ATA and SCSI commands not 
actually doing data I/Os (mode sense, log pages, SMART, etc). All these 
do not use BIOs and are processed as REQ_TYPE_BLOCK_PC requests.



The ioctls do not mimic exactly the ZBC standard. For instance, there is no
reporting options for report zones, nor is the "all" bit supported for open,
close or finish zone commands. But the information provided on zones is complete
and maps to the standard definitions.


For the reporting options I have planned to reuse the stream_id in
struct bio when that is formalized. There are certainly other places in
struct bio to stuff a few extra bits ...


We could add reporting options to blk_lookup_zones to filter the result 
and use that in the ioctl implementation as well. This can be added 
without any problem.



As far as the all bit ... this is being handled by all the zone action
commands. Just pass a sector of ~0ul and it's handled in sd.c by
sd_setup_zone_action_cmnd().

Apologies as apparently my documentation here is lacking :-(


Yes, I got it (libzbc does the same actually). I did not add it for 
simplicity. But indeed, maybe it should be. The same trick can be used 
with the ioctl to driver interface. No problems.



I also added a reset_wp ioctl for completeness, but this one simply calls
blkdev_issue_discard internally, so it is in fact equivalent to the BLKDISCARD
ioctl call with a range exactly aligned on a zone.


I'm confused as my patch set includes a Reset WP (BLKRESETZONE) that
creates a REQ_OP_ZONE_RESET .. same as open and close. My
expectation being that BLKDISCARD doesn't really need yet another alias.


Yes, we could remove the BLKRESETZONE ioctl and have applications use 
th

Re: [PATCH v5 6/8] Documentation: bindings: add dt documentation for rk3399 dmc

2016-08-09 Thread Chanwoo Choi

Hi Lin,

On 2016-08-10 12:26, Lin Huang wrote:
> This patch adds the documentation for rockchip rk3399 dmc driver.
> 
> Signed-off-by: Lin Huang 
> ---
> Changes in v5:
> -None
> 
> Changes in v4:
> -None
> 
> Changes in v3:
> -None
> 
> Changes in v2:
> -None 
> 
> Changes in v1:
> -None
> 
>  .../devicetree/bindings/devfreq/rk3399_dmc.txt | 35 
> ++
>  1 file changed, 35 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> 
> diff --git a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt 
> b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> new file mode 100644
> index 000..90e9581
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> @@ -0,0 +1,35 @@
> +* Rockchip rk3399 dmc device

dmc -> DMC (Dynamic Memory Controller?)

> +
> +Required properties:
> +- compatible: Must be "rockchip,rk3399-dmc".
> +- devfreq-events: Node to get ddr loading, Refer to
> +   Documentation/devicetree/bindings/devfreq/rockchip-dif.txt
> +- interrupts: The interrupt number to the cpu. The interrupt specifier format
> +   depends on the interrupt controller. 

If you describe the specific role of this interrupt, it would be easier
to understand how the interrupt is used.

> +- clocks: Phandles for clock specified in "clock-names" property
> +- clock-names : The name of clock used by the DFI, must be "pclk_ddr_mon";

"pclk_ddr_mon" -> "dmc_clk"

> +- operating-points-v2: Refer to 
> Documentation/devicetree/bindings/power/opp.txt
> +for details.
> +- center-supply: Dmc supply node.
> +- status: Marks the node enabled/disabled.
> +
> +Optional properties:
> +- ddr_timing: ddr timing need to pass to arm trust firmware
> +- upthreshold: the upthreshold to simpleondeamnd policy
> +- downdifferential: The downdifferential to simpleondeamnd policy
> +
> +Example:
> + dmc: dmc {
> + compatible = "rockchip,rk3399-dmc";
> + devfreq-events = <&dfi>;
> + interrupts = ;
> + clocks = <&cru SCLK_DDRCLK>;
> + clock-names = "dmc_clk";
> + ddr_timing = <&ddr_timing>;

I think you should add detailed documentation for 'ddr_timing',
because it is not easy to understand; it depends on the
trusted firmware.

> + operating-points-v2 = <&dmc_opp_table>;

I think you had better add an example of 'dmc_opp_table'
to the documentation.

> + center-supply = <&ppvar_centerlogic>;
> + upthreshold = <15>;
> + downdifferential = <10>;
> + status = "disabled";
> + };
> +
> 

Regards,
Chanwoo Choi

Re: [PATCH v3] powerpc: Do not make the entire heap executable

2016-08-09 Thread Michael Ellerman

Denys Vlasenko  writes:

> On 32-bit powerpc the ELF PLT sections of binaries (built with --bss-plt,
> or with a toolchain which defaults to it) look like this:
...
>
>  arch/powerpc/include/asm/page.h| 10 +-
>  arch/powerpc/include/asm/page_32.h |  2 --
>  arch/powerpc/include/asm/page_64.h |  4 
>  fs/binfmt_elf.c| 34 ++
>  include/linux/mm.h |  1 +
>  mm/mmap.c  | 20 +++-
>  6 files changed, 43 insertions(+), 28 deletions(-)

What tree is this against?

I can't get it to apply to either Linus' tree or linux-next.

cheers

$ patch --dry-run -p1 < diff.diff
checking file arch/powerpc/include/asm/page.h
checking file arch/powerpc/include/asm/page_32.h
checking file arch/powerpc/include/asm/page_64.h
checking file fs/binfmt_elf.c
Hunk #3 FAILED at 613.
Hunk #4 FAILED at 633.
Hunk #5 succeeded at 681 (offset 2 lines).
Hunk #6 succeeded at 889 (offset 2 lines).
Hunk #7 succeeded at 984 (offset 2 lines).
Hunk #8 succeeded at 1003 (offset 2 lines).
2 out of 8 hunks FAILED
checking file include/linux/mm.h
checking file mm/mmap.c
Hunk #1 FAILED at 2653.
Hunk #2 succeeded at 2668 (offset 2 lines).
Hunk #3 succeeded at 2736 (offset 2 lines).
Hunk #4 succeeded at 2750 (offset 2 lines).
1 out of 4 hunks FAILED

Re: [RFC][PATCH 3/4] arm64: dts: hikey: Add hikey support for syscon-reboot-mode

2016-08-09 Thread Bjorn Andersson

On Mon 08 Aug 16:03 PDT 2016, John Stultz wrote:

[..]
> diff --git a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts 
> b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
[..]
>   memory@0 {
>   device_type = "memory";
>   reg = <0x 0x 0x 0x05e0>,
> -   <0x 0x05f0 0x 0x00eff000>,
> +   <0x 0x05f0 0x 0x1000>,
> +   <0x 0x05f02000 0x 0x00efd000>,
> <0x 0x06e0 0x 0x0060f000>,
> <0x 0x0741 0x 0x36bf>;
>   };

As I said when we talked about this, I don't think you should punch holes
in the /memory node, but rather add these regions as no-map in a
/reserved-memory node. But that's an unrelated matter.

>  
> + reboot-mode-syscon@5f01000 {
> + compatible = "syscon", "simple-mfd";
> + reg = <0x0 0x05f01000 0x0 0x1000>;
> +
> + reboot-mode@0 {

Drop the @0

Other than that:
Acked-by: Bjorn Andersson 

> + compatible = "syscon-reboot-mode";
> + offset = <0x0>;
> +
> + mode-normal = <0x77665501>;
> + mode-bootloader = <0x77665500>;
> + mode-recovery   = <0x77665502>;
> + };
> + };
> +
>   soc {
>   spi0: spi@f7106000 {
>   status = "ok";

Regards,
Bjorn

Re: Is it ok if ModemManager process is killed AFTER network-interface is brought up and IP-Address assigned?

2016-08-09 Thread Ajay Garg

Ok Greg :)

On Tue, Aug 9, 2016 at 1:44 PM, Greg KH  wrote:
> On Tue, Aug 09, 2016 at 12:48:12PM +0530, Ajay Garg wrote:
>> Hi All.
>>
>> We are using Sierra's USB-to-WWAN driver on Ubuntu-14 for Sierra's
>> MC8090 modem, and we have a requirement wherein we need to have access
>> to the modem-serial-port (from our user-application that is).
>>
>> Right now, we see that /usr/sbin/ModemManager is always connected to
>> /dev/ttyUSB3 (which means we cannot connect to the port from our
>> application at the same time, or even if we can, received-data will be
>> at best inconsistent).
>>
>>
>> We are thinking of the following ::
>>
>> * Initially, let nmcli and ModemManager do their work, and let them
>> bring the WWAN interface up.
>>
>> * Once this happens, we permanently-down the ModemManager from our
>> application-binary, thereby freeing up /dev/ttyUSB3.
>>
>> * Thereafter, we are free to connect to /dev/ttyUSB3 from our
>> application, thereby using features like SMS-notification (+CMTI),
>> signal-strength (+CSQ), etc.
>>
>>
>>
>> Does our approach make sense?
>> We will be grateful to any help.
>
> Why not ask the modem manager team about this?  The kernel doesn't care
> what you do with the device links :)
>
> thanks,
>
> greg k-h



-- 
Regards,
Ajay

Re: [PATCH] cpufreq: powernv: Fix crash in gpstate_timer_handler

2016-08-09 Thread Michael Ellerman

Viresh Kumar  writes:

> On 04-08-16, 20:59, Akshay Adiga wrote:
>> 'commit 09ca4c9b5958 ("cpufreq: powernv: Replacing pstate_id with
>> frequency table index")' changes calc_global_pstate() to use
>> cpufreq_table index instead of pstate_id.
>> 
>> But in gpstate_timer_handler() pstate_id was being passed instead
>> of cpufreq_table index, which caused the index_to_pstate() to access
>> out of bound indices, leading to this crash.
>> 
>> Adding sanity check for index and pstate, to ensure only valid pstate
>> and index values are returned.
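
The sanity check described above can be sketched as a pair of bounds-checked lookups. This is a minimal userspace illustration, not the driver code: struct pstate_table and both helper signatures are hypothetical, and falling back to the nominal entry on invalid input is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's frequency table. */
struct pstate_table {
	const int *pstate_ids; /* pstate id stored at each table index */
	size_t nr_pstates;     /* number of valid entries */
	size_t nominal;        /* index used as the safe fallback (assumed) */
};

/* Map a table index to a pstate id, never reading out of bounds. */
static int idx_to_pstate(const struct pstate_table *t, size_t idx)
{
	assert(t->nr_pstates > 0 && t->nominal < t->nr_pstates);

	if (idx >= t->nr_pstates)
		return t->pstate_ids[t->nominal];
	return t->pstate_ids[idx];
}

/* Map a pstate id back to its index; unknown ids yield the fallback. */
static size_t pstate_to_idx(const struct pstate_table *t, int pstate)
{
	size_t i;

	for (i = 0; i < t->nr_pstates; i++)
		if (t->pstate_ids[i] == pstate)
			return i;
	return t->nominal;
}
```

Passing a pstate id where an index is expected, as the timer handler did, would then return the fallback entry instead of reading past the end of the table.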
>> 
>> Call Trace:
>> [c0078d66b130] [c011d224] __free_irq+0x234/0x360
>> (unreliable)
>> [c0078d66b1c0] [c011d44c] free_irq+0x6c/0xa0
>> [c0078d66b1f0] [c006c4f8] opal_event_shutdown+0x88/0xd0
>> [c0078d66b230] [c0067a4c] opal_shutdown+0x1c/0x90
>> [c0078d66b260] [c0063a00] pnv_shutdown+0x20/0x40
>> [c0078d66b280] [c0021538] machine_restart+0x38/0x90
>> [c78d66b310] [c0965ea0] panic+0x284/0x300
>> [c0078d66b3a0] [c001f508] die+0x388/0x450
>> [c0078d66b430] [c0045a50] bad_page_fault+0xd0/0x140
>> [c0078d66b4a0] [c0008964] handle_page_fault+0x2c/0x30
>>interrupt: 300 at gpstate_timer_handler+0x150/0x260
>> LR = gpstate_timer_handler+0x130/0x260
>> [c0078d66b7f0] [c0132b58] call_timer_fn+0x58/0x1c0
>> [c0078d66b880] [c0132e20] expire_timers+0x130/0x1d0
>> [c0078d66b8f0] [c0133068] run_timer_softirq+0x1a8/0x230
>> [c0078d66b980] [c00b535c] __do_softirq+0x18c/0x400
>> [c0078d66ba70] [c00b5828] irq_exit+0xc8/0x100
>> [c0078d66ba90] [c001e214] timer_interrupt+0xa4/0xe0
>> [c0078d66bac0] [c00027d0] decrementer_common+0x150/0x180
>>interrupt: 901 at arch_local_irq_restore+0x74/0x90
>>   0] [c0106b34] call_cpuidle+0x44/0x90
>> [c0078d66be50] [c010708c] cpu_startup_entry+0x38c/0x460
>> [c0078d66bf20] [c003d930] start_secondary+0x330/0x380
>> [c0078d66bf90] [c0008e6c] start_secondary_prolog+0x10/0x14
>> 
>> Fixes: 08d27eb ("cpufreq: powernv: Replacing pstate_id with
>> frequency table index")
>> Reported-by: Madhavan Srinivasan 
>> Signed-off-by: Akshay Adiga 
>> ---
>>  drivers/cpufreq/powernv-cpufreq.c | 21 -
>>  1 file changed, 20 insertions(+), 1 deletion(-)
>
> Acked-by: Viresh Kumar 

Who's merging this?

cheers

Re: [PATCH 2/2] device-tree: nexus7: Add IMEM syscon and reboot reason support

2016-08-09 Thread Bjorn Andersson

On Mon 08 Aug 15:34 PDT 2016, John Stultz wrote:

> This patch adds the IMEM syscon memory region to the DT,
> as well as adds support for the magic reboot reason
> values that are written to the address for each mode.
>

This looks good, double checked the addresses and magics. But I think
you should move the entire thing to qcom-apq8064.dtsi, as this is common
to the base platform.

And I would prefer if you updated the subject prefix...

With the move and subject update:
Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> Cc: Rob Herring 
> Cc: Andy Gross 
> Cc: Bjorn Andersson 
> Cc: Stephen Boyd 
> Cc: linux-arm-...@vger.kernel.org
> Cc: devicet...@vger.kernel.org
> Signed-off-by: John Stultz 
> ---
>  arch/arm/boot/dts/qcom-apq8064-asus-nexus7-flo.dts | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/qcom-apq8064-asus-nexus7-flo.dts 
> b/arch/arm/boot/dts/qcom-apq8064-asus-nexus7-flo.dts
> index 7b05f07..ff856c3 100644
> --- a/arch/arm/boot/dts/qcom-apq8064-asus-nexus7-flo.dts
> +++ b/arch/arm/boot/dts/qcom-apq8064-asus-nexus7-flo.dts
> @@ -272,5 +272,19 @@
>   vqmmc-supply = <&pm8921_s4>;
>   };
>   };
> +
> + imem@2a03f000 {
> + compatible = "syscon", "simple-mfd";
> + reg = <0x2a03f000 0x1000>;
> +
> + reboot-mode {
> + compatible = "syscon-reboot-mode";
> + offset = <0x65c>;
> +
> + mode-normal = <0x77665501>;
> + mode-bootloader = <0x77665500>;
> + mode-recovery   = <0x77665502>;
> + };
> + };
>   };
>  };
> -- 
> 1.9.1
>

[PATCH v3 4/5] x86/ioapic: Fix lost IOAPIC resource after hot-removal and hotadd

2016-08-09 Thread Rui Wang

IOAPIC resource at 0xfecx gets lost from /proc/iomem after
hot-removing and then hot-adding the IOAPIC device.

After system boot, in /proc/iomem:
fec0-fecf : PNP0003:00
  fec0-fec003ff : IOAPIC 0
  fec01000-fec013ff : IOAPIC 1
  fec4-fec403ff : IOAPIC 2
  fec8-fec803ff : IOAPIC 3
  fecc-fecc03ff : IOAPIC 4

Then hot-remove IOAPIC 2 and hot-add it again:
fec0-fecf : PNP0003:00
  fec0-fec003ff : IOAPIC 0
  fec01000-fec013ff : IOAPIC 1
  fec8-fec803ff : IOAPIC 3
  fecc-fecc03ff : IOAPIC 4

The range at 0xfec4 is lost from /proc/iomem. It is because
handle_ioapic_add() requests resource from either PCI config BAR or
ACPI "_CRS", not both. But Intel platforms map the IOxAPIC registers
both at the PCI config BAR (called MBAR, dynamic), and at the ACPI
"_CRS" (called ABAR, static). The 0xfecX_YZ00 to 0xfecX_YZFF range
appears in "_CRS" of each IOAPIC device. Both ranges should be claimed
from /proc/iomem for exclusive use.
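
As a rough illustration of the selection step in the diff below (prefer the PCI BAR resource for registration, otherwise fall back to the "_CRS" resource), here is a userspace sketch. The struct res type is a minimal stand-in for the kernel's struct resource, and pick_register_res() is a hypothetical helper name, not a function in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct resource. */
struct res {
	unsigned long start;
	unsigned long flags; /* 0 means the resource was never set up */
};

/*
 * Choose the resource used to register the IOAPIC: the PCI BAR
 * resource when one was claimed, otherwise the "_CRS" resource.
 * The "_CRS" resource is assumed to have been requested already.
 */
static const struct res *pick_register_res(const struct res *pci_res,
					   const struct res *crs_res)
{
	assert(crs_res != NULL);

	if (pci_res && pci_res->flags)
		return pci_res;
	return crs_res;
}
```

Both ranges stay claimed in either case; only the base address handed to registration differs.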

Signed-off-by: Rui Wang 
---
 drivers/acpi/ioapic.c | 36 
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/drivers/acpi/ioapic.c b/drivers/acpi/ioapic.c
index 8ab6d42..ee20111 100644
--- a/drivers/acpi/ioapic.c
+++ b/drivers/acpi/ioapic.c
@@ -97,7 +97,7 @@ static acpi_status handle_ioapic_add(acpi_handle handle, u32 
lvl,
unsigned long long gsi_base;
struct acpi_pci_ioapic *ioapic;
struct pci_dev *dev = NULL;
-   struct resource *res = NULL;
+   struct resource *res = NULL, *pci_res = NULL, *crs_res;
char *type = NULL;
 
if (!acpi_is_ioapic(handle, &type))
@@ -137,23 +137,28 @@ static acpi_status handle_ioapic_add(acpi_handle handle, 
u32 lvl,
pci_set_master(dev);
if (pci_request_region(dev, 0, type))
goto exit_disable;
-   res = &dev->resource[0];
+   pci_res = &dev->resource[0];
ioapic->pdev = dev;
} else {
pci_dev_put(dev);
dev = NULL;
+   }
 
-   res = &ioapic->res;
-   acpi_walk_resources(handle, METHOD_NAME__CRS, setup_res, res);
-   if (res->flags == 0) {
-   acpi_handle_warn(handle, "failed to get resource\n");
-   goto exit_free;
-   } else if (request_resource(&iomem_resource, res)) {
-   acpi_handle_warn(handle, "failed to insert resource\n");
-   goto exit_free;
-   }
+   crs_res = &ioapic->res;
+   acpi_walk_resources(handle, METHOD_NAME__CRS, setup_res, crs_res);
+   if (crs_res->flags == 0) {
+   acpi_handle_warn(handle, "failed to get resource\n");
+   goto exit_release;
+   } else if (request_resource(&iomem_resource, crs_res)) {
+   acpi_handle_warn(handle, "failed to insert resource\n");
+   goto exit_release;
}
 
+   /* try pci resource first, then "_CRS" resource */
+   res = pci_res;
+   if (!res || !res->flags)
+   res = crs_res;
+
if (acpi_register_ioapic(handle, res->start, (u32)gsi_base)) {
acpi_handle_warn(handle, "failed to register IOAPIC\n");
goto exit_release;
@@ -174,14 +179,13 @@ done:
 exit_release:
if (dev)
pci_release_region(dev, 0);
-   else
-   release_resource(res);
+   if (ioapic->res.flags && ioapic->res.parent)
+   release_resource(&ioapic->res);
 exit_disable:
if (dev)
pci_disable_device(dev);
 exit_put:
pci_dev_put(dev);
-exit_free:
kfree(ioapic);
 exit:
mutex_unlock(&ioapic_list_lock);
@@ -217,9 +221,9 @@ int acpi_ioapic_remove(struct acpi_pci_root *root)
pci_release_region(ioapic->pdev, 0);
pci_disable_device(ioapic->pdev);
pci_dev_put(ioapic->pdev);
-   } else if (ioapic->res.flags && ioapic->res.parent) {
-   release_resource(&ioapic->res);
}
+   if (ioapic->res.flags && ioapic->res.parent)
+   release_resource(&ioapic->res);
list_del(&ioapic->list);
kfree(ioapic);
}
-- 
1.8.3.1

[PATCH v3 2/5] x86/ioapic: Support hot-removal of IOAPICs present during boot

2016-08-09 Thread Rui Wang

IOAPICs present during system boot aren't added to ioapic_list,
so they cannot be hot-removed. Fix it by calling
acpi_ioapic_add() during root bus enumeration.

Signed-off-by: Rui Wang 
---
 drivers/acpi/pci_root.c | 10 ++
 drivers/pci/setup-bus.c |  5 -
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index b07eda1..bf601d4 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -614,6 +614,16 @@ static int acpi_pci_root_add(struct acpi_device *device,
if (hotadd) {
pcibios_resource_survey_bus(root->bus);
pci_assign_unassigned_root_bus_resources(root->bus);
+   /*
+* This is only called for the hotadd case. For the boot-time
+* case, we need to wait until after PCI initialization in
+* order to deal with IOAPICs mapped in on a PCI BAR.
+*
+* This is currently x86-specific, because acpi_ioapic_add()
+* is an empty function without CONFIG_ACPI_HOTPLUG_IOAPIC.
+* And CONFIG_ACPI_HOTPLUG_IOAPIC depends on CONFIG_X86_IO_APIC
+* (see drivers/acpi/Kconfig).
+*/
acpi_ioapic_add(root->device->handle);
}
 
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index c74059e..ec538d3 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "pci.h"
 
 unsigned int pci_flags;
@@ -1852,8 +1853,10 @@ void __init pci_assign_unassigned_resources(void)
 {
struct pci_bus *root_bus;
 
-   list_for_each_entry(root_bus, &pci_root_buses, node)
+   list_for_each_entry(root_bus, &pci_root_buses, node) {
pci_assign_unassigned_root_bus_resources(root_bus);
+   acpi_ioapic_add(ACPI_HANDLE(root_bus->bridge));
+   }
 }
 
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
-- 
1.8.3.1

[PATCH v3 1/5] x86/ioapic: Change prototype of acpi_ioapic_add()

2016-08-09 Thread Rui Wang

Change the argument of acpi_ioapic_add() to a generic ACPI handle, and
move its prototype from drivers/acpi/internal.h to include/linux/acpi.h
so that it can be called from outside the pci_root driver.

Signed-off-by: Rui Wang 
---
 drivers/acpi/internal.h | 2 --
 drivers/acpi/ioapic.c   | 6 +++---
 drivers/acpi/pci_root.c | 2 +-
 include/linux/acpi.h| 6 ++
 4 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 940218f..f26fc1d 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -40,10 +40,8 @@ int acpi_sysfs_init(void);
 void acpi_container_init(void);
 void acpi_memory_hotplug_init(void);
 #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
-int acpi_ioapic_add(struct acpi_pci_root *root);
 int acpi_ioapic_remove(struct acpi_pci_root *root);
 #else
-static inline int acpi_ioapic_add(struct acpi_pci_root *root) { return 0; }
 static inline int acpi_ioapic_remove(struct acpi_pci_root *root) { return 0; }
 #endif
 #ifdef CONFIG_ACPI_DOCK
diff --git a/drivers/acpi/ioapic.c b/drivers/acpi/ioapic.c
index ccdc8db..2449377 100644
--- a/drivers/acpi/ioapic.c
+++ b/drivers/acpi/ioapic.c
@@ -189,13 +189,13 @@ exit:
return AE_OK;
 }
 
-int acpi_ioapic_add(struct acpi_pci_root *root)
+int acpi_ioapic_add(acpi_handle root_handle)
 {
acpi_status status, retval = AE_OK;
 
-   status = acpi_walk_namespace(ACPI_TYPE_DEVICE, root->device->handle,
+   status = acpi_walk_namespace(ACPI_TYPE_DEVICE, root_handle,
 UINT_MAX, handle_ioapic_add, NULL,
-root->device->handle, (void **)&retval);
+root_handle, (void **)&retval);
 
return ACPI_SUCCESS(status) && ACPI_SUCCESS(retval) ? 0 : -ENODEV;
 }
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index d144168..b07eda1 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -614,7 +614,7 @@ static int acpi_pci_root_add(struct acpi_device *device,
if (hotadd) {
pcibios_resource_survey_bus(root->bus);
pci_assign_unassigned_root_bus_resources(root->bus);
-   acpi_ioapic_add(root);
+   acpi_ioapic_add(root->device->handle);
}
 
pci_lock_rescan_remove();
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 4d8452c..c9a596b 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -751,6 +751,12 @@ static inline int acpi_reconfig_notifier_unregister(struct notifier_block *nb)
 
 #endif /* !CONFIG_ACPI */
 
+#ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
+int acpi_ioapic_add(acpi_handle root);
+#else
+static inline int acpi_ioapic_add(acpi_handle root) { return 0; }
+#endif
+
 #ifdef CONFIG_ACPI
 void acpi_os_set_prepare_sleep(int (*func)(u8 sleep_state,
   u32 pm1a_ctrl,  u32 pm1b_ctrl));
-- 
1.8.3.1

[PATCH v3 0/5] Fixing a set of bugs for ioapic hotplug

2016-08-09 Thread Rui Wang

A set of patches fixing bugs found while testing IOAPIC hotplug.

Regards,
Rui

Changelog:

Changes from v2 to v3:
* Rebased on top of 4.8-rc1 per Bjorn & Rafael.
* Improved the commit message of 0003, w/ clearer explanation.

Changes from v1 to v2:
* Split the first patch into two as advised by Bjorn: "would be nicer if
the interface change and header file munging were in a separate patch so
they wouldn't obscure the meat of the change, i.e., the addition of calls
to acpi_ioapic_add()."
* Removed acpi_ioapic_add() as an exported symbol.
* Fixed some typos, and s/acpi/ACPI/, s/ioapic/IOAPIC/ throughout.
* Fixed a warning from 0-day testing.

Rui Wang (5):
  x86/ioapic: Change prototype of acpi_ioapic_add()
  x86/ioapic: Support hot-removal of IOAPICs present during boot
  x86/ioapic: Fix setup_res() failing to get resource
  x86/ioapic: Fix lost IOAPIC resource after hot-removal and hotadd
  x86/ioapic: Fix ioapic failing to request resource

 drivers/acpi/internal.h |  2 --
 drivers/acpi/ioapic.c   | 46 ++
 drivers/acpi/pci_root.c | 12 +++-
 drivers/pci/setup-bus.c |  5 -
 include/linux/acpi.h|  6 ++
 5 files changed, 47 insertions(+), 24 deletions(-)

-- 
1.8.3.1

[PATCH v3 3/5] x86/ioapic: Fix setup_res() failing to get resource

2016-08-09 Thread Rui Wang

acpi_dev_filter_resource_type() returns 0 on success, and 1 on failure.
A return value of zero means there's a matching resource, so we should
continue within setup_res() to get the resource.
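
The inverted return convention is easy to get wrong. As a minimal userspace
sketch (filter_type(), setup_buggy() and setup_fixed() are hypothetical
stand-ins, not kernel functions), the difference between the old and the new
check looks roughly like this:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not kernel APIs. */
enum res_type { RES_MEM = 1, RES_IO = 2 };

struct fake_res {
	enum res_type type;
	unsigned long start;
};

/* Same convention as described above: 0 means "type matches", 1 means "no match". */
static int filter_type(const struct fake_res *r, enum res_type wanted)
{
	return r->type == wanted ? 0 : 1;
}

/* Old check: bails out exactly when the resource matches. */
static bool setup_buggy(const struct fake_res *r, unsigned long *out)
{
	if (filter_type(r, RES_MEM) == 0)
		return false;
	*out = r->start;
	return true;
}

/* New check: bails out only when the resource does not match. */
static bool setup_fixed(const struct fake_res *r, unsigned long *out)
{
	if (filter_type(r, RES_MEM))
		return false;
	*out = r->start;
	return true;
}
```

With a memory resource as input, setup_buggy() returns early and never fills
in the start address, while setup_fixed() does.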

Signed-off-by: Rui Wang 
---
 drivers/acpi/ioapic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/ioapic.c b/drivers/acpi/ioapic.c
index 2449377..8ab6d42 100644
--- a/drivers/acpi/ioapic.c
+++ b/drivers/acpi/ioapic.c
@@ -46,7 +46,7 @@ static acpi_status setup_res(struct acpi_resource *acpi_res, void *data)
struct resource_win win;
 
res->flags = 0;
-   if (acpi_dev_filter_resource_type(acpi_res, IORESOURCE_MEM) == 0)
+   if (acpi_dev_filter_resource_type(acpi_res, IORESOURCE_MEM))
return AE_OK;
 
if (!acpi_dev_resource_memory(acpi_res, res)) {
-- 
1.8.3.1

[PATCH v3 5/5] x86/ioapic: Fix ioapic failing to request resource

2016-08-09 Thread Rui Wang

handle_ioapic_add() uses request_resource() to request ACPI "_CRS"
resources. This can fail with the following error message:

[  247.325693] ACPI: \_SB_.IIO1.AID1: failed to insert resource

This happens when there are multiple IOAPICs and the DSDT groups their
"_CRS" resources as the children of a parent resource, as seen from
/proc/iomem:

fec00000-fecfffff : PNP0003:00
  fec00000-fec003ff : IOAPIC 0
  fec01000-fec013ff : IOAPIC 1
  fec40000-fec403ff : IOAPIC 2

In this case request_resource() fails because there's a conflicting
resource, which is the parent (fec00000-fecfffff). Fix it by using
insert_resource(), which can request resources by taking the conflicting
resource as the parent.
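
To illustrate the difference, here is a toy userspace model of a resource
tree. It is only a sketch: struct toy_res, toy_request() and toy_insert() are
hypothetical names, and toy_insert() covers just the one case described above
(the conflicting entry fully contains the new range), not everything the real
insert_resource() does.

```c
#include <stddef.h>

/* Toy model of a resource tree; not the kernel's struct resource. */
struct toy_res {
	unsigned long start, end;
	struct toy_res *parent, *child, *sibling;
};

static int overlaps(const struct toy_res *a, const struct toy_res *b)
{
	return a->start <= b->end && b->start <= a->end;
}

static int contains(const struct toy_res *outer, const struct toy_res *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

static void link_child(struct toy_res *parent, struct toy_res *res)
{
	res->parent = parent;
	res->sibling = parent->child;
	parent->child = res;
}

/* Request-style: links directly under root; any overlap with a child is a conflict. */
static int toy_request(struct toy_res *root, struct toy_res *res)
{
	struct toy_res *c;

	for (c = root->child; c; c = c->sibling)
		if (overlaps(c, res))
			return -1;
	link_child(root, res);
	return 0;
}

/* Insert-style: if a conflicting child fully contains res, use it as the parent. */
static int toy_insert(struct toy_res *root, struct toy_res *res)
{
	struct toy_res *c;

	for (c = root->child; c; c = c->sibling) {
		if (!overlaps(c, res))
			continue;
		if (contains(c, res))
			return toy_insert(c, res);
		return -1;	/* partial overlap: not handled in this sketch */
	}
	link_child(root, res);
	return 0;
}
```

With a PNP0003:00-like parent already in the tree, toy_request() for the
IOAPIC 2 range reports a conflict, while toy_insert() places it under that
parent.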

Signed-off-by: Rui Wang 
---
 drivers/acpi/ioapic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/ioapic.c b/drivers/acpi/ioapic.c
index ee20111..6d7ce6e 100644
--- a/drivers/acpi/ioapic.c
+++ b/drivers/acpi/ioapic.c
@@ -146,10 +146,12 @@ static acpi_status handle_ioapic_add(acpi_handle handle, u32 lvl,
 
crs_res = &ioapic->res;
acpi_walk_resources(handle, METHOD_NAME__CRS, setup_res, crs_res);
+   crs_res->name = type;
+   crs_res->flags |= IORESOURCE_BUSY;
if (crs_res->flags == 0) {
acpi_handle_warn(handle, "failed to get resource\n");
goto exit_release;
-   } else if (request_resource(&iomem_resource, crs_res)) {
+   } else if (insert_resource(&iomem_resource, crs_res)) {
acpi_handle_warn(handle, "failed to insert resource\n");
goto exit_release;
}
-- 
1.8.3.1

Re: [PATCH 1/2] device-tree: aqp8064.dtsi: Remove usb phy dr_mode = "host"

2016-08-09 Thread Bjorn Andersson

On Mon 08 Aug 15:34 PDT 2016, John Stultz wrote:

Changes to this file are commonly prefixed "ARM: dts: apq8064:"; please
follow that. Perhaps:

ARM: dts: apq8064: Drop dr_mode property from usb phy

> Most 8064 devices have micro-usb ports for phy1, so setting
> the dr_mode to host here seems incorrect.
> 
> Leaving it unspecified should default to otg, and then
> any boards that wish to specify something else, can
> override it in their dts file.
> 
> Cc: Rob Herring 
> Cc: Andy Gross 
> Cc: Bjorn Andersson 
> Cc: Stephen Boyd 
> Cc: linux-arm-...@vger.kernel.org
> Cc: devicet...@vger.kernel.org
> Signed-off-by: John Stultz 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  arch/arm/boot/dts/qcom-apq8064.dtsi | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi
> index 9dc83b0..7e43416 100644
> --- a/arch/arm/boot/dts/qcom-apq8064.dtsi
> +++ b/arch/arm/boot/dts/qcom-apq8064.dtsi
> @@ -750,7 +750,6 @@
>   reg = <0x1250 0x400>;
>   interrupts  = ;
>   status  = "disabled";
> - dr_mode = "host";
>  
>   clocks  = <&gcc USB_HS1_XCVR_CLK>,
> <&gcc USB_HS1_H_CLK>;
> -- 
> 1.9.1
>

Re: [PATCH] cpuset: make sure new tasks conform to the current config of the cpuset

2016-08-09 Thread Tejun Heo

On Tue, Aug 09, 2016 at 11:25:01AM +0800, Zefan Li wrote:
> A new task inherits cpus_allowed and mems_allowed masks from its parent,
> but if someone changes cpuset's config by writing to cpuset.cpus/cpuset.mems
> before this new task is inserted into the cgroup's task list, the new task
> won't be updated accordingly.
> 
> Signed-off-by: Zefan Li 

Applied to cgroup/for-4.8-fixes w/ stable cc'd.

Thanks.

-- 
tejun

Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-09 Thread Shaun Tancheff

On Tue, Aug 9, 2016 at 3:09 AM, Damien Le Moal  wrote:
>> On Aug 9, 2016, at 15:47, Hannes Reinecke  wrote:

[trim]

>>> Since disk type == 0 for everything that isn't HM so I would prefer the
>>> sysfs 'zoned' file just report if the drive is HA or HM.
>>>
>> Okay. So let's put in the 'zoned' attribute the device type:
>> 'host-managed', 'host-aware', or 'device managed'.
>
> I hacked your patches and simply put a "0" or "1" in the sysfs zoned file.
> Any drive that has ZBC/ZAC command support gets a "1", "0" for everything
> else. This means that drive managed models are not exposed as zoned block
> devices. For HM vs HA differentiation, an application can look at the
> device type file since it is already present.
>
> We could indeed set the "zoned" file to the device type, but HM drives and
> regular drives will both have "0" in it, so no differentiation possible.
> The other choice could be the "zoned" bits defined by ZBC, but these
> do not define a value for host managed drives, and the drive managed value
> being not "0" could be confusing too. So I settled for a simple 0/1 boolean.

This seems good to me.

>>>> 2) Add ioctls for zone management:
>>>> Report zones (get information from RB tree), reset zone (simple wrapper
>>>> to ioctl for block discard), open zone, close zone and finish zone. That
>>>> will allow mkfs like tools to get zone information without having to parse
>>>> thousands of sysfs files (and can also be integrated in libzbc block backend
>>>> driver for a unified interface with the direct SG_IO path for kernels without
>>>> the ZBC code enabled).
>>>
>>> I can add finish zone ... but I really can't think of a use for it, myself.
>>>
>> Which is not the point. The standard defines this, so clearly someone
>> found it a reasonable addendum. So let's add this for completeness.

Agreed.

> Done: I hacked Shaun ioctl code and added finish zone too. The
> difference with Shaun initial code is that the ioctl are propagated down to
> the driver (__blkdev_driver_ioctl -> sd_ioctl) so that there is no need for
> BIO request definition for the zone operations. So a lot less code added.

The purpose of the BIO flags is not so much to enable the ioctls as
the other way round. The BIO ops exist so that device mapper targets
and file systems can issue ZBC commands without some heinous ioctl
hacks.
Making the resulting block layer interfaces available via ioctls is just a
reasonable way to exercise the code ... or that was my intent.

> The ioctls do not mimic exactly the ZBC standard. For instance, there is no
> reporting options for report zones, nor is the "all" bit supported for open,
> close or finish zone commands. But the information provided on zones is
> complete and maps to the standard definitions.

For the reporting options I have planned to reuse the stream_id in
struct bio when that is formalized. There are certainly other places in
struct bio to stuff a few extra bits ...

As for the "all" bit ... this is handled by all the zone action
commands. Just pass a sector of ~0ul and it's handled in sd.c by
sd_setup_zone_action_cmnd().
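
A rough sketch of that convention, for illustration only: struct zone_action
and setup_zone_action() are hypothetical names, not the code in sd.c, and the
sketch treats sectors and LBAs as the same unit.

```c
#include <stdbool.h>

/* Hypothetical command descriptor; field names are illustrative only. */
struct zone_action {
	unsigned long long lba;	/* zone start; left at 0 when all is set */
	bool all;		/* models the "all" bit of a zone action command */
};

/* Sentinel sector meaning "apply the action to every zone". */
#define ALL_ZONES_SECTOR (~0ul)

static struct zone_action setup_zone_action(unsigned long sector)
{
	struct zone_action act = { 0, false };

	if (sector == ALL_ZONES_SECTOR)
		act.all = true;
	else
		act.lba = sector;
	return act;
}
```

The caller needs no extra flag argument; the sentinel sector value alone
selects the "all zones" form of the command.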

Apologies as apparently my documentation here is lacking :-(

> I also added a reset_wp ioctl for completeness, but this one simply calls
> blkdev_issue_discard internally, so it is in fact equivalent to the BLKDISCARD
> ioctl call with a range exactly aligned on a zone.

I'm confused as my patch set includes a Reset WP (BLKRESETZONE) that
creates a REQ_OP_ZONE_RESET .. same as open and close. My
expectation being that BLKDISCARD doesn't really need yet another alias.

[trim]

> Did that too. The blk_zone struct is now exactly 64B. I removed the per zone

Thanks .. being a cache line is harder to whinge about...

> spinlock and replaced it with a flag so that zones can still be locked
> independently using wait_on_bit_lock & friends (I added the functions
> blk_zone_lock, blk_zone_trylock and blk_zone_unlock to do that). This per zone
> locking comes in handy to implement the ioctls code without stalling the
> command queue for read, writes and discard on different zones.
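
For readers unfamiliar with flag-based locking, here is a userspace sketch of
the idea using C11 atomics. The names (struct toy_zone, toy_zone_trylock(),
toy_zone_unlock()) are hypothetical, and the sketch covers only the
non-blocking path; the blocking variant described above would sleep on the
bit instead of returning.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical zone descriptor; only the flags word matters here. */
struct toy_zone {
	atomic_ulong flags;
};

#define ZONE_LOCKED_BIT 0UL

/* Try to take the per-zone lock: succeeds only if the bit was previously clear. */
static bool toy_zone_trylock(struct toy_zone *z)
{
	unsigned long mask = 1UL << ZONE_LOCKED_BIT;

	return !(atomic_fetch_or_explicit(&z->flags, mask,
					  memory_order_acquire) & mask);
}

/* Release the per-zone lock by clearing the bit. */
static void toy_zone_unlock(struct toy_zone *z)
{
	unsigned long mask = 1UL << ZONE_LOCKED_BIT;

	atomic_fetch_and_explicit(&z->flags, ~mask, memory_order_release);
}
```

Each zone carries its own bit, so locking one zone does not affect another,
and no per-zone spinlock has to be stored.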
>
> I however kept the zone length in the structure. The reason for doing so is
> that this allows supporting drives with non-constant zone sizes, albeit with
> a more limited interface since in such case the device chunk_sectors is not
> set (that is how a user or application can detect that the zones are not
> constant size). For these drives, the block layer may happily merge BIOs
> across zone boundaries and the discard code will not split and align calls
> on the zones. But upper layers (an FS or a device mapper) can still do all
> this by themselves if they want/can support non-constant zone sizes.
>
> The only exception is drives like the Seagate one with only the last zone of a
> different size. This case is handled exactly as if all zones are the same size
> simply because any operation on the last smaller zone will naturally align as

Re: [PATCH v2 1/1] blk-mq: fix hang caused by freeze/unfreeze sequence

2016-08-09 Thread Tejun Heo

Hello,

On Mon, Aug 08, 2016 at 01:39:08PM +0200, Roman Pen wrote:
> Long time ago there was a similar fix proposed by Akinobu Mita[1],
> but it seems that time everyone decided to fix this subtle race in
> percpu-refcount and Tejun Heo[2] did an attempt (as I can see that
> patchset was not applied).

So, I probably forgot about it while waiting for confirmation of the fix.
Can you please verify that the patchset fixes the issue?  I can apply
the patchset right away.

Thanks.

-- 
tejun

Re: Potential race condition in drivers/ata/sata_mv.ko

2016-08-09 Thread Tejun Heo

Hello,

On Fri, Aug 05, 2016 at 03:43:30PM +0300, Pavel Andrianov wrote:
> In drivers/ata/sata_mv.ko function mv_set_main_irq_mask is called several
> times. Twice with a spinlock, twice from init function and once without any
> protection. The call without protection rises to several handlers from
> ata_port_operations. The structure with the ata_port_operations is included
> into a structure 'host' in mv_platform_probe and in mv_pci_init_one. At the
> end of these functions ata_host operations are activated together with
> interrupt handler. The conclusion is: interrupt handler may be executed in
> parallel with handlers from ata_port_operations, or, more formally, it may
> interrupt its execution.
> 
> In mv_set_main_irq_mask and in interrupt handler mv_interrupt the interrupt
> mask is modified, but, as I said, handlers from ata_port_operations do not
> acquire any lock. Thus, the interrupt mask may be set incorrectly if the are
> two conflicting modifications.

It depends on which operations.  Most are only called from EH context
and racing there isn't likely to cause any actual issues.  Care to
submit a patch to fix the issue?

Thanks.

-- 
tejun

Re: [PATCH v5 03/14] arm64/numa: add nid check for memory block

2016-08-09 Thread Leizhen (ThunderTown)



On 2016/8/10 10:12, Hanjun Guo wrote:
> On 2016/8/8 17:18, Zhen Lei wrote:
>> Use the same tactic to cpu and numa-distance nodes.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  arch/arm64/mm/numa.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>> index c7fe3ec..2601660 100644
>> --- a/arch/arm64/mm/numa.c
>> +++ b/arch/arm64/mm/numa.c
>> @@ -141,6 +141,11 @@ int __init numa_add_memblk(int nid, u64 start, u64 end)
>>  {
>>  int ret;
>>
>> +if (nid >= MAX_NUMNODES) {
>> +pr_warn("NUMA: Node id %u exceeds maximum value\n", nid);
>> +return -EINVAL;
>> +}
> 
> I think this check should be added to of_numa_parse_memory_nodes(), which is
> called before numa_add_memblk(). That is the same logic as in
> of_numa_parse_cpu_nodes(), and in the ACPI case the node id is also checked
> before numa_add_memblk() is called.

Yes, you are right. This check is arch independent.
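
A small sketch of that layering, assuming hypothetical stand-ins
(toy_parse_memory_node() for the generic parser, toy_numa_add_memblk() for the
arch hook, TOY_MAX_NUMNODES for MAX_NUMNODES):

```c
#include <errno.h>

#define TOY_MAX_NUMNODES 4	/* stand-in for MAX_NUMNODES */

struct toy_memblk {
	unsigned int nid;
	unsigned long long start, end;
};

static int toy_blocks_added;

/* Arch hook stand-in: assumes its caller has already validated nid. */
static int toy_numa_add_memblk(unsigned int nid, unsigned long long start,
			       unsigned long long end)
{
	(void)nid;
	(void)start;
	(void)end;
	toy_blocks_added++;
	return 0;
}

/* Generic parser stand-in: rejects out-of-range node ids before the arch hook runs. */
static int toy_parse_memory_node(const struct toy_memblk *blk)
{
	if (blk->nid >= TOY_MAX_NUMNODES)
		return -EINVAL;
	return toy_numa_add_memblk(blk->nid, blk->start, blk->end);
}
```

With the check in the generic parser, every architecture gets it without
repeating it in its own hook.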

> 
> Thanks
> Hanjun
> 

Re: [PATCH 1/7] ima: on soft reboot, restore the measurement list

2016-08-09 Thread Michael Ellerman

Thiago Jung Bauermann  writes:

> On Tuesday, 09 August 2016, 09:01:13, Mimi Zohar wrote:
>> On Tue, 2016-08-09 at 20:59 +1000, Michael Ellerman wrote:
>> > Mimi Zohar  writes:
>> > > diff --git a/security/integrity/ima/ima.h
>> > > b/security/integrity/ima/ima.h
>> > > index b5728da..84e8d36 100644
>> > > --- a/security/integrity/ima/ima.h
>> > > +++ b/security/integrity/ima/ima.h
>> > > @@ -102,6 +102,13 @@ struct ima_queue_entry {
>> > > 
>> > >  };
>> > >  extern struct list_head ima_measurements;   /* list of all measurements */
>> > > 
>> > > +/* Some details preceding the binary serialized measurement list */
>> > > +struct ima_kexec_hdr {
>> > > +unsigned short version;
>> > > +unsigned long buffer_size;
>> > > +unsigned long count;
>> > > +} __packed;
>> > > +
>> > 
>> > Am I understanding it correctly that this structure is passed between
>> > kernels?
>> Yes, the header prefixes the measurement list, which is being passed on
>> the same computer to the next kernel.  Could the architecture (eg.
>> LE/BE) change between soft re-boots?
>
> Yes. I am able to boot a BE kernel from an LE kernel with my patches. 
> Whether we want to support that or not is another question...

Yes you must support that. BE -> LE and vice versa.

You should also consider the possibility that the next kernel is not
Linux.
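
One common way to meet that requirement is to give the header fixed-width
fields and a fixed byte order. The following userspace sketch is only an
illustration under assumptions: struct toy_kexec_hdr, the field widths and the
little-endian choice are hypothetical, not the format the patch ends up with.
In the kernel one would normally use __le16/__le64 fields with cpu_to_le64()
and friends instead of open-coded shifts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with fixed-width fields (no "unsigned long"). */
struct toy_kexec_hdr {
	uint16_t version;
	uint64_t buffer_size;
	uint64_t count;
};

#define TOY_HDR_WIRE_SIZE 18	/* 2 + 8 + 8 bytes, no padding */

/* Store the low n bytes of v in little-endian order. */
static void put_le(uint8_t *p, uint64_t v, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Load n little-endian bytes into a host-order value. */
static uint64_t get_le(const uint8_t *p, size_t n)
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

static void toy_hdr_encode(const struct toy_kexec_hdr *h,
			   uint8_t out[TOY_HDR_WIRE_SIZE])
{
	put_le(out, h->version, 2);
	put_le(out + 2, h->buffer_size, 8);
	put_le(out + 10, h->count, 8);
}

static void toy_hdr_decode(const uint8_t in[TOY_HDR_WIRE_SIZE],
			   struct toy_kexec_hdr *h)
{
	h->version = (uint16_t)get_le(in, 2);
	h->buffer_size = get_le(in + 2, 8);
	h->count = get_le(in + 10, 8);
}
```

Because the byte layout no longer depends on the host's endianness or word
size, a BE, LE, 32-bit or non-Linux successor can parse the same buffer.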

cheers

Re: [PATCH] w1: fix timeout_us parameter description

2016-08-09 Thread Wei Yongjun

Hi

On 08/10/2016 06:54 AM, Evgeniy Polyakov wrote:
> Hi
>
> 08.08.2016, 16:52, "Wei Yongjun" :
>> Fix 'timeout_us' parameter description.
>>
>> Signed-off-by: Wei Yongjun 
>> ---
>>  drivers/w1/w1.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/w1/w1.c b/drivers/w1/w1.c
>> index bb34362..e213c67 100644
>> --- a/drivers/w1/w1.c
>> +++ b/drivers/w1/w1.c
>> @@ -53,8 +53,8 @@ int w1_max_slave_ttl = 10;
>>  module_param_named(timeout, w1_timeout, int, 0);
>>  MODULE_PARM_DESC(timeout, "time in seconds between automatic slave searches");
>>  module_param_named(timeout_us, w1_timeout_us, int, 0);
>> -MODULE_PARM_DESC(timeout, "time in microseconds between automatic slave"
>> - " searches");
>> +MODULE_PARM_DESC(timeout_us,
>> + "time in microseconds between automatic slave searches");
> I believe there is no harm in putting it on one line, even if it crosses
> some obscure very-long-line rule

Maybe the poor patch description confused you. This patch fixes the typo in
the first argument of MODULE_PARM_DESC(): it should be timeout_us, not timeout.

Re: [PATCH resend 5/5] libata-scsi: fix MODE SELECT translation for Control mode page

2016-08-09 Thread Tejun Heo

On Fri, Jul 22, 2016 at 02:41:54AM +0800, tom.t...@gmail.com wrote:
> From: Tom Yan 
> 
> scsi_done() was called repeatedly and apparently because of that,
> the kernel would call trace when we touch the Control mode page:
> 
> Call Trace:
>  [] dump_stack+0x63/0x81
>  [] __warn+0xcb/0xf0
>  [] warn_slowpath_null+0x1d/0x20
>  [] ata_eh_finish+0xe0/0xf0 [libata]
>  [] sata_pmp_error_handler+0x640/0xa50 [libata]
>  [] ahci_error_handler+0x1d/0x70 [libahci]
>  [] ata_scsi_port_error_handler+0x430/0x770 [libata]
>  [] ? ata_scsi_cmd_error_handler+0xdd/0x160 [libata]
>  [] ata_scsi_error+0xa7/0xf0 [libata]
>  [] scsi_error_handler+0xaa/0x560 [scsi_mod]
>  [] ? scsi_eh_get_sense+0x180/0x180 [scsi_mod]
>  [] kthread+0xd8/0xf0
>  [] ret_from_fork+0x1f/0x40
>  [] ? kthread_worker_fn+0x170/0x170
> ---[ end trace 8b7501047e928a17 ]---
> 
> Removed the unnecessary code and let ata_scsi_translate() do the job.
> 
> Also, since ata_mselect_control() has no ATA command to send to the
> device, ata_scsi_mode_select_xlat() should return 1 for it, so that
> ata_scsi_translate() will finish early to avoid ata_qc_issue().
> 
> Signed-off-by: Tom Yan 

Applied to libata/for-4.9.

Thanks.

-- 
tejun

[PATCH] FUSE: add the async option for the flush/release operation

2016-08-09 Thread Enke Chen

Hi, Miklos:

This patch adds the async option for the flush/release operation in FUSE.

The async flush/release option allows a FUSE-based application to be terminated
without being blocked in the flush/release operation even in the presence of
complex external interactions. In addition, the async operation can be more
efficient when a large number of fuse-based files is involved.

---
Deadlock Example:

Process A is a multi-threaded application that interacts with Process B,
a FUSE-server.


                 UNIX-domain socket
   App (A)  ----------------------  FUSE-server (B)
      |                                    |
      |                                    |
      |                                    |
      +------------------------------------+
                 open/flush/release

When the FUSE-server receives open and flush/release operations from
Process A, it in turn interacts with Process A (e.g., coordinating
shared memory allocation and de-allocation) using the connection-oriented
UNIX-domain socket.

A deadlock occurs when Process A is terminating:

  1) As part of process termination (i.e., do_exit() in the kernel), it
 would send "flush/release" to Process B, and wait for its reply due
 to the synchronous nature of the operation.

  2) When Process B receives the "flush/release" request, it would in turn
 send a message to Process A (over the UNIX-domain channel) and wait
 for its reply.

  3) As Process A is terminating, it may not be able to reply to Process B,
 resulting in a deadlock.

   The async flush/release option offers a simple and robust solution to the
   deadlock issue.

   With the async flush/release operation, all the files and sockets in Process
   A can be closed without being blocked, which in turn would un-block the
   operation in Process B using the UNIX-domain socket.
---

Signed-off-by: Enke Chen 

Version: 4.7.0_next_20160805

 fs/fuse/file.c|   39 +++
 fs/fuse/fuse_i.h  |4 
 fs/fuse/inode.c   |4 +++-
 include/uapi/linux/fuse.h |7 ++-
 4 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index f394aff..7dd144f 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -273,7 +273,8 @@ void fuse_release_common(struct file *file, int opcode)
 * synchronous RELEASE is allowed (and desirable) in this case
 * because the server can be trusted not to screw up.
 */
-   fuse_file_put(ff, ff->fc->destroy_req != NULL);
+   fuse_file_put(ff, (ff->fc->destroy_req != NULL) &&
+ !ff->fc->async_flush);
 }
 
 static int fuse_open(struct inode *inode, struct file *file)
@@ -394,13 +395,19 @@ static void fuse_sync_writes(struct inode *inode)
fuse_release_nowrite(inode);
 }
 
+static void fuse_flush_end(struct fuse_conn *fc, struct fuse_req *req)
+{
+   if (req->out.h.error == -ENOSYS)
+   fc->no_flush = 1;
+}
+
 static int fuse_flush(struct file *file, fl_owner_t id)
 {
struct inode *inode = file_inode(file);
struct fuse_conn *fc = get_fuse_conn(inode);
struct fuse_file *ff = file->private_data;
struct fuse_req *req;
-   struct fuse_flush_in inarg;
+   struct fuse_flush_in *inarg;
int err;
 
if (is_bad_inode(inode))
@@ -423,20 +430,28 @@ static int fuse_flush(struct file *file, fl_owner_t id)
 
req = fuse_get_req_nofail_nopages(fc, file);
memset(&inarg, 0, sizeof(inarg));
-   inarg.fh = ff->fh;
-   inarg.lock_owner = fuse_lock_owner_id(fc, id);
+   inarg = &req->misc.flush_in;
+   inarg->fh = ff->fh;
+   inarg->lock_owner = fuse_lock_owner_id(fc, id);
req->in.h.opcode = FUSE_FLUSH;
req->in.h.nodeid = get_node_id(inode);
req->in.numargs = 1;
-   req->in.args[0].size = sizeof(inarg);
-   req->in.args[0].value = &inarg;
-   __set_bit(FR_FORCE, &req->flags);
-   fuse_request_send(fc, req);
-   err = req->out.h.error;
-   fuse_put_request(fc, req);
-   if (err == -ENOSYS) {
-   fc->no_flush = 1;
+   req->in.args[0].size = sizeof(struct fuse_flush_in);
+   req->in.args[0].value = inarg;
+   if (fc->async_flush) {
+   req->end = fuse_flush_end;
+   __set_bit(FR_BACKGROUND, &req->flags);
+   fuse_request_send_background(fc, req);
err = 0;
+   } else {
+   __set_bit(FR_FORCE, &req->flags);
+   fuse_request_send(fc, req);
+   err = req->out.h.error;
+   fuse_put_request(fc, req);
+   if (err == -ENOSYS) {
+   fc->no_flush = 1;
+   err = 0;
+   }
}
return err;
 }
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index d98d8cc..f212cdd 100644
--- a/fs/fuse/fuse_i.h
++

Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-09 Thread Shaun Tancheff

On Tue, Aug 9, 2016 at 1:47 AM, Hannes Reinecke  wrote:
> On 08/05/2016 10:35 PM, Shaun Tancheff wrote:
>> On Tue, Aug 2, 2016 at 8:29 PM, Damien Le Moal wrote:
>>>> On Aug 2, 2016, at 23:35, Hannes Reinecke wrote:
>>>> On 08/01/2016 07:07 PM, Shaun Tancheff wrote:
>>>>> On Mon, Aug 1, 2016 at 4:41 AM, Christoph Hellwig wrote:

[trim]
>> Also the zone report is 'slow' in that there is an overhead for the
>> report itself, but the number of zones per query can be quite large, so
>> 4 or 5 I/Os that run into the hundreds of milliseconds to cache the
>> entire drive isn't really unworkable for something that is used
>> infrequently.
>>
> No, surely not.
> But one of the _big_ advantages for the RB tree is blkdev_discard().
> Without the RB tree any mkfs program will issue a 'discard' for every
> sector. We will be able to coalesce those into one discard per zone, but
> we still need to issue one for _every_ zone.
> Which is (as indicated) really slow, and easily takes several minutes.
> With the RB tree we can short-circuit discards to empty zones, and speed
> up processing time dramatically.
> Sure we could be moving the logic into mkfs and friends, but that would
> require us to change the programs and agree on a library (libzbc?) which
> should be handling that.
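
A userspace sketch of the short-circuit described above, under assumptions:
the cache is modelled as a plain array rather than an RB tree, and
struct toy_zone_info and toy_discard_zones() are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical cached zone state. */
struct toy_zone_info {
	unsigned long long start;	/* first sector of the zone */
	unsigned long long wp;		/* write pointer; equals start when the zone is empty */
};

/* Reset every non-empty zone and return how many resets were actually needed. */
static size_t toy_discard_zones(struct toy_zone_info *zones, size_t nr)
{
	size_t i, issued = 0;

	for (i = 0; i < nr; i++) {
		if (zones[i].wp == zones[i].start)
			continue;	/* already empty: skip the device command */
		zones[i].wp = zones[i].start;	/* models a reset write pointer */
		issued++;
	}
	return issued;
}
```

On a freshly reset drive nearly every zone takes the skip path, which is where
the saving for mkfs-style full-device discards would come from.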

Adding an additional library dependency seems overkill for a program
that is already doing ioctls and raw block I/O ... but I would leave that
up to each file system. As it sits issuing the ioctl and walking the array
of data returned [see blkreport.c] is already quite trivial.

I believe the goal here is for F2FS, and perhaps NILFS? to "just
work" with the DISCARD to Reset WP and zone cache in place.

Still quite skeptical about other common file systems
"just working" without their respective mkfs et al. being
zone aware and handling the topology of the media at mkfs time.
Perhaps there is something I am unaware of?

[trim]

>> I can add finish zone ... but I really can't think of a use for it, myself.
>>
> Which is not the point. The standard defines this, so clearly someone
> found it a reasonable addendum. So let's add this for completeness.

Agreed and queued for the next version.

Regards,
Shaun

[PATCH v5 8/8] drm/rockchip: Add dmc notifier in vop driver

2016-08-09 Thread Lin Huang

While DDR frequency scaling is in progress, the VOP must not be
enabled or disabled, because the DCF bases frequency scaling on the
VOP vblank time and needs the VOP IRQ whenever a VOP is enabled. So
register a devfreq notifier, which lets us get the DMC status. Also,
when two VOPs are enabled, we need to disable the DMC, because the DCF
is based on the vblank time of only one VOP, so the other panel would
flicker during DDR frequency scaling.

Signed-off-by: Lin Huang 
---
Changes in v5:
- improve some nits

Changes in v4:
- register notifier to devfreq_register_notifier
- use DEVFREQ_PRECHANGE and DEVFREQ_POSTCHANGE to get dmc status
- when two vop enable, disable dmc
- when two vop back to one vop, enable dmc

Changes in v3:
- when doing vop enable/disable, dmc will wait until it finishes

Changes in v2:
- None

Changes in v1:
- use wait_event instead of usleep

 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 128 +++-
 1 file changed, 125 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index 31744fe..7ce3890 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -12,6 +12,8 @@
  * GNU General Public License for more details.
  */
 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -118,6 +120,13 @@ struct vop {
 
const struct vop_data *data;
 
+   struct devfreq *devfreq;
+   struct devfreq_event_dev *devfreq_event_dev;
+   struct notifier_block dmc_nb;
+   int dmc_in_process;
+   int vop_switch_status;
+   wait_queue_head_t wait_dmc_queue;
+   wait_queue_head_t wait_vop_switch_queue;
uint32_t *regsbak;
void __iomem *regs;
 
@@ -428,21 +437,59 @@ static void vop_dsp_hold_valid_irq_disable(struct vop *vop)
spin_unlock_irqrestore(&vop->irq_lock, flags);
 }
 
+static int dmc_notify(struct notifier_block *nb, unsigned long event,
+ void *data)
+{
+   struct vop *vop = container_of(nb, struct vop, dmc_nb);
+
+   if (event == DEVFREQ_PRECHANGE) {
+   /*
+* check if vop in enable or disable process,
+* if yes, wait until it finishes, use 200ms as
+* timeout.
+*/
+   if (!wait_event_timeout(vop->wait_vop_switch_queue,
+   !vop->vop_switch_status, HZ / 5))
+   dev_warn(vop->dev,
+"Timeout waiting for vop switch status\n");
+   vop->dmc_in_process = 1;
+   } else if (event == DEVFREQ_POSTCHANGE) {
+   vop->dmc_in_process = 0;
+   wake_up(&vop->wait_dmc_queue);
+   }
+
+   return NOTIFY_OK;
+}
+
 static void vop_enable(struct drm_crtc *crtc)
 {
struct vop *vop = to_vop(crtc);
+   int num_enabled_crtc = 0;
int ret;
 
+   if (vop->is_enabled)
+   return;
+
+   /*
+* if dmc frequency scaling is in progress, wait until it finishes,
+* use 200ms (HZ / 5) as timeout.
+*/
+   if (!wait_event_timeout(vop->wait_dmc_queue,
+   !vop->dmc_in_process, HZ / 5))
+   dev_warn(vop->dev,
+"Timeout waiting for dmc when vop enable\n");
+
+   vop->vop_switch_status = 1;
ret = pm_runtime_get_sync(vop->dev);
if (ret < 0) {
dev_err(vop->dev, "failed to get pm runtime: %d\n", ret);
-   return;
+   goto err;
}
 
ret = clk_enable(vop->hclk);
if (ret < 0) {
dev_err(vop->dev, "failed to enable hclk - %d\n", ret);
-   return;
+   goto err;
}
 
ret = clk_enable(vop->dclk);
@@ -456,7 +503,6 @@ static void vop_enable(struct drm_crtc *crtc)
dev_err(vop->dev, "failed to enable aclk - %d\n", ret);
goto err_disable_dclk;
}
-
/*
 * Slave iommu shares power, irq and clock with vop.  It was associated
 * automatically with this master device via common driver code.
@@ -485,6 +531,21 @@ static void vop_enable(struct drm_crtc *crtc)
 
drm_crtc_vblank_on(crtc);
 
+   vop->vop_switch_status = 0;
+   wake_up(&vop->wait_vop_switch_queue);
+
+   /* check how many vop we use now */
+   drm_for_each_crtc(crtc, vop->drm_dev) {
+   if (crtc->state->enable)
+   num_enabled_crtc++;
+   }
+
+   /* if enable two vop, need to disable dmc */
+   if ((num_enabled_crtc > 1) && vop->devfreq) {
+   if (vop->devfreq_event_dev)
+   devfreq_event_disable_edev(vop->devfreq_event_dev);
+   devfreq_suspend_device(vop->devfreq);
+   }
return;
 
 err_disable_aclk:
@@ -493,16 +554,32 @@ err_disable_dclk:
clk_disable(vop->dclk);
 err_disable_hclk:
clk_disable(vop->hclk
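
The two-way handshake that the 8/8 commit message describes (the notifier waits for a vop switch to finish, and vop_enable() waits for a dmc change to finish) can be sketched in userspace C. This is only an illustration under stated assumptions: struct gate, gate_init(), gate_set() and gate_wait_clear() are hypothetical names, and a POSIX mutex plus condition variable stands in for the kernel wait queue. None of it is taken from the driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for one flag plus its wait queue. */
struct gate {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int busy;	/* plays the role of dmc_in_process or vop_switch_status */
};

static void gate_init(struct gate *g)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->cond, NULL);
	g->busy = 0;
}

/* Mark the operation as started (busy = 1) or finished (busy = 0). */
static void gate_set(struct gate *g, int busy)
{
	pthread_mutex_lock(&g->lock);
	g->busy = busy;
	if (!busy)
		pthread_cond_broadcast(&g->cond);	/* comparable to wake_up() */
	pthread_mutex_unlock(&g->lock);
}

/*
 * Wait until the flag clears or timeout_ms elapses.
 * Returns 1 if the flag is clear and 0 on timeout, which is how the
 * patch interprets the wait_event_timeout() result.
 */
static int gate_wait_clear(struct gate *g, long timeout_ms)
{
	struct timespec deadline;
	int rc = 0;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&g->lock);
	while (g->busy && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&g->cond, &g->lock, &deadline);
	rc = !g->busy;
	pthread_mutex_unlock(&g->lock);
	return rc;
}
```

In this model the frequency-change side would call gate_wait_clear() on the "vop switch" gate before setting its own gate busy, and the vop side would do the reverse. On timeout both sides proceed anyway, as the patch does after printing a warning.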

[PATCH v5 6/8] Documentation: bindings: add dt documentation for rk3399 dmc

2016-08-09 Thread Lin Huang

This patch adds the documentation for rockchip rk3399 dmc driver.

Signed-off-by: Lin Huang 
---
Changes in v5:
-None

Changes in v4:
-None

Changes in v3:
-None

Changes in v2:
-None 

Changes in v1:
-None

 .../devicetree/bindings/devfreq/rk3399_dmc.txt | 35 ++
 1 file changed, 35 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt

diff --git a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
new file mode 100644
index 000..90e9581
--- /dev/null
+++ b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
@@ -0,0 +1,35 @@
+* Rockchip rk3399 dmc device
+
+Required properties:
+- compatible: Must be "rockchip,rk3399-dmc".
+- devfreq-events: Node to get DDR load. Refer to
+ Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
+- interrupts: The interrupt number to the cpu. The interrupt specifier format
+ depends on the interrupt controller. 
+- clocks: Phandles for clock specified in "clock-names" property
+- clock-names : The name of clock used by the DMC, must be "dmc_clk";
+- operating-points-v2: Refer to Documentation/devicetree/bindings/power/opp.txt
+  for details.
+- center-supply: Dmc supply node.
+- status: Marks the node enabled/disabled.
+
+Optional properties:
+- ddr_timing: DDR timing parameters to pass to ARM Trusted Firmware
+- upthreshold: The upthreshold for the simple-ondemand policy
+- downdifferential: The downdifferential for the simple-ondemand policy
+
+Example:
+   dmc: dmc {
+   compatible = "rockchip,rk3399-dmc";
+   devfreq-events = <&dfi>;
+   interrupts = ;
+   clocks = <&cru SCLK_DDRCLK>;
+   clock-names = "dmc_clk";
+   ddr_timing = <&ddr_timing>;
+   operating-points-v2 = <&dmc_opp_table>;
+   center-supply = <&ppvar_centerlogic>;
+   upthreshold = <15>;
+   downdifferential = <10>;
+   status = "disabled";
+   };
+
-- 
1.9.1
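
For context on the upthreshold and downdifferential properties in the binding above, here is a rough userspace sketch of how a simple-ondemand style policy could turn a load sample into a target frequency. It is modeled on my reading of the devfreq simple-ondemand governor and should be treated as an approximation, not as the governor's code. struct load_sample and ondemand_target() are hypothetical names, and the caller is assumed to pass upthreshold > downdifferential.

```c
#include <stdint.h>

/* Hypothetical snapshot of what a load monitor such as the DFI reports. */
struct load_sample {
	uint64_t busy;		/* busy (access) count in the sampling window */
	uint64_t total;		/* total count in the sampling window */
	uint64_t cur_hz;	/* current frequency in Hz */
};

static uint64_t ondemand_target(const struct load_sample *s, uint64_t max_hz,
				unsigned int upthreshold,
				unsigned int downdifferential)
{
	/* Busy enough, or nothing known yet: go to the maximum. */
	if (s->total == 0 || s->cur_hz == 0 ||
	    s->busy * 100 > s->total * upthreshold)
		return max_hz;

	/* Inside the hysteresis band: keep the current frequency. */
	if (s->busy * 100 > s->total * (upthreshold - downdifferential))
		return s->cur_hz;

	/* Otherwise scale down in proportion to the measured load. */
	return s->busy * s->cur_hz / s->total * 100 /
	       (upthreshold - downdifferential / 2);
}
```

With the example values from the binding (upthreshold = 15, downdifferential = 10), a load above 15% requests the maximum, a load between 5% and 15% keeps the current rate, and anything lower scales the rate down.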

[PATCH v5 0/8] rk3399 support ddr frequency scaling

2016-08-09 Thread Lin Huang

The rk3399 platform has a dfi controller that monitors the ddr load
and a dcf controller that programs the ddr registers, so we can pick
the right ddr frequency and keep the ddr controller working correctly
(the dcf flow is implemented in bl31). We do ddr frequency scaling with
the following flow:

     kernel                          bl31

monitor ddr load
        |
        |
get_target_rate
        |
        |         pass rate to bl31
clk_set_rate(ddr) --------------> run dcf flow
        |                             |
        |                             |
wait dcf interrupt <------------- trigger dcf interrupt
        |
        |
      return

Lin Huang (8):
  clk: rockchip: add new clock-type for the ddrclk
  clk: rockchip: rk3399: add SCLK_DDRCLK ID for ddrc
  clk: rockchip: rk3399: add ddrc clock support
  Documentation: bindings: add dt documentation for dfi controller
  PM / devfreq: event: support rockchip dfi controller
  Documentation: bindings: add dt documentation for rk3399 dmc
  PM / devfreq: rockchip: add devfreq driver for rk3399 dmc
  drm/rockchip: Add dmc notifier in vop driver

 .../bindings/devfreq/event/rockchip-dfi.txt|  20 +
 .../devicetree/bindings/devfreq/rk3399_dmc.txt |  35 ++
 drivers/clk/rockchip/Makefile  |   1 +
 drivers/clk/rockchip/clk-ddr.c | 152 ++
 drivers/clk/rockchip/clk-rk3399.c  |  19 +
 drivers/clk/rockchip/clk.c |   9 +
 drivers/clk/rockchip/clk.h |  33 ++
 drivers/devfreq/Kconfig|   9 +
 drivers/devfreq/Makefile   |   1 +
 drivers/devfreq/event/Kconfig  |   7 +
 drivers/devfreq/event/Makefile |   1 +
 drivers/devfreq/event/rockchip-dfi.c   | 253 ++
 drivers/devfreq/rk3399_dmc.c   | 512 +
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c| 128 +-
 include/dt-bindings/clock/rk3399-cru.h |   1 +
 include/soc/rockchip/rockchip_sip.h|  27 ++
 16 files changed, 1205 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
 create mode 100644 Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
 create mode 100644 drivers/clk/rockchip/clk-ddr.c
 create mode 100644 drivers/devfreq/event/rockchip-dfi.c
 create mode 100644 drivers/devfreq/rk3399_dmc.c
 create mode 100644 include/soc/rockchip/rockchip_sip.h

-- 
1.9.1

[PATCH v5 1/8] clk: rockchip: add new clock-type for the ddrclk

2016-08-09 Thread Lin Huang

On new Rockchip platforms (rk3399 etc.), a dcf controller does the
ddr frequency scaling, and this controller is driven from
arm-trust-firmware. Add a special clock-type to handle that.

Signed-off-by: Lin Huang 
---
Changes in v5:
- delete unused mux_flag
- use div_flag to distinguish sip call and other operations

Changes in v4:
- use arm_smccc_smc() to set/read ddr rate

Changes in v3:
- use sip call to set/read ddr rate

Changes in v2:
- use GENMASK instead of val_mask
- use divider_recalc_rate() instead of DIV_ROUND_UP_ULL
- cleanup code

Changes in v1:
- None

 drivers/clk/rockchip/Makefile   |   1 +
 drivers/clk/rockchip/clk-ddr.c  | 152 
 drivers/clk/rockchip/clk.c  |   9 +++
 drivers/clk/rockchip/clk.h  |  33 
 include/soc/rockchip/rockchip_sip.h |  27 +++
 5 files changed, 222 insertions(+)
 create mode 100644 drivers/clk/rockchip/clk-ddr.c
 create mode 100644 include/soc/rockchip/rockchip_sip.h

diff --git a/drivers/clk/rockchip/Makefile b/drivers/clk/rockchip/Makefile
index f47a2fa..b5f2c8e 100644
--- a/drivers/clk/rockchip/Makefile
+++ b/drivers/clk/rockchip/Makefile
@@ -8,6 +8,7 @@ obj-y   += clk-pll.o
 obj-y  += clk-cpu.o
 obj-y  += clk-inverter.o
 obj-y  += clk-mmc-phase.o
+obj-y  += clk-ddr.o
 obj-$(CONFIG_RESET_CONTROLLER) += softrst.o
 
 obj-y  += clk-rk3036.o
diff --git a/drivers/clk/rockchip/clk-ddr.c b/drivers/clk/rockchip/clk-ddr.c
new file mode 100644
index 000..756e1ad
--- /dev/null
+++ b/drivers/clk/rockchip/clk-ddr.c
@@ -0,0 +1,152 @@
+/*
+ * Copyright (c) 2016 Rockchip Electronics Co. Ltd.
+ * Author: Lin Huang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "clk.h"
+
+struct rockchip_ddrclk {
+   struct clk_hw   hw;
+   void __iomem    *reg_base;
+   int             mux_offset;
+   int             mux_shift;
+   int             mux_width;
+   int             div_shift;
+   int             div_width;
+   int             ddr_flag;
+   spinlock_t      *lock;
+};
+
+#define to_rockchip_ddrclk_hw(hw) container_of(hw, struct rockchip_ddrclk, hw)
+
+static int rockchip_ddrclk_set_rate(struct clk_hw *hw, unsigned long drate,
+   unsigned long prate)
+{
+   struct rockchip_ddrclk *ddrclk = to_rockchip_ddrclk_hw(hw);
+   unsigned long flags;
+   struct arm_smccc_res res;
+   int ret = -EINVAL;
+
+   spin_lock_irqsave(ddrclk->lock, flags);
+   if (ddrclk->ddr_flag == ROCKCHIP_DDRCLK_SIP) {
+   arm_smccc_smc(SIP_DDR_FREQ, drate, 0, CONFIG_DRAM_SET_RATE,
+ 0, 0, 0, 0, &res);
+   ret = res.a0;
+   }
+   spin_unlock_irqrestore(ddrclk->lock, flags);
+
+   return ret;
+}
+
+static unsigned long
+rockchip_ddrclk_recalc_rate(struct clk_hw *hw,
+   unsigned long parent_rate)
+{
+   struct rockchip_ddrclk *ddrclk = to_rockchip_ddrclk_hw(hw);
+   struct arm_smccc_res res;
+   int ret = 0;
+
+   if (ddrclk->ddr_flag == ROCKCHIP_DDRCLK_SIP) {
+   arm_smccc_smc(SIP_DDR_FREQ, 0, 0, CONFIG_DRAM_GET_RATE,
+ 0, 0, 0, 0, &res);
+   ret = res.a0;
+   }
+
+   return ret;
+}
+
+static long clk_ddrclk_round_rate(struct clk_hw *hw, unsigned long rate,
+ unsigned long *prate)
+{
+   return rate;
+}
+
+static u8 rockchip_ddrclk_get_parent(struct clk_hw *hw)
+{
+   struct rockchip_ddrclk *ddrclk = to_rockchip_ddrclk_hw(hw);
+   int num_parents = clk_hw_get_num_parents(hw);
+   u32 val;
+
+   val = clk_readl(ddrclk->reg_base +
+   ddrclk->mux_offset) >> ddrclk->mux_shift;
+   val &= GENMASK(ddrclk->mux_width - 1, 0);
+
+   if (val >= num_parents)
+   return -EINVAL;
+
+   return val;
+}
+
+static const struct clk_ops rockchip_ddrclk_ops = {
+   .recalc_rate = rockchip_ddrclk_recalc_rate,
+   .set_rate = rockchip_ddrclk_set_rate,
+   .round_rate = clk_ddrclk_round_rate,
+   .get_parent = rockchip_ddrclk_get_parent,
+};
+
+struct clk *rockchip_clk_register_ddrclk(const char *name, int flags,
+const char *const *parent_names,
+u8 num_parents, int mux_offset,
+int mux_shift, int mux_width,
+
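
As a side note on rockchip_ddrclk_get_parent() in the patch above, the field extraction and the u8 return type can be checked with a small userspace sketch. GENMASK32(), mux_field_to_parent() and einval_as_u8() are hypothetical helpers written for this illustration (GENMASK32 mimics the kernel's GENMASK for 32-bit values), and the shift and width values used in the usage note come from the COMPOSITE_DDRCLK() line in patch 3/8.

```c
#include <stdint.h>

/* Userspace equivalent of the kernel's GENMASK(h, l) for 32-bit values. */
#define GENMASK32(h, l) \
	(((~UINT32_C(0)) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/*
 * Extract a mux field from a register value. Returns the parent index,
 * or -1 if the field selects a parent that does not exist.
 * Assumes 1 <= width <= 32.
 */
static int mux_field_to_parent(uint32_t reg, unsigned int shift,
			       unsigned int width, unsigned int num_parents)
{
	uint32_t val = (reg >> shift) & GENMASK32(width - 1, 0);

	return val < num_parents ? (int)val : -1;
}

/* What a u8-returning callback actually hands back for -EINVAL (-22). */
static uint8_t einval_as_u8(void)
{
	return (uint8_t)-22;
}
```

With mux_shift = 4 and mux_width = 2, a register value of 0x20 selects parent index 2. The second helper shows that a negative errno cannot survive a u8 return type: the caller sees 234 rather than an error, which may be worth keeping in mind when reviewing the range check.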

[PATCH v5 2/8] clk: rockchip: rk3399: add SCLK_DDRCLK ID for ddrc

2016-08-09 Thread Lin Huang

Signed-off-by: Lin Huang 
---
Changes in v5:
-None
Changes in v4:
-None

Changes in v3:
-None

Changes in v2:
-None 

Changes in v1:
-None

 include/dt-bindings/clock/rk3399-cru.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/dt-bindings/clock/rk3399-cru.h b/include/dt-bindings/clock/rk3399-cru.h
index 50a44cf..ce5f3e9 100644
--- a/include/dt-bindings/clock/rk3399-cru.h
+++ b/include/dt-bindings/clock/rk3399-cru.h
@@ -131,6 +131,7 @@
 #define SCLK_DPHY_RX0_CFG  165
 #define SCLK_RMII_SRC  166
 #define SCLK_PCIEPHY_REF100M   167
+#define SCLK_DDRC  168
 
 #define DCLK_VOP0  180
 #define DCLK_VOP1  181
-- 
1.9.1

[PATCH v5 4/8] Documentation: bindings: add dt documentation for dfi controller

2016-08-09 Thread Lin Huang

This patch adds the documentation for rockchip dfi devfreq-event driver.

Signed-off-by: Lin Huang 
---
Changes in v5:
-None

Changes in v4:
-None

Changes in v3:
-None

Changes in v2:
-None 

Changes in v1:
-None

 .../bindings/devfreq/event/rockchip-dfi.txt  | 20 
 1 file changed, 20 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt

diff --git a/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt b/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
new file mode 100644
index 000..bf42255
--- /dev/null
+++ b/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
@@ -0,0 +1,20 @@
+
+* Rockchip rk3399 DFI device
+
+Required properties:
+- compatible: Must be "rockchip,rk3399-dfi".
+- reg: physical base address of each DFI and length of memory mapped region
+- rockchip,pmu: phandle to the syscon managing the "pmu general register files"
+- clocks: phandles for clock specified in "clock-names" property
+- clock-names : the name of clock used by the DFI, must be "pclk_ddr_mon";
+
+Example:
+   dfi: dfi@0xff63 {
+   reg = <0x00 0xff63 0x00 0x4000>;
+   compatible = "rockchip,rk3399-dfi";
+   rockchip,pmu = <&pmugrf>;
+   clocks = <&cru PCLK_DDR_MON>;
+   clock-names = "pclk_ddr_mon";
+   status = "disabled";
+   };
+
-- 
1.9.1

[PATCH v5 7/8] PM / devfreq: rockchip: add devfreq driver for rk3399 dmc

2016-08-09 Thread Lin Huang

Based on the dfi result, we do ddr frequency scaling: register the
dmc driver with the devfreq framework and use the simple-ondemand
policy.

Signed-off-by: Lin Huang 
---
Changes in v5:
- improve dmc driver as suggested by Chanwoo Choi

Changes in v4:
- use arm_smccc_smc() function to talk to bl31
- delete rockchip_dmc.c file and config
- delete dmc_notify
- adjust probe order

Changes in v3:
- operate dram setting through sip call
- improve set rate flow

Changes in v2:
- None

Changes in v1:
- move dfi controller to event
- fix set voltage sequence when set rate fail
- change Kconfig type from tristate to bool
- move unused EXPORT_SYMBOL_GPL()

 drivers/devfreq/Kconfig  |   9 +
 drivers/devfreq/Makefile |   1 +
 drivers/devfreq/rk3399_dmc.c | 512 +++
 3 files changed, 522 insertions(+)
 create mode 100644 drivers/devfreq/rk3399_dmc.c

diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
index a5be56e..749499d 100644
--- a/drivers/devfreq/Kconfig
+++ b/drivers/devfreq/Kconfig
@@ -100,6 +100,15 @@ config ARM_TEGRA_DEVFREQ
  It reads ACTMON counters of memory controllers and adjusts the
  operating frequencies and voltages with OPP support.
 
+config ARM_RK3399_DMC_DEVFREQ
+   tristate "ARM RK3399 DMC DEVFREQ Driver"
+   select PM_OPP
+   select DEVFREQ_GOV_SIMPLE_ONDEMAND
+   help
+  This adds the DEVFREQ driver for the RK3399 dmc(Dynamic Memory Controller).
+  It sets the frequency for the memory controller and reads the usage counts
+  from hardware.
+
 source "drivers/devfreq/event/Kconfig"
 
 endif # PM_DEVFREQ
diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
index 09f11d9..70d9549 100644
--- a/drivers/devfreq/Makefile
+++ b/drivers/devfreq/Makefile
@@ -9,6 +9,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)   += governor_passive.o
 # DEVFREQ Drivers
 obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)   += exynos-bus.o
 obj-$(CONFIG_ARM_TEGRA_DEVFREQ)   += tegra-devfreq.o
+obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)   += rk3399_dmc.o
 
 # DEVFREQ Event Drivers
 obj-$(CONFIG_PM_DEVFREQ_EVENT) += event/
diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
new file mode 100644
index 000..c1157ba
--- /dev/null
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -0,0 +1,512 @@
+/*
+ * Copyright (c) 2016, Fuzhou Rockchip Electronics Co., Ltd.
+ * Author: Lin Huang 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+struct dram_timing {
+   unsigned int ddr3_speed_bin;
+   unsigned int pd_idle;
+   unsigned int sr_idle;
+   unsigned int sr_mc_gate_idle;
+   unsigned int srpd_lite_idle;
+   unsigned int standby_idle;
+   unsigned int dram_dll_dis_freq;
+   unsigned int phy_dll_dis_freq;
+   unsigned int ddr3_odt_dis_freq;
+   unsigned int ddr3_drv;
+   unsigned int ddr3_odt;
+   unsigned int phy_ddr3_ca_drv;
+   unsigned int phy_ddr3_dq_drv;
+   unsigned int phy_ddr3_odt;
+   unsigned int lpddr3_odt_dis_freq;
+   unsigned int lpddr3_drv;
+   unsigned int lpddr3_odt;
+   unsigned int phy_lpddr3_ca_drv;
+   unsigned int phy_lpddr3_dq_drv;
+   unsigned int phy_lpddr3_odt;
+   unsigned int lpddr4_odt_dis_freq;
+   unsigned int lpddr4_drv;
+   unsigned int lpddr4_dq_odt;
+   unsigned int lpddr4_ca_odt;
+   unsigned int phy_lpddr4_ca_drv;
+   unsigned int phy_lpddr4_ck_cs_drv;
+   unsigned int phy_lpddr4_dq_drv;
+   unsigned int phy_lpddr4_odt;
+};
+
+struct rk3399_dmcfreq {
+   struct device *dev;
+   struct devfreq *devfreq;
+   struct devfreq_simple_ondemand_data ondemand_data;
+   struct clk *dmc_clk;
+   struct devfreq_event_dev *edev;
+   struct mutex lock;
+   struct dram_timing *timing;
+
+   /*
+* DDR Converser of Frequency (DCF) is used to implement DDR frequency
+* conversion without the participation of CPU, we will implement and
+* control it in arm trust firmware.
+*/
+   wait_queue_head_t   wait_dcf_queue;
+   int irq;
+   int wait_dcf_flag;
+   struct regulator *vdd_center;
+   unsigned long rate, target_rate;
+   unsigned long volt, target_volt;
+   struct dev_pm_opp *curr_opp;
+};
+
+static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
+u32 flags)
+{
+   struct rk339

[PATCH v5 3/8] clk: rockchip: rk3399: add ddrc clock support

2016-08-09 Thread Lin Huang

Add the ddrc clock setting, so we can do ddr frequency
scaling on the rk3399 platform in the future.

Signed-off-by: Lin Huang 
---
Changes in v5:
- fit for the ddr type

Changes in v4:
- None

Changes in v3:
- None

Changes in v2:
- remove clk_ddrc_dpll_src from critical clock list

Changes in v1:
- remove ddrc source CLK_IGNORE_UNUSED flag
- move clk_ddrc and clk_ddrc_dpll_src to critical

 drivers/clk/rockchip/clk-rk3399.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/clk/rockchip/clk-rk3399.c b/drivers/clk/rockchip/clk-rk3399.c
index c109d80..b962aeb 100644
--- a/drivers/clk/rockchip/clk-rk3399.c
+++ b/drivers/clk/rockchip/clk-rk3399.c
@@ -118,6 +118,10 @@ PNAME(mux_armclkb_p)   = { "clk_core_b_lpll_src",
"clk_core_b_bpll_src",
"clk_core_b_dpll_src",
"clk_core_b_gpll_src" };
+PNAME(mux_ddrclk_p)= { "clk_ddrc_lpll_src",
+   "clk_ddrc_bpll_src",
+   "clk_ddrc_dpll_src",
+   "clk_ddrc_gpll_src" };
 PNAME(mux_aclk_cci_p)  = { "cpll_aclk_cci_src",
"gpll_aclk_cci_src",
"npll_aclk_cci_src",
@@ -1377,6 +1381,18 @@ static struct rockchip_clk_branch rk3399_clk_branches[] __initdata = {
COMPOSITE_NOMUX(0, "clk_test", "clk_test_pre", CLK_IGNORE_UNUSED,
RK3368_CLKSEL_CON(58), 0, 5, DFLAGS,
RK3368_CLKGATE_CON(13), 11, GFLAGS),
+
+   /* ddrc */
+   GATE(0, "clk_ddrc_lpll_src", "lpll", 0, RK3399_CLKGATE_CON(3),
+0, GFLAGS),
+   GATE(0, "clk_ddrc_bpll_src", "bpll", 0, RK3399_CLKGATE_CON(3),
+1, GFLAGS),
+   GATE(0, "clk_ddrc_dpll_src", "dpll", 0, RK3399_CLKGATE_CON(3),
+2, GFLAGS),
+   GATE(0, "clk_ddrc_gpll_src", "gpll", 0, RK3399_CLKGATE_CON(3),
+3, GFLAGS),
+   COMPOSITE_DDRCLK(SCLK_DDRC, "clk_ddrc", mux_ddrclk_p, 0,
+  RK3399_CLKSEL_CON(6), 4, 2, 0, 0, ROCKCHIP_DDRCLK_SIP),
 };
 
 static struct rockchip_clk_branch rk3399_clk_pmu_branches[] __initdata = {
@@ -1487,6 +1503,9 @@ static const char *const rk3399_cru_critical_clocks[] __initconst = {
"gpll_hclk_perilp1_src",
"gpll_aclk_perilp0_src",
"gpll_aclk_perihp_src",
+
+   /* ddrc */
+   "clk_ddrc"
 };
 
 static const char *const rk3399_pmucru_critical_clocks[] __initconst = {
-- 
1.9.1

[PATCH v5 5/8] PM / devfreq: event: support rockchip dfi controller

2016-08-09 Thread Lin Huang

On the rk3399 platform, there is a dfi controller that can monitor
the ddr load. Based on this result, we can do ddr frequency
scaling.

Signed-off-by: Lin Huang 
Acked-by: Chanwoo Choi 
---
Changes in v5:
-None

Changes in v4:
-None

Changes in v3:
-None

Changes in v2:
-None 

Changes in v1:
-None

 drivers/devfreq/event/Kconfig|   7 +
 drivers/devfreq/event/Makefile   |   1 +
 drivers/devfreq/event/rockchip-dfi.c | 253 +++
 3 files changed, 261 insertions(+)
 create mode 100644 drivers/devfreq/event/rockchip-dfi.c

diff --git a/drivers/devfreq/event/Kconfig b/drivers/devfreq/event/Kconfig
index eb6f74a..20d82c2 100644
--- a/drivers/devfreq/event/Kconfig
+++ b/drivers/devfreq/event/Kconfig
@@ -30,4 +30,11 @@ config DEVFREQ_EVENT_EXYNOS_PPMU
  (Platform Performance Monitoring Unit) counters to estimate the
  utilization of each module.
 
+config DEVFREQ_EVENT_ROCKCHIP_DFI
+   tristate "ROCKCHIP DFI DEVFREQ event Driver"
+   depends on ARCH_ROCKCHIP
+   help
+ This adds the devfreq-event driver for Rockchip SoCs. It provides the DFI
+ (DDR Monitor Module) driver to count ddr load.
+
 endif # PM_DEVFREQ_EVENT
diff --git a/drivers/devfreq/event/Makefile b/drivers/devfreq/event/Makefile
index 3d6afd3..dda7090 100644
--- a/drivers/devfreq/event/Makefile
+++ b/drivers/devfreq/event/Makefile
@@ -2,3 +2,4 @@
 
 obj-$(CONFIG_DEVFREQ_EVENT_EXYNOS_NOCP) += exynos-nocp.o
 obj-$(CONFIG_DEVFREQ_EVENT_EXYNOS_PPMU) += exynos-ppmu.o
+obj-$(CONFIG_DEVFREQ_EVENT_ROCKCHIP_DFI) += rockchip-dfi.o
diff --git a/drivers/devfreq/event/rockchip-dfi.c b/drivers/devfreq/event/rockchip-dfi.c
new file mode 100644
index 000..3f12be7
--- /dev/null
+++ b/drivers/devfreq/event/rockchip-dfi.c
@@ -0,0 +1,253 @@
+/*
+ * Copyright (c) 2016, Fuzhou Rockchip Electronics Co., Ltd
+ * Author: Lin Huang 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define RK3399_DMC_NUM_CH  2
+
+/* DDRMON_CTRL */
+#define DDRMON_CTRL        0x04
+#define CLR_DDRMON_CTRL    (0x1f << 0)
+#define LPDDR4_EN          (0x10001 << 4)
+#define HARDWARE_EN        (0x10001 << 3)
+#define LPDDR3_EN          (0x10001 << 2)
+#define SOFTWARE_EN        (0x10001 << 1)
+#define TIME_CNT_EN        (0x10001 << 0)
+
+#define DDRMON_CH0_COUNT_NUM   0x28
+#define DDRMON_CH0_DFI_ACCESS_NUM  0x2c
+#define DDRMON_CH1_COUNT_NUM   0x3c
+#define DDRMON_CH1_DFI_ACCESS_NUM  0x40
+
+/* pmu grf */
+#define PMUGRF_OS_REG2 0x308
+#define DDRTYPE_SHIFT  13
+#define DDRTYPE_MASK   7
+
+enum {
+   DDR3 = 3,
+   LPDDR3 = 6,
+   LPDDR4 = 7,
+   UNUSED = 0xFF
+};
+
+struct dmc_usage {
+   u32 access;
+   u32 total;
+};
+
+struct rockchip_dfi {
+   struct devfreq_event_dev *edev;
+   struct devfreq_event_desc *desc;
+   struct dmc_usage ch_usage[RK3399_DMC_NUM_CH];
+   struct device *dev;
+   void __iomem *regs;
+   struct regmap *regmap_pmu;
+   struct clk *clk;
+};
+
+static void rockchip_dfi_start_hardware_counter(struct devfreq_event_dev *edev)
+{
+   struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
+   void __iomem *dfi_regs = info->regs;
+   u32 val;
+   u32 ddr_type;
+
+   /* get ddr type */
+   regmap_read(info->regmap_pmu, PMUGRF_OS_REG2, &val);
+   ddr_type = (val >> DDRTYPE_SHIFT) & DDRTYPE_MASK;
+
+   /* clear DDRMON_CTRL setting */
+   writel_relaxed(CLR_DDRMON_CTRL, dfi_regs + DDRMON_CTRL);
+
+   /* set ddr type to dfi */
+   if (ddr_type == LPDDR3)
+   writel_relaxed(LPDDR3_EN, dfi_regs + DDRMON_CTRL);
+   else if (ddr_type == LPDDR4)
+   writel_relaxed(LPDDR4_EN, dfi_regs + DDRMON_CTRL);
+
+   /* enable count, use software mode */
+   writel_relaxed(SOFTWARE_EN, dfi_regs + DDRMON_CTRL);
+}
+
+static void rockchip_dfi_stop_hardware_counter(struct devfreq_event_dev *edev)
+{
+   struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
+   void __iomem *dfi_regs = info->regs;
+   u32 val;
+
+   val = readl_relaxed(dfi_regs + DDRMON_CTRL);
+   val &= ~SOFTWARE_EN;
+   writel_relaxed(val, dfi_regs + DDRMON_CTRL);
+}
+
+static int rockchip_dfi_get_busier_ch(struct devfreq_event_dev *edev)
+{
+   struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
+   u32 tmp, max = 0;
+   u32 i, busier_ch = 0;
+   void __iomem *dfi_regs = info->regs;
+
+   rockchip_dfi_stop_ha
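
The register arithmetic in the DFI start routine above can be checked with a short userspace sketch. ddr_type_from_os_reg2() and hiword_update() are hypothetical helpers. The shift and mask come from the patch, while the reading that the upper 16 bits of DDRMON_CTRL act as per-bit write enables is an assumption inferred from the 0x10001 pattern of the *_EN macros. The patch does not state it.

```c
#include <stdint.h>

#define DDRTYPE_SHIFT	13
#define DDRTYPE_MASK	7

/* Decode the DDR type field the way the start-counter routine does. */
static uint32_t ddr_type_from_os_reg2(uint32_t val)
{
	return (val >> DDRTYPE_SHIFT) & DDRTYPE_MASK;
}

/*
 * Build a value for a register whose upper 16 bits are assumed to be
 * per-bit write enables: the enable bit is always set, the data bit
 * only when "on" is non-zero.
 */
static uint32_t hiword_update(unsigned int bit, int on)
{
	return (UINT32_C(0x10000) << bit) | ((on ? UINT32_C(1) : 0) << bit);
}
```

Under that assumption, hiword_update(1, 1) reproduces SOFTWARE_EN (0x20002) and hiword_update(1, 0) gives 0x20000. If the assumption holds, clearing a bit needs the write-enable half to stay set, so the "val &= ~SOFTWARE_EN" step in the stop routine, which clears both halves, may deserve a second look.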

Re: Regarding AHCI_MAX_SG and (ATA_HORKAGE_MAX_SEC_1024)

2016-08-09 Thread Tejun Heo

Hello, Tom.

On Sun, Aug 07, 2016 at 10:10:17PM +0800, Tom Yan wrote:
> So the (not so) recent bump of BLK_DEF_MAX_SECTORS from 1024 to 2560
> (commit d2be537c3ba3) seemed to have caused trouble to some of the ATA
> devices, which were then worked around with ATA_HORKAGE_MAX_SEC_1024.
> 
> However, I am suspecting that the bump of BLK_DEF_MAX_SECTORS is not
> the "real" cause of the trouble, but the fact that AHCI_MAX_SG has
> been set to a weird value of 168 (with a comment "hardware max is
> 64K", which neither seem to make any sense).

Hmmm.. why not?  The hardware limit is 64k and the driver is using a
lower limit of 168 most likely because it doesn't make noticeable
difference beyond certain point and it determines the size of
contiguous memory which has to be allocated for the command table.
Each sg entry is 16 bytes.  Pushing it to the hardware limit would
require an order 9 allocation for each port.
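
The sizes mentioned above can be worked through with a few lines of C. This is a back-of-the-envelope sketch, not driver code: cmd_tbl_size() and alloc_order() are hypothetical helpers, the 16-byte entry size is taken from the paragraph above, and the 0x80-byte table header and 4096-byte page size are assumptions.

```c
#include <stddef.h>

#define SG_ENTRY_SIZE	16	/* bytes per sg entry, as stated above */
#define TBL_HDR_SIZE	0x80	/* assumed command table header size */
#define PAGE_SIZE_BYTES	4096	/* assumed page size */

/* Size in bytes of one command table holding max_sg entries. */
static size_t cmd_tbl_size(size_t max_sg)
{
	return TBL_HDR_SIZE + max_sg * SG_ENTRY_SIZE;
}

/* Smallest order such that (page size << order) >= size. */
static unsigned int alloc_order(size_t size)
{
	unsigned int order = 0;

	while (((size_t)PAGE_SIZE_BYTES << order) < size)
		order++;
	return order;
}
```

With 168 entries a table is 2816 bytes and fits in a single page. With 65535 entries it is 1048688 bytes, just over 1 MiB, so under these assumptions even a single table rounds up to an order 9 (2 MiB) allocation.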

> AHCI_MAX_SG is used to set the sg_tablesize (i.e. max_segments,
> apparently), which is apparently used to derive the actual "request
> size" (that is, if it is lower than max_sectors(_kb), it will be the
> limiting factor instead).
>
> For example, no matter if the drive has max_sectors set to 2560, or to
> 65535 (by adding it as the Optimal Transfer Length to libata's SATL,
> which is also max_hw_sectors that is set from ATA_MAX_SECTORS_LBA48),
> "avgrq-sz" in `iostat` will be capped at 1344 (168 * 8).

Not necessarily.  A single sg entry can point to an area larger than
PAGE_SIZE.
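
To make the point concrete: the 1344 and 1024 figures quoted above only follow if every sg entry covers exactly one 4 KiB page. A minimal sketch of the arithmetic, with max_request_sectors() as a hypothetical helper and 512-byte sectors assumed:

```c
#define SECTOR_SIZE 512u

/* Upper bound on request size, in sectors, for a given sg table. */
static unsigned long max_request_sectors(unsigned long segments,
					 unsigned long bytes_per_segment)
{
	return segments * (bytes_per_segment / SECTOR_SIZE);
}
```

168 segments of 4096 bytes give 1344 sectors and 128 segments give 1024, matching the observed caps. If segments were physically contiguous 64 KiB areas, the same 168 entries would cover 21504 sectors, so the sg table size alone does not bound the request size.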

> However, if I change AHCI_MAX_SG to 128 (which is also the
> sg_tablesize set in libata.h from LIBATA_MAX_PRD), "avgrq-sz" in
> `iostat` will be capped at 1024 (128 * 8), which should make
> ATA_HORKAGE_MAX_SEC_1024 unnecessary.
> 
> So why has AHCI_MAX_SG been set to 168 anyway?

As written above, that probably makes the ahci command table size
nicely aligned.

Thanks.

-- 
tejun

Re: [PATCH v5 5/6] usb: chipidea: let chipidea core device of_node equal's glue layer device of_node

2016-08-09 Thread Peter Chen

On Tue, Aug 09, 2016 at 05:15:36PM -0700, Stephen Boyd wrote:
> Quoting Peter Chen (2016-08-08 01:52:10)
> > From: Peter Chen 
> > 
> > At device tree, we have no device node for chipidea core,
> > the glue layer's node is the parent node for host and udc
> > device. But in related driver, the parent device is chipidea
> > core. So, in order to let the common driver get parent's node,
> > we let the core's device node equals glue layer device node.
> > 
> > Signed-off-by: Peter Chen 
> > Tested-by: Maciej S. Szmigiero 
> > Tested-by Joshua Clayton 
> > ---
> >  drivers/usb/chipidea/core.c | 11 +++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/drivers/usb/chipidea/core.c b/drivers/usb/chipidea/core.c
> > index 69426e6..b189dc7 100644
> > --- a/drivers/usb/chipidea/core.c
> > +++ b/drivers/usb/chipidea/core.c
> > @@ -954,6 +954,15 @@ static int ci_hdrc_probe(struct platform_device *pdev)
> > dev_err(dev, "unable to init phy: %d\n", ret);
> > return ret;
> > }
> > +   /*
> > +* At device tree, we have no device node for chipidea core,
> > +* the glue layer's node is the parent node for host and udc
> > +* device. But in related driver, the parent device is chipidea
> > +* core. So, in order to let the common driver get parent's node,
> > +* we let the core's device node equals glue layer's node.
> > +*/
> > +   if (dev->parent && dev->parent->of_node)
> > +   dev->of_node = dev->parent->of_node;
> 
> Can this be done earlier? Perhaps after hw_device_init() in this probe
> routine? That would allow me to remove the awkward parent searching in
> my ULPI DT awareness patch.

I placed it there to avoid a "goto label" in the error path of the
PHY get and initialization operations.

OK, to simplify your work, I will change it in the next version.

-- 

Best Regards,
Peter Chen

Re: [PATCH v2] mm/slub: Run free_partial() outside of the kmem_cache_node->list_lock

2016-08-09 Thread Vladimir Davydov

On Tue, Aug 09, 2016 at 04:27:46PM +0100, Chris Wilson wrote:
...
> diff --git a/mm/slub.c b/mm/slub.c
> index 825ff45..58f0eb6 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3479,6 +3479,7 @@ static void list_slab_objects(struct kmem_cache *s, struct page *page,
>   */
>  static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
>  {
> + LIST_HEAD(partial_list);

nit: slabs added to this list are not partially used - they are free, so
let's call it 'free_slabs' or 'discard_list' or just 'discard', please

>   struct page *page, *h;
>  
>   BUG_ON(irqs_disabled());
> @@ -3486,13 +3487,16 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
>   list_for_each_entry_safe(page, h, &n->partial, lru) {
>   if (!page->inuse) {
>   remove_partial(n, page);
> - discard_slab(s, page);
> + list_add(&page->lru, &partial_list);

If there are objects left in the cache on destruction, the cache won't
be destroyed. Instead it will be left on the slab_list and can get
reused later. So we should use list_move() here to always leave
n->partial in a consistent state, even in case of a leak.
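
The pattern under discussion (unlink the free slabs while holding the lock, discard them after dropping it, and leave everything else on the original list) can be sketched in userspace C. This is an illustration only: struct slab, struct node, node_add() and free_unused() are hypothetical names, a pthread mutex stands in for list_lock, and a singly linked list stands in for the kernel list helpers.

```c
#include <pthread.h>
#include <stdlib.h>

struct slab {
	struct slab *next;
	int inuse;
};

struct node {
	pthread_mutex_t lock;
	struct slab *partial;	/* protected by lock */
};

/* Push a new slab onto n->partial. Returns 0 on success, -1 on failure. */
static int node_add(struct node *n, int inuse)
{
	struct slab *s = malloc(sizeof(*s));

	if (!s)
		return -1;
	s->inuse = inuse;
	pthread_mutex_lock(&n->lock);
	s->next = n->partial;
	n->partial = s;
	pthread_mutex_unlock(&n->lock);
	return 0;
}

/*
 * Move every unused slab from n->partial to a private list while
 * holding the lock, then free them after dropping it. Slabs still in
 * use stay on n->partial, so that list remains consistent.
 * Returns the number of slabs freed.
 */
static int free_unused(struct node *n)
{
	struct slab *discard = NULL, **pp, *s;
	int freed = 0;

	pthread_mutex_lock(&n->lock);
	pp = &n->partial;
	while ((s = *pp) != NULL) {
		if (!s->inuse) {
			*pp = s->next;		/* unlink from partial */
			s->next = discard;	/* push onto the private list */
			discard = s;
		} else {
			pp = &s->next;
		}
	}
	pthread_mutex_unlock(&n->lock);

	while ((s = discard) != NULL) {
		discard = s->next;
		free(s);	/* the potentially slow work runs unlocked */
		freed++;
	}
	return freed;
}
```

The private list is named discard here, in line with the naming nit above, since everything on it is about to be freed.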

>   } else {
>   list_slab_objects(s, page,
>   "Objects remaining in %s on __kmem_cache_shutdown()");
>   }
>   }
>   spin_unlock_irq(&n->list_lock);
> +
> + list_for_each_entry_safe(page, h, &partial_list, lru)
> + discard_slab(s, page);
>  }
>  
>  /*

Re: [PATCH V3 1/3] tracing: add a possibility of exporting function trace to other places instead of ring buffer only

2016-08-09 Thread Chunyan Zhang

On Tue, Aug 9, 2016 at 11:35 PM, Steven Rostedt  wrote:
> On Tue,  9 Aug 2016 14:32:39 +0800
> Chunyan Zhang  wrote:
>
>> Currently ring buffer is the only output of Function traces, this patch
>> added trace_export concept which would process the traces and export
>> traces to a registered destination which can be ring buffer or some other
>> storage, in this way if we want Function traces to be sent to other
>> destination rather than ring buffer only, we just need to register a new
>> trace_export and implement its own .commit() callback or just use
>> 'trace_generic_commit()' which this patch also added and hooks up its
>> own .write() function for writing traces to the storage.
>>
>> Currently, only Function trace (TRACE_FN) is supported.
>>
>> Signed-off-by: Chunyan Zhang 
>> ---
>>  include/linux/trace.h |  31 +
>>  kernel/trace/trace.c  | 124 +-
>>  kernel/trace/trace.h  |  31 +
>>  3 files changed, 185 insertions(+), 1 deletion(-)
>>  create mode 100644 include/linux/trace.h
>>
>> diff --git a/include/linux/trace.h b/include/linux/trace.h
>> new file mode 100644
>> index 000..bc7f503
>> --- /dev/null
>> +++ b/include/linux/trace.h
>> @@ -0,0 +1,31 @@
>> +#ifndef _LINUX_TRACE_H
>> +#define _LINUX_TRACE_H
>> +
>> +#include 
>> +struct trace_array;
>> +
>> +#ifdef CONFIG_TRACING
>> +/*
>> + * The trace export - an export of function traces.  Every ftrace_ops
>> + * has at least one export which would output function traces to ring
>> + * buffer.
>> + *
>> + * tr- the trace_array this export belongs to
>> + * commit- commit the traces to ring buffer and/or some other places
>> + * write - copy traces which have been dealt with ->commit() to
>> + * the destination
>> + */
>> +struct trace_export {
>> + char name[16];
>> + struct trace_export *next;
>
> Should document above name and next. What's name used for? Is it

Sure, I will document them in the next revision.

Speaking of the 'name' field here: I just think it will probably be useful
in the future, for example, if we need a userspace interface for
users to decide which trace_export should be used.

> visible to userspace? Add "next" just to be consistent as that's pretty
> obvious what it is for.
>
>> + struct trace_array  *tr;
>> + void (*commit)(struct trace_array *, struct ring_buffer_event *);
>> + void (*write)(const char *, unsigned int);
>> +};
>> +
>> +int register_trace_export(struct trace_export *export);
>> +int unregister_trace_export(struct trace_export *export);
>> +
>> +#endif   /* CONFIG_TRACING */
>> +
>> +#endif   /* _LINUX_TRACE_H */
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index dade4c9..67ae581 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -40,6 +40,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>
>>  #include "trace.h"
>> @@ -2128,6 +2129,127 @@ void trace_buffer_unlock_commit_regs(struct 
>> trace_array *tr,
>>   ftrace_trace_userstack(buffer, flags, pc);
>>  }
>>
>> +static inline void
>> +trace_generic_commit(struct trace_array *tr,
>> +struct ring_buffer_event *event)
>> +{
>> + struct trace_entry *entry;
>> + struct trace_export *export = tr->export;
>> + unsigned int size = 0;
>> +
>> + entry = ring_buffer_event_data(event);
>> +
>> + trace_entry_size(size, entry->type);
>> + if (!size)
>> + return;
>> +
>> + if (export->write)
>> + export->write((char *)entry, size);
>> +}
>> +
>> +static inline void
>> +trace_rb_commit(struct trace_array *tr,
>> +struct ring_buffer_event *event)
>> +{
>> + __buffer_unlock_commit(tr->trace_buffer.buffer, event);
>> +}
>> +
>> +static DEFINE_MUTEX(trace_export_lock);
>> +
>> +static struct trace_export trace_export_rb __read_mostly = {
>> + .name   = "rb",
>> + .commit = trace_rb_commit,
>> + .next   = NULL,
>> +};
>> +static struct trace_export *trace_fn_exports __read_mostly = 
>> &trace_export_rb;
>> +
>> +inline void
>> +trace_function_exports(struct trace_array *tr,
>> +struct ring_buffer_event *event)
>> +{
>> + struct trace_export *export;
>> +
>> + mutex_lock(&trace_export_lock);
>
> Wait! Are you calling a mutex from the function tracer? This will blow
> up easily. The function callbacks must be totally lockless!

Okay, I just wanted to protect the list from being changed while it is in use.
What do you think if I change it so that adding/removing trace exports
from the list is only permitted when tracing isn't enabled?
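For reference, the concern raised above is that the function-trace path cannot take a sleeping lock. Below is a minimal userspace sketch, using C11 atomics, of a list whose readers never lock. The names export_node, export_register and export_run_all are hypothetical and are not part of the patch; in-kernel code would use the kernel's own primitives, and safe removal (not shown) needs some form of grace period before a node can be reused.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct trace_export; not the kernel definition. */
struct export_node {
	const char *name;
	void (*commit)(int *counter);
	struct export_node *_Atomic next;
};

/* Head of the singly linked list; readers only ever load from it. */
static struct export_node *_Atomic export_head;

/* Writer side: lock-free push using compare-and-swap on the head. */
static void export_register(struct export_node *node)
{
	struct export_node *old;

	assert(node != NULL && node->commit != NULL);

	old = atomic_load_explicit(&export_head, memory_order_relaxed);
	do {
		/* Link the new node in front of the current head. */
		atomic_store_explicit(&node->next, old, memory_order_relaxed);
	} while (!atomic_compare_exchange_weak_explicit(&export_head, &old, node,
							memory_order_release,
							memory_order_relaxed));
}

/* Reader side: no locks, only acquire loads, so it never sleeps. */
static int export_run_all(int *counter)
{
	int visited = 0;
	struct export_node *n;

	n = atomic_load_explicit(&export_head, memory_order_acquire);
	while (n != NULL) {
		n->commit(counter);
		visited++;
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	}
	return visited;
}

/* Trivial commit callback used for illustration. */
static void count_commit(int *counter)
{
	(*counter)++;
}
```

The sketch only covers registration and traversal. Restricting add/remove to times when tracing is disabled, as proposed above, is one way to sidestep the removal problem entirely.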

>
>> +
>> + for (export = trace_fn_exports; export && export->commit;
>> +  export = export->next) {
>> + tr->export = export;
>> + export->commit(tr, event);
>> + }
>> +
>> + mutex_unlock(&trace_export_lock);
>> +}
>> +
>> +static void add_trace_fn_export(stru

Re: [PATCH v5 6/6] ARM: dts: imx6qdl-udoo.dtsi: fix onboard USB HUB property

2016-08-09 Thread Peter Chen

On Tue, Aug 09, 2016 at 04:33:35PM -0700, Joshua Clayton wrote:
> Hi Peter,
> 
> On 08/08/2016 01:52 AM, Peter Chen wrote:
> > The current dts describes USB HUB's property at USB controller's
> > entry, it is improper. The USB HUB should be the child node
> > under USB controller, and power sequence properties are under
> > it.
> >
> > Signed-off-by: Peter Chen 
> > ---
> >  arch/arm/boot/dts/imx6qdl-udoo.dtsi | 26 +-
> >  1 file changed, 13 insertions(+), 13 deletions(-)
> >
> > diff --git a/arch/arm/boot/dts/imx6qdl-udoo.dtsi 
> > b/arch/arm/boot/dts/imx6qdl-udoo.dtsi
> > index 3bee2f9..f29a72c2f 100644
> > --- a/arch/arm/boot/dts/imx6qdl-udoo.dtsi
> > +++ b/arch/arm/boot/dts/imx6qdl-udoo.dtsi
> > @@ -9,6 +9,8 @@
> >   *
> >   */
> >  
> > +#include 
> > +
> >  / {
> > aliases {
> > backlight = &backlight;
> > @@ -58,17 +60,6 @@
> > #address-cells = <1>;
> > #size-cells = <0>;
> >  
> > -   reg_usb_h1_vbus: regulator@0 {
> > -   compatible = "regulator-fixed";
> > -   reg = <0>;
> > -   regulator-name = "usb_h1_vbus";
> > -   regulator-min-microvolt = <500>;
> > -   regulator-max-microvolt = <500>;
> > -   enable-active-high;
> > -   startup-delay-us = <2>; /* USB2415 requires a POR of 1 
> > us minimum */
> > -   gpio = <&gpio7 12 0>;
> > -   };
> > -
> > reg_panel: regulator@1 {
> > compatible = "regulator-fixed";
> > reg = <1>;
> > @@ -259,9 +250,18 @@
> >  &usbh1 {
> > pinctrl-names = "default";
> > pinctrl-0 = <&pinctrl_usbh>;
> > -   vbus-supply = <&reg_usb_h1_vbus>;
> > -   clocks = <&clks IMX6QDL_CLK_CKO>;
> > status = "okay";
> > +
> > +   #address-cells = <1>;
> > +   #size-cells = <0>;
> Assuming they are needed,
> #address-cells and #size-cells should go in imx6qdl.dtsi,
> rather than in board dts files, shouldn't they?

Yes, you are right. All imx USB controllers have only one port.


> > +   usb2415: hub@1 {
> > +   compatible = "usb424,2514";
> > +   reg = <1>;
> Does the reg property have any effect?
> I couldn't find any reference to it in the patches.
> (so apologies if it is in core code)
> Does it matter?

Please see 69bec7259853 ("USB: core: let USB device know device node")
for details.

> Would it be possible to connect
> more than one hub to the same usb phy?

Not possible for current imx, but possible for other SoCs.

-- 

Best Regards,
Peter Chen

Re: [PATCH 1/1] Add CPU temperature sensor support for i.MX53

2016-08-09 Thread Tejun Heo

Hello, Fabien.

Is $SUBJ right?  CPU temperature?

On Wed, Aug 03, 2016 at 09:35:56AM +0200, Fabien Lahoudere wrote:
> From: Csaba Kertesz 
> 
> The original patch was made by Richard Zhu for kernel 2.6.x:
> 
> ENGR00134041-MX53-Add-the-SATA-AHCI-temperature-monitor.patch
> 
> The old source code was migrated to the new kernel 3.x. The concept of
> value reading was changed a bit:
> 
> 1. The new 3.x kernel functions (imx_phy_reg_read, imx_phy_reg_write) use
>16 bit registers while the original implementation used 32 bit integers
>for this purpose.

The description seems pretty dated.

> 2. The communication is guarded against infinite loop to give up a certain
>register reading after 10 attempts. This number comes from the
>original implementation. A new variable (read_attempt) is introduced to
>count the trials.

That's a very high number for a retry counter.
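For illustration, a bounded retry loop with a small, caller-chosen limit could look like the sketch below. This is userspace C with a fake device; read_with_retry, reg_read_fn and flaky_dev are hypothetical names and do not come from the patch or from the imx_phy_reg_* helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor type: returns 0 on success, negative errno on failure. */
typedef int (*reg_read_fn)(void *ctx, uint16_t *val);

/*
 * Try the read at most max_attempts times and give up with the last
 * error seen. Returns -EIO if max_attempts is 0.
 */
static int read_with_retry(reg_read_fn read, void *ctx,
			   unsigned int max_attempts, uint16_t *val)
{
	unsigned int attempt;
	int ret = -EIO;

	assert(read != NULL && val != NULL);

	for (attempt = 0; attempt < max_attempts; attempt++) {
		ret = read(ctx, val);
		if (ret == 0)
			return 0;
	}
	return ret;
}

/* Fake device that fails a fixed number of times before succeeding. */
struct flaky_dev {
	unsigned int failures_left;
	unsigned int calls;
};

static int flaky_read(void *ctx, uint16_t *val)
{
	struct flaky_dev *dev = ctx;

	dev->calls++;
	if (dev->failures_left > 0) {
		dev->failures_left--;
		return -EAGAIN;
	}
	*val = 0x1234;
	return 0;
}
```

Keeping the limit as a parameter makes it easy to justify the chosen value in one place instead of burying a large constant in the loop.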

> +/* SATA AHCI temperature monitor */
> +static ssize_t sata_ahci_current_tmp(struct device *dev, struct 
> device_attribute

Abbreviating temperature to tmp probably isn't the best idea.

Patch looks good to me except for the above nits; however, I wonder
whether it being a one-off sysfs attribute is the right approach.
Don't we have subsystems and interfaces for things like temperature
readings?

Thanks.

-- 
tejun

Re: c6x linker issue on linux-next-20160808 + some linker table work

2016-08-09 Thread Mark Salter

On Tue, 2016-08-09 at 19:09 -0700, Luis R. Rodriguez wrote:
> On Aug 9, 2016 6:50 PM, "Mark Salter"  wrote:
> >
> > On Tue, 2016-08-09 at 20:40 +0200, Luis R. Rodriguez wrote:
> > > On Tue, Aug 09, 2016 at 01:04:00PM -0400, Mark Salter wrote:
> > > >
> > > > On Tue, 2016-08-09 at 06:37 -0700, Guenter Roeck wrote:
> > > > >
> > > > > On 08/09/2016 01:11 AM, Luis R. Rodriguez wrote:
> > > > > >
> > > > > >
> > > > > > Mark, Aurelien,
> > > > > >
> > > > > > I've run into a linker (ld) issue caused by the linker table work 
> > > > > > I've
> > > > > > been working on [0]. I looked into this and for the life of me, I
> > > > > > cannot comprehend what the problem is, so was hoping you folks might
> > > > > > be able to chime in.
> > > > > >
> > > > > For reference, the error is
> > > > >
> > > > > c6x-elf-ld: drivers/built-in.o: SB-relative relocation but 
> > > > > __c6xabi_DSBT_BASE not defined
> > > > > c6x-elf-ld: drivers/built-in.o: SB-relative relocation but 
> > > > > __c6xabi_DSBT_BASE not defined
> > > > DSBT is a reference to the no-MMU userspace ABI used by c6x. The kernel 
> > > > shouldn't
> > > > be referencing DSBT base. The -mno-dsbt gcc flag should prevent it.
> > > I see -mno-dsbt on arch/c6x/Makefile already -- however at link time this 
> > > is
> > > an issue if linker tables are used it seems. Do you have any other 
> > > recommendation?
> > >
> > > I will note that it would seem that even i386 and x86-64 
> > > compiler/binutils seem
> > > to have relocation issues on older compiler/binutils, for instance:
> >
> > I see the problem with gcc 6 as well.
> >
> > So there appears to be some toolchain issues at play here. We build the 
> > kernel with two
> > c6x-specific options: -mno-dsbt and -msdata=none. I already mentioned dsbt. 
> > The sdata
> > option may be one of:
> >
> > -msdata=default
> >      Put small global and static data in the .neardata section, which is 
> > pointed to by
> >      register B14. Put small uninitialized global and static data in the 
> > .bss section,
> >      which is adjacent to the .neardata section. Put small read-only data 
> > into the 
> >      .rodata section. The corresponding sections used for large pieces of 
> > data are
> >      .fardata, .far and .const.
> >
> > -msdata=all
> >     Put all data, not just small objects, into the sections reserved for 
> > small data,
> >     and use addressing relative to the B14 register to access them.
> >
> > -msdata=none
> >     Make no use of the sections reserved for small data, and use absolute 
> > addresses
> >     to access all data. Put all initialized global and static data in the 
> > .fardata
> >     section, and all uninitialized data in the .far section. Put all 
> > constant data
> >     into the .const section.
> >
> >
> > Both small data and DSBT make use of base register + 15-bit offset to 
> > access data
> > and thus the SB-relative reloc in the above error message.
> >
> > I think that gcc sees the .rodata section from DEFINE_LINKTABLE_RO() for 
> > builtin_fw
> > and thinks it needs an SB-relative reloc. When the linker sees that reloc, 
> > it thinks
> > it needs the dsbt base register and thus the error. Interestingly, weak 
> > data is
> > never put in the small data section so if gcc sees that data is weak, it 
> > doesn't
> > check the section name to see if it is a small data section. So SB-relative 
> > only
> > gets used for builtin_fw__end, but not the weak builtin_fw even though they 
> > both
> > are in the .rodata section.
> >
> > I suspect gcc should avoid being fooled by .rodata if -msdata=none is used.
> > Regardless, I think this could all be avoided if the RO tables used .const
> > instead of .rodata for c6x.
> Thanks for the thorough analysis, would you be OK for c6x to use .const for 
> all read only linker tables or section ranges ?
> I had not added #ifndef around the core-sections.h main ELF definitions but 
> could add one as it's needed. In this case perhaps that is needed and fine by 
> you
> for SECTION_RODATA.
> We can also override any of the core section setter helpers for archs but in 
> this case based on what you say it seems this is needed. Unless of course just
> -msdata=none is fine and that's not yet used and you prefer that.
>   Luis

We're already using -msdata=none for kernel builds. From the gcc docs, one
would think all const data goes into .const with -msdata=none, but the kernel
forces a lot of weak const kallsyms data into .rodata, so the c6x vmlinux.lds
still needs to have a .rodata section. I think we need to use .const for the
c6x read-only linker tables and keep .rodata for RO_DATA_SECTION in
vmlinux.lds.h.

Re: [PATCH v6 1/3] PCI: Add Precision Time Measurement (PTM) support

2016-08-09 Thread Yong, Jonathan

On 06/14/2016 03:05, Bjorn Helgaas wrote:
> From: Jonathan Yong 
> 
> Add Precision Time Measurement (PTM) support (see PCIe r3.1, sec 6.22).
> 
> Enable PTM on PTM Root devices and switch ports.  This does not enable PTM
> on endpoints.
> 
> There currently are no PTM-capable devices on the market, but it is
> expected to be supported by the Intel Apollo Lake platform.
> 
> [bhelgaas: complete rework]
> Signed-off-by: Jonathan Yong 
> Signed-off-by: Bjorn Helgaas 

Hi,

Any updates on the PTM changes?

Re: Kernel modules under new copyleft licence : (was Re: [PATCH v2] module.h: add copyleft-next >= 0.3.1 as GPL compatible)

2016-08-09 Thread Linus Torvalds

On Tue, Aug 9, 2016 at 1:14 PM, Luis R. Rodriguez  wrote:
>
> I'm personally fine with MODULE_LICENSE("GPL") being used with copyleft-next 
> code
> and find it sensible.

I'd rather have the kernel license be as clear as possible, so I'd
tend to prefer that

  MODULE_LICENSE("GPL")

and then if you want to dual-license it, just put something like "or,
at your option, copyleft-next" in the comment at the top.

That makes it clear that as far as the kernel is concerned, it's
GPLv2, but if somebody finds it useful for other projects, they can
choose to take that file under copyleft-next (whatever version that
would be..).

  Linus

Re: [PATCH v6 2/4] mfd: lp873x: Add lp873x PMIC support

2016-08-09 Thread Keerthy




On Tuesday 09 August 2016 06:23 PM, Lee Jones wrote:

On Mon, 08 Aug 2016, Keerthy wrote:


The LP873X chip is a power management IC for Portable Navigation Systems
 and Tablet Computing devices. It contains the following components:

  - Regulators.
  - Configurable General Purpose Output Signals(GPO).

PMIC interacts with the main processor through i2c. PMIC has
couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC
Converter Cores) and GPOs(General Purpose Output Signals).

Signed-off-by: Keerthy 
---
Changes in v6:

   * Rebased on top of 
http://www.gossamer-threads.com/lists/linux/kernel/2457552.
   * Hence added probe_new instead of probe and removed unused i2c_device_id.


No, please don't do that.  This patch-set is blocked.

Please rebase on top of a mainline release and re-send.

We still need the I2C table, for now.


Okay. I guess I did some testing for that series then :-P.
I will revert only that and send patches [2-4].

Regards,
Keerthy



Changes in v4:

   * Added Author.
   * Added the mfd_cell for gpio.

Changes in v3:

   * Reordered the probe code.
   * Fixed Typo in Kconfig description.
   * Removed unused member from struct lp873x.

  drivers/mfd/Kconfig|  14 +++
  drivers/mfd/Makefile   |   2 +
  drivers/mfd/lp873x.c   |  89 +++
  include/linux/mfd/lp873x.h | 264 +
  4 files changed, 369 insertions(+)
  create mode 100644 drivers/mfd/lp873x.c
  create mode 100644 include/linux/mfd/lp873x.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 2d1fb64..45fe00a 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -1224,6 +1224,20 @@ config MFD_TPS65217
  This driver can also be built as a module.  If so, the module
  will be called tps65217.

+config MFD_TI_LP873X
+   tristate "TI LP873X Power Management IC"
+   depends on I2C
+   select MFD_CORE
+   select REGMAP_I2C
+   help
+ If you say yes here then you get support for the LP873X series of
+ Power Management Integrated Circuits(PMIC).
+ These include voltage regulators, Thermal protection, Configurable
+ General Purpose Outputs(GPO) that are used in portable devices.
+
+ This driver can also be built as a module. If so, the module
+ will be called lp873x.
+
  config MFD_TPS65218
tristate "TI TPS65218 Power Management chips"
depends on I2C
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 2ba3ba3..42acbcd 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -22,6 +22,8 @@ obj-$(CONFIG_HTC_EGPIO)   += htc-egpio.o
  obj-$(CONFIG_HTC_PASIC3)  += htc-pasic3.o
  obj-$(CONFIG_HTC_I2CPLD)  += htc-i2cpld.o

+obj-$(CONFIG_MFD_TI_LP873X)+= lp873x.o
+
  obj-$(CONFIG_MFD_DAVINCI_VOICECODEC)  += davinci_voicecodec.o
  obj-$(CONFIG_MFD_DM355EVM_MSP)+= dm355evm_msp.o
  obj-$(CONFIG_MFD_TI_AM335X_TSCADC)+= ti_am335x_tscadc.o
diff --git a/drivers/mfd/lp873x.c b/drivers/mfd/lp873x.c
new file mode 100644
index 000..3c8e8d0
--- /dev/null
+++ b/drivers/mfd/lp873x.c
@@ -0,0 +1,89 @@
+/*
+ * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * Author: Keerthy 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+static const struct regmap_config lp873x_regmap_config = {
+   .reg_bits = 8,
+   .val_bits = 8,
+   .max_register = LP873X_REG_MAX,
+};
+
+static const struct mfd_cell lp873x_cells[] = {
+   { .name = "lp873x-regulator", },
+   { .name = "lp873x-gpio", },
+};
+
+static int lp873x_probe(struct i2c_client *client)
+{
+   struct lp873x *lp873;
+   int ret;
+   unsigned int otpid;
+
+   lp873 = devm_kzalloc(&client->dev, sizeof(*lp873), GFP_KERNEL);
+   if (!lp873)
+   return -ENOMEM;
+
+   lp873->dev = &client->dev;
+
+   lp873->regmap = devm_regmap_init_i2c(client, &lp873x_regmap_config);
+   if (IS_ERR(lp873->regmap)) {
+   ret = PTR_ERR(lp873->regmap);
+   dev_err(lp873->dev,
+   "Failed to initialize register map: %d\n", ret);
+   return ret;
+   }
+
+   mutex_init(&lp873->lock);
+
+   ret = regmap_read(lp873->regmap, LP873X_REG_OTP_REV, &otpid);
+   if (ret) {
+   dev_err(lp873->dev, "Failed to read OTP ID\n");
+   return ret;
+   }
+
+   lp873->rev = otpid & LP873X_OTP_REV_OTP_ID;
+   i2c_set_clientda

Re: [PATCH v6 1/4] Documentation: mfd: LP873X: Add information for the mfd driver

2016-08-09 Thread Keerthy




On Tuesday 09 August 2016 06:20 PM, Lee Jones wrote:

On Mon, 08 Aug 2016, Keerthy wrote:


The lp873x series of PMICs have a bunch of regulators and a couple
of GPO(General Purpose Outputs).
Add information for the MFD and regulator drivers.

Acked-by: Rob Herring 
Signed-off-by: Keerthy 


These should be in chronological order.  Rob could not have Acked the
patch before you sent it.


---
Changes in v6:

   * Added more formating for properties.

Changes in v4:

   * Added the GPIO properties.

Changes in v3:

   * Changed the example node lable to pmic from lp8733.

  Documentation/devicetree/bindings/mfd/lp873x.txt | 59 
  1 file changed, 59 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/mfd/lp873x.txt


Patch looks good though.

I'll fix the nit above.


Okay thanks.



Applied, thanks.


diff --git a/Documentation/devicetree/bindings/mfd/lp873x.txt 
b/Documentation/devicetree/bindings/mfd/lp873x.txt
new file mode 100644
index 000..1377c25
--- /dev/null
+++ b/Documentation/devicetree/bindings/mfd/lp873x.txt
@@ -0,0 +1,59 @@
+TI LP873X PMIC MFD driver
+
+Required properties:
+  - compatible:"ti,lp8732", "ti,lp8733"
+  - reg:   I2C slave address.
+  - gpio-controller:   Marks the device node as a GPIO Controller.
+  - #gpio-cells:   Should be two.  The first cell is the pin number and
+   the second cell is used to specify flags.
+   See ../gpio/gpio.txt for more information.
+  - regulators:List of child nodes that specify the regulator
+   initialization data.
+Example:
+
+pmic: lp8733@60 {
+   compatible = "ti,lp8733";
+   reg = <0x60>;
+   gpio-controller;
+   #gpio-cells = <2>;
+
+   regulators {
+   lp8733_buck0: buck0 {
+   regulator-name = "lp8733-buck0";
+   regulator-min-microvolt = <80>;
+   regulator-max-microvolt = <140>;
+   regulator-min-microamp = <150>;
+   regulator-max-microamp = <400>;
+   regulator-ramp-delay = <1>;
+   regulator-always-on;
+   regulator-boot-on;
+   };
+
+   lp8733_buck1: buck1 {
+   regulator-name = "lp8733-buck1";
+   regulator-min-microvolt = <80>;
+   regulator-max-microvolt = <140>;
+   regulator-min-microamp = <150>;
+   regulator-max-microamp = <400>;
+   regulator-ramp-delay = <1>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+
+   lp8733_ldo0: ldo0 {
+   regulator-name = "lp8733-ldo0";
+   regulator-min-microvolt = <80>;
+   regulator-max-microvolt = <300>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+
+   lp8733_ldo1: ldo1 {
+   regulator-name = "lp8733-ldo1";
+   regulator-min-microvolt = <80>;
+   regulator-max-microvolt = <300>;
+   regulator-always-on;
+   regulator-boot-on;
+   };
+   };
+};

Re: [PATCH][RESEND] thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs

2016-08-09 Thread Kuninori Morimoto


Hi Zhang

> > Hi Linux-PM, Linux-Kernel ML
> > 
> > I posted thermal driver patch 2month ago, but no response and nothing
> > happen.
> > I'm following scripts/get_maintainer.pl, but am I wrong ??
> > Who is the maintainer of these patches ??
> > 
> The patch is queued for 4.8-rc2.
> As you can see at 
> https://git.kernel.org/cgit/linux/kernel/git/rzhang/linux.git/log/?h=next

Oh, thanks.
But I think it doesn't include the [1/2] patch. Was it rejected?

> > > > Kuninori Morimoto (2):
> > > >   thermal: rcar-thermal: enable hwmon when thermal_zone
> > > >   thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs


Best regards
---
Kuninori Morimoto

RE: [PATCH][RESEND] thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs

2016-08-09 Thread Zhang, Rui



> -Original Message-
> From: Kuninori Morimoto [mailto:kuninori.morimoto...@renesas.com]
> Sent: Wednesday, August 10, 2016 10:10 AM
> To: Kuninori Morimoto 
> Cc: Zhang, Rui ; edubez...@gmail.com; Geert
> Uytterhoeven ; linux-kernel@vger.kernel.org; linux-
> renesas-...@vger.kernel.org; linux...@vger.kernel.org;
> yoshihiro.shimoda...@renesas.com; cm-h...@jinso.co.jp; PhucBui  p...@jinso.co.jp>
> Subject: Re: [PATCH][RESEND] thermal: hwmon: EXPORT_SYMBOL_GPL for
> thermal hwmon sysfs
> Importance: High
> 
> 
> Hi Linux-PM, Linux-Kernel ML
> 
> I posted thermal driver patch 2month ago, but no response and nothing
> happen.
> I'm following scripts/get_maintainer.pl, but am I wrong ??
> Who is the maintainer of these patches ??
> 
The patch is queued for 4.8-rc2.
As you can see at 
https://git.kernel.org/cgit/linux/kernel/git/rzhang/linux.git/log/?h=next

Thanks,
Rui

> > Hi Zhang
> >
> > ping ??
> >
> > > These are resend patches for rcar-thermal hwmon.
> > >
> > > Kuninori Morimoto (2):
> > >   thermal: rcar-thermal: enable hwmon when thermal_zone
> > >   thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs
> > >
> > >  drivers/thermal/rcar_thermal.c  | 20 ++--
> > > drivers/thermal/thermal_hwmon.c |  2 ++
> > >  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> 
> Best regards
> ---
> Kuninori Morimoto

Re: [PATCH] mm: optimize find_zone_movable_pfns_for_nodes to avoid unnecessary loop.

2016-08-09 Thread zhong jiang

On 2016/8/10 7:29, Andrew Morton wrote:
> On Fri, 5 Aug 2016 22:04:07 +0800 zhongjiang  wrote:
>
>> when required_kernelcore decrease to zero, we should exit the loop in time.
>> because It will waste time to scan the remainder node.
> The patch is rather ugly and it only affects __init code, so the only
> benefit will be to boot time.
   yes
> Do we have any timing measurements which would justify changing this code?
  I am sorry for that.  That is only a theoretical analysis.
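To make the idea under discussion concrete, here is a simplified sketch of the early-exit pattern: stop walking the nodes once the remaining requirement reaches zero. It is not the actual logic of find_zone_movable_pfns_for_nodes(), which spreads kernelcore across nodes differently; spread_kernelcore and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Take pages from each node in turn until 'required' is satisfied.
 * Returns how many nodes were actually visited and stores the unmet
 * requirement (0 if fully satisfied) in *remaining.
 */
static size_t spread_kernelcore(const unsigned long *node_pages,
				size_t nr_nodes, unsigned long required,
				unsigned long *remaining)
{
	size_t visited = 0;

	assert(node_pages != NULL && remaining != NULL);

	for (size_t i = 0; i < nr_nodes; i++) {
		/* Nothing left to place: skip the remaining nodes. */
		if (required == 0)
			break;

		unsigned long take = node_pages[i] < required ?
				     node_pages[i] : required;
		required -= take;
		visited++;
	}

	*remaining = required;
	return visited;
}
```

Whether the saved iterations matter in practice is exactly the open question above; the sketch only shows the shape of the change, not its benefit.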
>

[PATCH v4] arm64: mm: convert __dma_* routines to use start, size

2016-08-09 Thread Kwangwoo Lee

__dma_* routines have been converted to use start and size instead of
start and end addresses. The patch was originally for adding
__clean_dcache_area_poc() which will be used in pmem driver to clean
dcache to the PoC(Point of Coherency) in arch_wb_cache_pmem().

The functionality of __clean_dcache_area_poc()  was equivalent to
__dma_clean_range(). The difference was __dma_clean_range() uses the end
address, but __clean_dcache_area_poc() uses the size to clean.

Thus, __clean_dcache_area_poc() has been revised with a fallthrough
function of __dma_clean_range() after the change that __dma_* routines
use start and size instead of using start and end.

As a consequence of using start and size, the name of __dma_* routines
has also been altered following the terminology below:
area: takes a start and size
range: takes a start and end
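
As a rough C illustration of the area/range terminology (a sketch only; the real routines are arm64 assembly, and struct range, area_to_range and lines_in_range are hypothetical names), converting an area to a range is a single addition, after which a by-line walk starts from the line-aligned start address:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* range: takes a start and end (end is exclusive). */
struct range {
	uintptr_t start;
	uintptr_t end;
};

/* area: takes a start and size; converting is just end = start + size. */
static struct range area_to_range(uintptr_t start, size_t size)
{
	struct range r = { start, start + size };

	assert(r.end >= start);	/* reject address wrap-around */
	return r;
}

/* Count the cache lines a by-line maintenance loop would touch. */
static size_t lines_in_range(struct range r, size_t line_size)
{
	/* line_size must be a non-zero power of two. */
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

	uintptr_t addr = r.start & ~(uintptr_t)(line_size - 1);
	size_t n = 0;

	for (; addr < r.end; addr += line_size)
		n++;
	return n;
}
```

For example, a 64-byte area starting at 0x1004 straddles two 64-byte lines, while the same size starting at 0x1000 covers exactly one.
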

Cc: Russell King - ARM Linux 
Cc: Will Deacon 
Cc: Mark Rutland 
Reviewed-by: Robin Murphy 
Signed-off-by: Kwangwoo Lee 
---
v4)
add Reviewed-by: and Cc: lines

v3)
clean up in __dma_clean_area() based on 823066d9edcd dcache_by_line_op fix

v2)
change the names of __dma_* routines to use area instead of range
fix to use __dma_flush_area() in dma-mapping.c

v1)
change __dma_* routines to use start, size
add __clean_dcache_area_poc() as a fallthrough of __dma_clean_range()

 arch/arm64/include/asm/cacheflush.h |  3 +-
 arch/arm64/mm/cache.S   | 82 ++---
 arch/arm64/mm/dma-mapping.c |  6 +--
 3 files changed, 44 insertions(+), 47 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h 
b/arch/arm64/include/asm/cacheflush.h
index c64268d..2e5fb97 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -68,6 +68,7 @@
 extern void flush_cache_range(struct vm_area_struct *vma, unsigned long start, 
unsigned long end);
 extern void flush_icache_range(unsigned long start, unsigned long end);
 extern void __flush_dcache_area(void *addr, size_t len);
+extern void __clean_dcache_area_poc(void *addr, size_t len);
 extern void __clean_dcache_area_pou(void *addr, size_t len);
 extern long __flush_cache_user_range(unsigned long start, unsigned long end);
 
@@ -85,7 +86,7 @@ static inline void flush_cache_page(struct vm_area_struct 
*vma,
  */
 extern void __dma_map_area(const void *, size_t, int);
 extern void __dma_unmap_area(const void *, size_t, int);
-extern void __dma_flush_range(const void *, const void *);
+extern void __dma_flush_area(const void *, size_t);
 
 /*
  * Copy user data from/to a page which is mapped into a different
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 07d7352..58b5a90 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -105,19 +105,20 @@ ENTRY(__clean_dcache_area_pou)
 ENDPROC(__clean_dcache_area_pou)
 
 /*
- * __inval_cache_range(start, end)
- * - start   - start address of region
- * - end - end address of region
+ * __dma_inv_area(start, size)
+ * - start   - virtual start address of region
+ * - size- size in question
  */
-ENTRY(__inval_cache_range)
+__dma_inv_area:
+   add x1, x1, x0
/* FALLTHROUGH */
 
 /*
- * __dma_inv_range(start, end)
- * - start   - virtual start address of region
- * - end - virtual end address of region
+ * __inval_cache_range(start, end)
+ * - start   - start address of region
+ * - end - end address of region
  */
-__dma_inv_range:
+ENTRY(__inval_cache_range)
dcache_line_size x2, x3
sub x3, x2, #1
tst x1, x3  // end cache line aligned?
@@ -136,46 +137,43 @@ __dma_inv_range:
dsb sy
ret
 ENDPIPROC(__inval_cache_range)
-ENDPROC(__dma_inv_range)
+ENDPROC(__dma_inv_area)
+
+/*
+ * __clean_dcache_area_poc(kaddr, size)
+ *
+ * Ensure that any D-cache lines for the interval [kaddr, kaddr+size)
+ * are cleaned to the PoC.
+ *
+ * - kaddr   - kernel address
+ * - size- size in question
+ */
+ENTRY(__clean_dcache_area_poc)
+   /* FALLTHROUGH */
 
 /*
- * __dma_clean_range(start, end)
+ * __dma_clean_area(start, size)
  * - start   - virtual start address of region
- * - end - virtual end address of region
+ * - size- size in question
  */
-__dma_clean_range:
-   dcache_line_size x2, x3
-   sub x3, x2, #1
-   bic x0, x0, x3
-1:
-alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
-   dc  cvac, x0
-alternative_else
-   dc  civac, x0
-alternative_endif
-   add x0, x0, x2
-   cmp x0, x1
-   b.lo1b
-   dsb sy
+__dma_clean_area:
+   dcache_by_line_op cvac, sy, x0, x1, x2, x3
ret
-ENDPROC(__dma_clean_range)
+ENDPIPROC(__clean_dcache_area_poc)
+ENDPROC(__dma_clean_area)
 
 /*
- * __dma_flush_range(start, end)
+ * __dma_flush_area(start, size)
+ *
+ * clean & invalidate D / U line
+ *
  * - start   - virtual start add

Re: [PATCH] usb: core: Add runtime resume checking

2016-08-09 Thread Baolin Wang

Hi Greg,

On 9 August 2016 at 18:26, Greg KH  wrote:
> On Tue, Aug 09, 2016 at 05:33:33PM +0800, Baolin Wang wrote:
>> When the usb device has entered suspend state by runtime suspend method, and
>> the system also try to enter suspend state by issuing usb_dev_suspend(), it
>> will issue pm_runtime_resume() function to deal with wrong wakeup setting in
>> choose_wakeup() function.
>>
>> But if usb device resumes failed due to xhci has been into suspend state and
>> hardware is not accessible, which will set runtime errors. Thus when there is
>> slave attached, usb device will resume failed by runtime resume method due to
>> previous runtime errors.
>
> I really can't parse the first sentence in this paragraph, what exactly
> makes xhci so "unique" here?

Sorry for the confusion; let me try to explain it more clearly. For strict
power management on mobile devices, we should also power off the usb
controller when no slaves are attached, even when it is acting as a usb
host.

For example: no slave attached -> usb interface runtime suspend
-> usb device runtime suspend -> xhci suspend -> power off
usb controller. After that, if the system wants to enter suspend state,
it will also issue usb_dev_suspend(), and the pm_runtime_resume() call
(issued in choose_wakeup()) will return -ESHUTDOWN because xhci has been
suspended and the hardware is not accessible.

After the system resumes, if a slave is attached -> power on usb
controller -> xhci resume -> usb device runtime resume -> usb interface
runtime resume. The usb device will fail to resume if the runtime error
(-ESHUTDOWN) is still set, so we should clear the runtime error in
choose_wakeup() to avoid this situation.
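
The sequence can be modelled with a small userspace toy, shown below. It is only a sketch of the behaviour described in this mail (a failed resume leaves a sticky error that blocks later resumes until it is cleared); struct toy_dev, toy_runtime_resume and toy_set_suspended are hypothetical names and not the kernel's runtime PM implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_rpm_state { TOY_SUSPENDED, TOY_ACTIVE };

/* Toy device: only the fields needed to show the sticky-error effect. */
struct toy_dev {
	enum toy_rpm_state state;
	int runtime_error;	/* sticky error from the last failed resume */
	bool hw_accessible;	/* false while the controller is powered off */
};

/* Model of a runtime resume request. */
static int toy_runtime_resume(struct toy_dev *dev)
{
	assert(dev != NULL);

	/* A previously recorded error blocks any further attempt. */
	if (dev->runtime_error)
		return -EINVAL;
	if (dev->state == TOY_ACTIVE)
		return 1;	/* already active, nothing to do */
	if (!dev->hw_accessible) {
		dev->runtime_error = -ESHUTDOWN;
		return -ESHUTDOWN;
	}
	dev->state = TOY_ACTIVE;
	return 0;
}

/* Model of forcing the status back to suspended, which clears the error. */
static void toy_set_suspended(struct toy_dev *dev)
{
	assert(dev != NULL);

	dev->runtime_error = 0;
	dev->state = TOY_SUSPENDED;
}
```

In this model, the resume attempted while the controller is off fails with -ESHUTDOWN, the next attempt fails even though the hardware is back, and only after the error is cleared does resume succeed.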

>
>> Then we should check if it resumes successfully in choose_wakeup() function,
>
> what is "it"?

"It" refers to the pm_runtime_resume() call issued in choose_wakeup().

>
>> if it failed we should clear the runtime errors by pm_runtime_set_suspended()
>> function to avoid runtime resume failure.
>
> Again, what is "it"?

"It" refers to the pm_runtime_resume() call issued in choose_wakeup().

>
>>
>> Signed-off-by: Baolin Wang 
>> ---
>>  drivers/usb/core/driver.c |9 +++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
>> index dadd1e8d..a1a0f5f 100644
>> --- a/drivers/usb/core/driver.c
>> +++ b/drivers/usb/core/driver.c
>> @@ -1412,6 +1412,7 @@ static int usb_resume_both(struct usb_device *udev, 
>> pm_message_t msg)
>>  static void choose_wakeup(struct usb_device *udev, pm_message_t msg)
>>  {
>>   int w;
>> + int ret;
>>
>>   /* Remote wakeup is needed only when we actually go to sleep.
>>* For things like FREEZE and QUIESCE, if the device is already
>> @@ -1431,8 +1432,12 @@ static void choose_wakeup(struct usb_device *udev, 
>> pm_message_t msg)
>>   /* If the device is autosuspended with the wrong wakeup setting,
>>* autoresume now so the setting can be changed.
>>*/
>> - if (udev->state == USB_STATE_SUSPENDED && w != udev->do_remote_wakeup)
>> - pm_runtime_resume(&udev->dev);
>> + if (udev->state == USB_STATE_SUSPENDED && w != udev->do_remote_wakeup) 
>> {
>> + ret = pm_runtime_resume(&udev->dev);
>> + if (ret == -ESHUTDOWN)
>> + pm_runtime_set_suspended(&udev->dev);
>
> why is 'ret' needed:
> if (pm_runtime_resume(&udev->dev) == -ESHUTDOWN)

OK. I can modify it in the next version if you agree with this patch.

>
>
> Why would resume fail?

Like I explained above. Thanks.

-- 
Baolin.wang
Best Regards

RE: [PATCH v3] arm64: mm: convert __dma_* routines to use start, size

2016-08-09 Thread kwangwoo....@sk.com

Hi Robin,

> -Original Message-
> From: Robin Murphy [mailto:robin.mur...@arm.com]
> Sent: Tuesday, August 09, 2016 8:51 PM
> To: 이광우(LEE KWANGWOO) MS SW; Russell King - ARM Linux; Catalin Marinas; Will 
> Deacon; Mark Rutland;
> linux-arm-ker...@lists.infradead.org
> Cc: 정우석(CHUNG WOO SUK) MS SW; 김현철(KIM HYUNCHUL) MS SW; 
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] arm64: mm: convert __dma_* routines to use start, size
> 
> On 02/08/16 01:50, Kwangwoo Lee wrote:
> > __dma_* routines have been converted to use start and size instead of
> > start and end addresses. The patch was originally for adding
> > __clean_dcache_area_poc() which will be used in pmem driver to clean
> > dcache to the PoC(Point of Coherency) in arch_wb_cache_pmem().
> >
> > The functionality of __clean_dcache_area_poc()  was equivalent to
> > __dma_clean_range(). The difference was __dma_clean_range() uses the end
> > address, but __clean_dcache_area_poc() uses the size to clean.
> >
> > Thus, __clean_dcache_area_poc() has been revised with a fallthrough
> > function of __dma_clean_range() after the change that __dma_* routines
> > use start and size instead of using start and end.
> >
> > As a consequence of using start and size, the name of __dma_* routines
> > has also been altered following the terminology below:
> > area: takes a start and size
> > range: takes a start and end
> 
> This looks pretty nice and tidy now; I don't see anything obviously
> wrong, and comparing before-and-after disassemblies shows essentially
> nothing more than the movement of some add instructions as expected, so:
> 
> Reviewed-by: Robin Murphy 

Thank you very much for your review!
I'm going to add Reviewed-by: and Cc: lines and send it again. Thanks!

Best Regards,
Kwangwoo Lee

Re: [PATCH v5 03/14] arm64/numa: add nid check for memory block

2016-08-09 Thread Hanjun Guo

On 2016/8/8 17:18, Zhen Lei wrote:
> Use the same tactic to cpu and numa-distance nodes.
>
> Signed-off-by: Zhen Lei 
> ---
>  arch/arm64/mm/numa.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> index c7fe3ec..2601660 100644
> --- a/arch/arm64/mm/numa.c
> +++ b/arch/arm64/mm/numa.c
> @@ -141,6 +141,11 @@ int __init numa_add_memblk(int nid, u64 start, u64 end)
>  {
>   int ret;
>
> + if (nid >= MAX_NUMNODES) {
> + pr_warn("NUMA: Node id %u exceeds maximum value\n", nid);
> + return -EINVAL;
> + }

I think this check should be added to of_numa_parse_memory_nodes(), which runs
before numa_add_memblk() is called. That is the same logic as in
of_numa_parse_cpu_nodes(), and in the ACPI case the node id is also checked
before calling numa_add_memblk().
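
As a rough illustration of that suggestion (not code from the series), the
userspace sketch below validates the node id in the parser and leaves the
add-memblk step to trust its caller. All model_* names and the value of
MODEL_MAX_NUMNODES are hypothetical stand-ins; in the kernel, MAX_NUMNODES is
derived from CONFIG_NODES_SHIFT.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical limit; the kernel derives MAX_NUMNODES from CONFIG_NODES_SHIFT. */
#define MODEL_MAX_NUMNODES 4

/* Stand-in for numa_add_memblk(): relies on the caller having validated nid. */
static int model_add_memblk(int nid, unsigned long long start,
			    unsigned long long end)
{
	assert(nid >= 0 && nid < MODEL_MAX_NUMNODES);
	printf("node %d: [0x%llx, 0x%llx)\n", nid, start, end);
	return 0;
}

/* Stand-in for the memory-node parser: rejects an out-of-range nid up front. */
static int model_parse_memory_node(unsigned int nid, unsigned long long base,
				   unsigned long long size)
{
	if (nid >= MODEL_MAX_NUMNODES) {
		fprintf(stderr, "NUMA: Node id %u exceeds maximum value\n", nid);
		return -EINVAL;
	}
	return model_add_memblk((int)nid, base, base + size);
}
```

With the check in the parser, an invalid node id is rejected before any memory
block is registered, which mirrors what the cpu-node parser is described as
doing above.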

Thanks
Hanjun

Re: [PATCH][RESEND] thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs

2016-08-09 Thread Kuninori Morimoto


Hi Linux-PM, Linux-Kernel ML

I posted these thermal driver patches 2 months ago, but there has been no
response and nothing has happened.
I'm following scripts/get_maintainer.pl; am I doing something wrong?
Who is the maintainer for these patches?

> Hi Zhang
> 
> ping ??
> 
> > These are resend patches for rcar-thermal hwmon.
> > 
> > Kuninori Morimoto (2):
> >   thermal: rcar-thermal: enable hwmon when thermal_zone
> >   thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs
> > 
> >  drivers/thermal/rcar_thermal.c  | 20 ++--
> >  drivers/thermal/thermal_hwmon.c |  2 ++
> >  2 files changed, 20 insertions(+), 2 deletions(-)


Best regards
---
Kuninori Morimoto

Re: [PATCH v5 00/14] fix some type infos and bugs for arm64/of numa

2016-08-09 Thread Hanjun Guo

Hi Zhen,

On 2016/8/8 17:18, Zhen Lei wrote:
> v4 -> v5:
> This version has no code changes, just add "Acked-by: Rob Herring 
> "
> into patches 1, 2, 4, 6, 7, 13, 14. Because these patches rely on some acpi 
> numa
> patches, and the latter had not been upstreamed in 4.7, but upstreamed in 
> 4.8-rc1,
> so I resend my patches again.

I think we need to mention that this patch set is rebased on top of 4.8-rc1, and

 - patches 1~5 are fixes which are targeted at 4.8.

 - patches 6~14 are cleanups and new features, which are targeted at 4.9. In detail,
   - patches 6~8 are cleanups
   - patches 9~14 are new features adding per-cpu area and memoryless node support.

Catalin, do you think Zhen needs to separate this patch set into two and then
resend?

Thanks
Hanjun

Re: [PATCH] rcu: Fix soft lockup for rcu_nocb_kthread

2016-08-09 Thread Paul E. McKenney

On Wed, Aug 10, 2016 at 09:13:14AM +0800, Ding Tianhong wrote:
> On 2016/6/16 22:19, Paul E. McKenney wrote:
> > On Thu, Jun 16, 2016 at 02:09:47PM +0800, Ding Tianhong wrote:
> >> On 2016/6/15 23:49, Paul E. McKenney wrote:
> >>> On Wed, Jun 15, 2016 at 03:27:36PM +0800, Ding Tianhong wrote:
> >>>> I met this problem when using the Testgine to send package to ixgbevf nic
> >>>> by this steps:
> >>>> 1. Connect to ixgbevf, and set the speed to 10Gb/s, it could work fine.
> >>>> 2. Then use ifconfig to down the nic and up again, loop for several times.
> >>>> 3. The system panic by soft lockup.
> >>>
> >>> Good catch, queued for review and testing.  But what .config was your
> >>> kernel built with?
> >>>
> >>
> >> I use the redhat7.1 defconfig to build my kernel, and the RCU config is 
> >> this:
> >>  120 #
> >>  121 # RCU Subsystem
> >>  122 #
> >>  123 CONFIG_TREE_RCU=y
> >>  124 # CONFIG_PREEMPT_RCU is not set
> >>  125 CONFIG_RCU_STALL_COMMON=y
> >>  126 CONFIG_CONTEXT_TRACKING=y
> >>  127 CONFIG_RCU_USER_QS=y
> >>  128 # CONFIG_CONTEXT_TRACKING_FORCE is not set
> >>  129 CONFIG_RCU_FANOUT=64
> >>  130 CONFIG_RCU_FANOUT_LEAF=16
> >>  131 # CONFIG_RCU_FANOUT_EXACT is not set
> >>  132 # CONFIG_RCU_FAST_NO_HZ is not set
> >>  133 # CONFIG_TREE_RCU_TRACE is not set
> >>  134 CONFIG_RCU_NOCB_CPU=y
> >>  135 CONFIG_RCU_NOCB_CPU_ALL=y
> >>  136 CONFIG_BUILD_BIN2C=y
> > 
> > Thank you!  You were running with preemption disabled, so your system
> > would indeed be very susceptible to this problem.
> > 
> >>> Also, I did tweak both the commit log and the patch.  Your cond_resched()
> >>> would prevent soft lockups, but not RCU stalls, so I substituted
> >>> cond_resched_rcu_qs().  Please let me know if either of those changes
> >>> causes problems at your end.
> >>
> >> Looks fine to me, I will apply this to my branch and test it, thanks.
> > 
> > Please let me know how it goes!
> > 
> > Thanx, Paul
> > 
> 
> Hi Paul:
> 
> It has been a long time since I applied this patch, and I haven't found any
> problems. I believe this patch is fine, thanks.

Very good!  I will push this one upstream during the next merge window.

Thanx, Paul

> Ding
> 
> >> Ding
> >>
> >>>
> >>>   Thanx, Paul
> >>>
> >>> 
> >>>
> >>> commit c317cf19b34c0d2787b787c38bd2c8fe433215da
> >>> Author: Ding Tianhong 
> >>> Date:   Wed Jun 15 15:27:36 2016 +0800
> >>>
> >>> rcu: Fix soft lockup for rcu_nocb_kthread
> >>> 
> >>> Carrying out the following steps results in a softlockup in the
> >>> RCU callback-offload (rcuo) kthreads:
> >>> 
> >>> 1. Connect to ixgbevf, and set the speed to 10Gb/s.
> >>> 2. Use ifconfig to bring the nic up and down repeatedly.
> >>> 
> >>> [  317.005148] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
> >>> [  368.106005] BUG: soft lockup - CPU#1 stuck for 22s! [rcuos/1:15]
> >>> [  368.106005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> >>> [  368.106005] task: 88057dd8a220 ti: 88057dd9c000 task.ti: 
> >>> 88057dd9c000
> >>> [  368.106005] RIP: 0010:[]  [] 
> >>> fib_table_lookup+0x14/0x390
> >>> [  368.106005] RSP: 0018:88061fc83ce8  EFLAGS: 0286
> >>> [  368.106005] RAX: 0001 RBX: 020155c0 RCX: 
> >>> 0001
> >>> [  368.106005] RDX: 88061fc83d50 RSI: 88061fc83d70 RDI: 
> >>> 880036d11a00
> >>> [  368.106005] RBP: 88061fc83d08 R08: 0001 R09: 
> >>> 
> >>> [  368.106005] R10: 880036d11a00 R11: 819e0900 R12: 
> >>> 88061fc83c58
> >>> [  368.106005] R13: 816154dd R14: 88061fc83d08 R15: 
> >>> 020155c0
> >>> [  368.106005] FS:  () GS:88061fc8() 
> >>> knlGS:
> >>> [  368.106005] CS:  0010 DS:  ES:  CR0: 80050033
> >>> [  368.106005] CR2: 7f8c2aee9c40 CR3: 00057b222000 CR4: 
> >>> 000407e0
> >>> [  368.106005] DR0:  DR1:  DR2: 
> >>> 
> >>> [  368.106005] DR3:  DR6: 0ff0 DR7: 
> >>> 0400
> >>> [  368.106005] Stack:
> >>> [  368.106005]  01c0 88057b766000 8802e380b000 
> >>> 88057af03e00
> >>> [  368.106005]  88061fc83dc0 815349a6 88061fc83d40 
> >>> 814ee146
> >>> [  368.106005]  8802e380af00 e380af00 819e0900 
> >>> 020155c001c0
> >>> [  368.106005] Call Trace:
> >>> [  368.106005]  
> >>> [  368.106005]
> >>> [  368.106005]  [] ip_route_input_noref+0x516/0xbd0
> >>> [  368.106005]  [] ? skb_release_data+0xd6/0x110
> >>> [  368.106005]  [] ? kfree_skb+0x3a/0xa0
> >>> [  3

Re: c6x linker issue on linux-next-20160808 + some linker table work

2016-08-09 Thread Mark Salter

On Tue, 2016-08-09 at 20:40 +0200, Luis R. Rodriguez wrote:
> On Tue, Aug 09, 2016 at 01:04:00PM -0400, Mark Salter wrote:
> > 
> > On Tue, 2016-08-09 at 06:37 -0700, Guenter Roeck wrote:
> > > 
> > > On 08/09/2016 01:11 AM, Luis R. Rodriguez wrote:
> > > > 
> > > > 
> > > > Mark, Aurelien,
> > > > 
> > > > I've run into a linker (ld) issue caused by the linker table work I've
> > > > been working on [0]. I looked into this and for the life of me, I
> > > > cannot comprehend what the problem is, so was hoping you folks might
> > > > be able to chime in.
> > > > 
> > > For reference, the error is
> > > 
> > > c6x-elf-ld: drivers/built-in.o: SB-relative relocation but 
> > > __c6xabi_DSBT_BASE not defined
> > > c6x-elf-ld: drivers/built-in.o: SB-relative relocation but 
> > > __c6xabi_DSBT_BASE not defined
> > DSBT is a reference to the no-MMU userspace ABI used by c6x. The kernel
> > shouldn't be referencing DSBT base. The -mno-dsbt gcc flag should prevent it.
> I see -mno-dsbt in arch/c6x/Makefile already -- however, at link time this is
> an issue if linker tables are used, it seems. Do you have any other
> recommendation?
> 
> I will note that even i386 and x86-64 seem to have relocation issues with
> older compiler/binutils versions, for instance:

I see the problem with gcc 6 as well.

So there appear to be some toolchain issues at play here. We build the kernel
with two c6x-specific options: -mno-dsbt and -msdata=none. I already mentioned
dsbt. The sdata option may be one of:

-msdata=default
     Put small global and static data in the .neardata section, which is
     pointed to by register B14. Put small uninitialized global and static
     data in the .bss section, which is adjacent to the .neardata section.
     Put small read-only data into the .rodata section. The corresponding
     sections used for large pieces of data are .fardata, .far and .const.

-msdata=all
     Put all data, not just small objects, into the sections reserved for
     small data, and use addressing relative to the B14 register to access
     them.

-msdata=none
     Make no use of the sections reserved for small data, and use absolute
     addresses to access all data. Put all initialized global and static
     data in the .fardata section, and all uninitialized data in the .far
     section. Put all constant data into the .const section.


Both small data and DSBT make use of base register + 15-bit offset to access
data, hence the SB-relative reloc in the above error message.

I think that gcc sees the .rodata section from DEFINE_LINKTABLE_RO() for
builtin_fw and thinks it needs an SB-relative reloc. When the linker sees that
reloc, it thinks it needs the dsbt base register, and thus the error.
Interestingly, weak data is never put in the small data section, so if gcc sees
that data is weak, it doesn't check the section name to see if it is a small
data section. So SB-relative only gets used for builtin_fw__end, but not the
weak builtin_fw, even though they both are in the .rodata section.

I suspect gcc should avoid being fooled by .rodata if -msdata=none is used.
Regardless, I think this could all be avoided if the RO tables used .const
instead of .rodata for c6x.


> gcc-4.7.2
> binutils-2.22 
> 
> Yields:
> 
> x86_64:allyesconfig || x86_64:allmodconfig
> Invalid absolute R_X86_64_64 relocation: __per_cpu_load 
> 
> i386:defconfig
> Invalid absolute R_386_32 relocation: __vvar_page
> 
> This issue on x86 has so far not been observed as of gcc 5.2.1
> and binutils-2.26.1.
> 
> hpa -- if you can think of a work around for this for older compilers/linkers
> let me know... unless we are OK to increase the requirements for x86.
> 
>   Luis

[Update][PATCH 1/2] cpufreq / sched: Pass flags to cpufreq_update_util()

2016-08-09 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

It is useful to know the reason why cpufreq_update_util() has just
been called and that can be passed as flags to cpufreq_update_util()
and to the ->func() callback in struct update_util_data.  However,
doing that in addition to passing the util and max arguments they
already take would be clumsy, so avoid it.

Instead, use the observation that the schedutil governor is part
of the scheduler proper, so it can access scheduler data directly.
This allows the util and max arguments of cpufreq_update_util()
and the ->func() callback in struct update_util_data to be replaced
with a flags one, but schedutil has to be modified to follow.

Thus make the schedutil governor obtain the CFS utilization
information from the scheduler and use the "RT" and "DL" flags
instead of the special utilization value of ULONG_MAX to track
updates from the RT and DL sched classes.  Make it non-modular
too to avoid having to export scheduler variables to modules at
large.

Next, update all of the other users of cpufreq_update_util()
and the ->func() callback in struct update_util_data accordingly.

Suggested-by: Peter Zijlstra 
Signed-off-by: Rafael J. Wysocki 
---

Actually, while changing schedutil to be non-modular, some modularity-related
code can be dropped from it too.
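
For readers following the interface change, here is a small userspace sketch
of a callback written against the new three-argument ->func() signature. The
flag values and struct update_util_data match the sched.h hunk of this patch;
struct model_gov, its fields, model_container_of() and the simple proportional
frequency choice are assumptions made for illustration and are not the
schedutil implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in this patch's include/linux/sched.h hunk. */
#define SCHED_CPUFREQ_RT	(1U << 0)
#define SCHED_CPUFREQ_DL	(1U << 1)
#define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)

/* Callback container with the new ->func() signature (u64 modeled as uint64_t). */
struct update_util_data {
	void (*func)(struct update_util_data *data, uint64_t time,
		     unsigned int flags);
};

/* Hypothetical governor state; field names are illustrative only. */
struct model_gov {
	struct update_util_data update_util;
	unsigned long cfs_util;		/* stand-in for utilization read from the scheduler */
	unsigned long cfs_max;		/* stand-in for the CPU capacity */
	unsigned long max_freq;
	unsigned long next_freq;
	uint64_t last_update;
};

/* Userspace equivalent of the kernel's container_of(). */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void model_gov_update(struct update_util_data *data, uint64_t time,
			     unsigned int flags)
{
	struct model_gov *g = model_container_of(data, struct model_gov,
						 update_util);

	g->last_update = time;
	if (flags & SCHED_CPUFREQ_RT_DL)
		/* RT/DL update: go to the maximum instead of testing util == ULONG_MAX. */
		g->next_freq = g->max_freq;
	else
		/* CFS update: the governor looks up utilization on its own. */
		g->next_freq = g->max_freq * g->cfs_util / g->cfs_max;
}
```

The point of the sketch is only that the callback no longer receives util and
max: it either reacts to the RT/DL flags or fetches the CFS numbers itself.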

---
 drivers/cpufreq/Kconfig|5 --
 drivers/cpufreq/cpufreq_governor.c |2 -
 drivers/cpufreq/intel_pstate.c |2 -
 include/linux/sched.h  |   12 --
 kernel/sched/cpufreq.c |2 -
 kernel/sched/cpufreq_schedutil.c   |   67 -
 kernel/sched/deadline.c|4 +-
 kernel/sched/fair.c|   11 ++
 kernel/sched/rt.c  |4 +-
 kernel/sched/sched.h   |   31 +
 10 files changed, 67 insertions(+), 73 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -260,7 +260,7 @@ static void dbs_irq_work(struct irq_work
 }
 
 static void dbs_update_util_handler(struct update_util_data *data, u64 time,
-   unsigned long util, unsigned long max)
+   unsigned int flags)
 {
struct cpu_dbs_info *cdbs = container_of(data, struct cpu_dbs_info, 
update_util);
struct policy_dbs_info *policy_dbs = cdbs->policy_dbs;
Index: linux-pm/drivers/cpufreq/intel_pstate.c
===
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -1329,7 +1329,7 @@ static inline void intel_pstate_adjust_b
 }
 
 static void intel_pstate_update_util(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max)
+unsigned int flags)
 {
struct cpudata *cpu = container_of(data, struct cpudata, update_util);
u64 delta_ns = time - cpu->sample.time;
Index: linux-pm/include/linux/sched.h
===
--- linux-pm.orig/include/linux/sched.h
+++ linux-pm/include/linux/sched.h
@@ -3469,15 +3469,19 @@ static inline unsigned long rlimit_max(u
return task_rlimit_max(current, limit);
 }
 
+#define SCHED_CPUFREQ_RT   (1U << 0)
+#define SCHED_CPUFREQ_DL   (1U << 1)
+
+#define SCHED_CPUFREQ_RT_DL(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
+
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
-   void (*func)(struct update_util_data *data,
-u64 time, unsigned long util, unsigned long max);
+   void (*func)(struct update_util_data *data, u64 time, unsigned int 
flags);
 };
 
 void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
-   void (*func)(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max));
+   void (*func)(struct update_util_data *data, u64 time,
+   unsigned int flags));
 void cpufreq_remove_update_util_hook(int cpu);
 #endif /* CONFIG_CPU_FREQ */
 
Index: linux-pm/kernel/sched/cpufreq.c
===
--- linux-pm.orig/kernel/sched/cpufreq.c
+++ linux-pm/kernel/sched/cpufreq.c
@@ -33,7 +33,7 @@ DEFINE_PER_CPU(struct update_util_data *
  */
 void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
void (*func)(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max))
+unsigned int flags))
 {
if (WARN_ON(!data || !func))
return;
Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===

Re: [PATCH] i2c: uniphier{-f}: don't print error when adding adapter fails

2016-08-09 Thread Guenter Roeck

On Tue, Aug 09, 2016 at 10:11:40PM +0200, Wolfram Sang wrote:
> The core will do this for us now.
> 
> Signed-off-by: Wolfram Sang 

Acked-by: Guenter Roeck 

> ---
>  drivers/i2c/busses/i2c-uniphier-f.c | 5 -
>  drivers/i2c/busses/i2c-uniphier.c   | 5 -
>  2 files changed, 10 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-uniphier-f.c 
> b/drivers/i2c/busses/i2c-uniphier-f.c
> index aeead0d27d1007..35608531fe070d 100644
> --- a/drivers/i2c/busses/i2c-uniphier-f.c
> +++ b/drivers/i2c/busses/i2c-uniphier-f.c
> @@ -550,11 +550,6 @@ static int uniphier_fi2c_probe(struct platform_device 
> *pdev)
>   }
>  
>   ret = i2c_add_adapter(&priv->adap);
> - if (ret) {
> - dev_err(dev, "failed to add I2C adapter\n");
> - goto err;
> - }
> -
>  err:
>   if (ret)
>   clk_disable_unprepare(priv->clk);
> diff --git a/drivers/i2c/busses/i2c-uniphier.c 
> b/drivers/i2c/busses/i2c-uniphier.c
> index 475a5eb514e215..d6e612a0e02a9d 100644
> --- a/drivers/i2c/busses/i2c-uniphier.c
> +++ b/drivers/i2c/busses/i2c-uniphier.c
> @@ -407,11 +407,6 @@ static int uniphier_i2c_probe(struct platform_device 
> *pdev)
>   }
>  
>   ret = i2c_add_adapter(&priv->adap);
> - if (ret) {
> - dev_err(dev, "failed to add I2C adapter\n");
> - goto err;
> - }
> -
>  err:
>   if (ret)
>   clk_disable_unprepare(priv->clk);
> -- 
> 2.8.1
>

Re: [copyleft-next] Re: Kernel modules under new copyleft licence : (was Re: [PATCH v2] module.h: add copyleft-next >= 0.3.1 as GPL compatible)

2016-08-09 Thread Luis R. Rodriguez

On Tue, Aug 09, 2016 at 10:14:48PM +0200, Luis R. Rodriguez wrote:
> On Tue, Aug 09, 2016 at 09:04:35PM +0100, Alan Cox wrote:
> > > > (Going back to pick up the specific licence thread)
> > 
> > > > 
> > > > I'd like to see Richard do so as well.
> > > With Richard that's 3 attorneys now.
> > 
> > None of whom I believe represent the Linux project or foundation ?
> > 
> > Linus has to make this call, nobody else and he is probably going to go
> > ape if you try and sneak another licence into the kernel without
> > flagging it up with him clearly first. You need to discuss it with
> > Linus up front.
> 
> To be clear I first poked the Linux Foundation about this, I went through the
> process recommended by them. If there is a process out of place it's by no
> means an issue on my end.
> 
> > > I'll proceed to submit some code with this license as you request,
> > > Rusty. It's however not for modules yet so I would not make use of the
> > > MODULE_LICENSE("copyleft-next") tag yet, however the license will be
> > > on top of a header.
> > 
> > We have the GPL/extra rights tag for this already. Also when it's
> > merged with the kernel we'd I'm sure pick the derivative work under the
> > GPL option so we'd only need the GPL tag.
> > 
> > There are specific reasons for the extra rights language - it avoids
> > games like MODULE_LICENSE("BSD") and then giving people just a binary
> > and it being counted as GPL compliant activity. The same problem exists
> > in your licence post sunset. That single tag is also why we don't have
> > to list BSD, MIT, and every variant thereof in the table which saves us
> > so much pain. If you must have the actual text in the .ko file then put
> > it in your MODULE_DESCRIPTION().
> 
> I'm personally fine with MODULE_LICENSE("GPL") being used with copyleft-next
> code and find it sensible.

Adding Linus now, for some reason I think you added him with an incorrect
domain name, Alan.

  Luis

[PATCHv5] arm64: Handle el1 synchronous instruction aborts cleanly

2016-08-09 Thread Laura Abbott


Executing from a non-executable area gives an ugly message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at 084c0e08
lkdtm: attempting bad execution at 08880700
Bad mode in Synchronous Abort handler detected on CPU2, code 0x840e -- IABT 
(current EL)
CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13
Hardware name: linux,dummy-virt (DT)
task: 800077e35780 ti: 80007797 task.ti: 80007797
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

The 'IABT (current EL)' indicates the error but it's a bit cryptic
without knowledge of the ARM ARM. There is also no indication of the
specific address which triggered the fault. The increase in kernel
page permissions makes hitting this case more likely as well.
Handling the case in the vectors gives a much more familiar looking
error message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at 084c0840
lkdtm: attempting bad execution at 08880680
Unable to handle kernel paging request at virtual address 08880680
pgd = 889b2000
[08880680] *pgd=489b4003, *pud=48904003, 
*pmd=
Internal error: Oops: 840e [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24
Hardware name: linux,dummy-virt (DT)
task: 800077f9f080 ti: 88a1c000 task.ti: 88a1c000
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

Acked-by: Mark Rutland 
Signed-off-by: Laura Abbott 
---
v5: Fall through to data abort case since the code is now the same
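
For anyone decoding the syndrome by hand, the sketch below shows in plain C
how the exception class and fault status are pulled out of an ESR value, in
the way the helpers touched by this patch do. The macro names and values
follow arch/arm64/include/asm/esr.h as I understand it; the uint32_t
parameter type and the standalone form are assumptions for the sake of a
userspace example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field layout and class values, following arch/arm64/include/asm/esr.h. */
#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_MASK		(UINT32_C(0x3F) << ESR_ELx_EC_SHIFT)
#define ESR_ELx_EC(esr)		(((esr) & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT)
#define ESR_ELx_EC_IABT_CUR	0x21	/* instruction abort, current EL */
#define ESR_ELx_EC_DABT_CUR	0x25	/* data abort, current EL */
#define ESR_ELx_FSC_TYPE	0x3C
#define ESR_ELx_FSC_PERM	0x0C

/* True when the exception class says "instruction abort taken from EL1". */
static bool is_el1_instruction_abort(uint32_t esr)
{
	return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_CUR;
}

/* Permission fault from EL1, for either a data or an instruction abort. */
static bool is_permission_fault(uint32_t esr)
{
	uint32_t ec = ESR_ELx_EC(esr);
	uint32_t fsc_type = esr & ESR_ELx_FSC_TYPE;

	return (ec == ESR_ELx_EC_DABT_CUR && fsc_type == ESR_ELx_FSC_PERM) ||
	       (ec == ESR_ELx_EC_IABT_CUR && fsc_type == ESR_ELx_FSC_PERM);
}
```

A syndrome built as (0x21 << 26) | 0x0e classifies as an EL1 instruction abort
with a permission fault, which is the case the EXEC_RODATA test is meant to
provoke.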
---
 arch/arm64/kernel/entry.S |  7 +++
 arch/arm64/mm/fault.c | 14 --
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 96e4a2b..441420c 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -353,6 +353,8 @@ el1_sync:
lsr x24, x1, #ESR_ELx_EC_SHIFT  // exception class
cmp x24, #ESR_ELx_EC_DABT_CUR   // data abort in EL1
b.eqel1_da
+   cmp x24, #ESR_ELx_EC_IABT_CUR   // instruction abort in EL1
+   b.eqel1_ia
cmp x24, #ESR_ELx_EC_SYS64  // configurable trap
b.eqel1_undef
cmp x24, #ESR_ELx_EC_SP_ALIGN   // stack alignment exception
@@ -364,6 +366,11 @@ el1_sync:
cmp x24, #ESR_ELx_EC_BREAKPT_CUR// debug exception in EL1
b.geel1_dbg
b   el1_inv
+
+el1_ia:
+   /*
+* Fall through to the Data abort case
+*/
 el1_da:
/*
 * Data abort handling
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index c8beaa0..05d2bd7 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -153,6 +153,11 @@ int ptep_set_access_flags(struct vm_area_struct *vma,
 }
 #endif
 
+static bool is_el1_instruction_abort(unsigned int esr)
+{
+   return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_CUR;
+}
+
 /*
  * The kernel tried to access some page that wasn't present.
  */
@@ -161,8 +166,9 @@ static void __do_kernel_fault(struct mm_struct *mm, 
unsigned long addr,
 {
/*
 * Are we prepared to handle this kernel fault?
+* We are almost certainly not prepared to handle instruction faults.
 */
-   if (fixup_exception(regs))
+   if (!is_el1_instruction_abort(esr) && fixup_exception(regs))
return;
 
/*
@@ -267,7 +273,8 @@ static inline bool is_permission_fault(unsigned int esr)
unsigned int ec   = ESR_ELx_EC(esr);
unsigned int fsc_type = esr & ESR_ELx_FSC_TYPE;
 
-   return (ec == ESR_ELx_EC_DABT_CUR && fsc_type == ESR_ELx_FSC_PERM);
+   return (ec == ESR_ELx_EC_DABT_CUR && fsc_type == ESR_ELx_FSC_PERM) ||
+  (ec == ESR_ELx_EC_IABT_CUR && fsc_type == ESR_ELx_FSC_PERM);
 }
 
 static bool is_el0_instruction_abort(unsigned int esr)
@@ -312,6 +319,9 @@ static int __kprobes do_page_fault(unsigned long addr, 
unsigned int esr,
if (regs->orig_addr_limit == KERNEL_DS)
die("Accessing user space memory with fs=KERNEL_DS", 
regs, esr);
 
+   if (is_el1_instruction_abort(esr))
+   die("Attempting to execute userspace memory", regs, 
esr);
+
if (!search_exception_tables(regs->pc))
die("Accessing user space memory outside uaccess.h 
routines", regs, esr);
}
-- 
2.7.4

Re: [PATCH 4.6 00/96] 4.6.6-stable review

2016-08-09 Thread Guenter Roeck

On Tue, Aug 09, 2016 at 07:22:21PM +0200, Greg Kroah-Hartman wrote:
> > Can you push it into the -rc repository ? I see the patch in the queue,
> > but not in the repository.
> 
> Sorry about that, now regenerated.
> 

All but the unicore32 build problems are now fixed.

Guenter

Making pdfdocs with sphinx - only select rst targets

2016-08-09 Thread Luis R. Rodriguez

I'm excited to see the new documentation format, so I'm changing my
documentation in the pending patches I have to use it. I however
cannot generate anything other than the main

Documentation/output/pdf/Kernel.pdf

How can I see in PDF the other documentation?

I'm using:

make DOCBOOKS="" pdfdocs

The Documentation/output/pdf/Kernel.pdf only has a bit of the
documentation on how to write docs, nothing else. Using htmldocs as a
target works but I am not a fan of that output.

  Luis

[PATCH] arm64: KVM: Save two instructions in __guest_enter()

2016-08-09 Thread Shanker Donthineni

We are doing an unnecessary stack push/pop operation when restoring
the guest registers x0-x18 in __guest_enter(). This patch saves the
two instructions by using x18 as a base register. There is no need to
store the vcpu context pointer on the stack: it is redundant and not
used anywhere, and the same information is available in tpidr_el2.

Signed-off-by: Shanker Donthineni 
---
 arch/arm64/kvm/hyp/entry.S | 66 ++
 1 file changed, 32 insertions(+), 34 deletions(-)

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index ce9e5e5..d2e09a1 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -55,37 +55,32 @@
  */
 ENTRY(__guest_enter)
// x0: vcpu
-   // x1: host/guest context
-   // x2-x18: clobbered by macros
+   // x1: host context
+   // x2-x17: clobbered by macros
+   // x18: guest context
 
// Store the host regs
save_callee_saved_regs x1
 
-   // Preserve vcpu & host_ctxt for use at exit time
-   stp x0, x1, [sp, #-16]!
+   // Preserve the host_ctxt for use at exit time
+   str x1, [sp, #-16]!
 
-   add x1, x0, #VCPU_CONTEXT
+   add x18, x0, #VCPU_CONTEXT
 
-   // Prepare x0-x1 for later restore by pushing them onto the stack
-   ldp x2, x3, [x1, #CPU_XREG_OFFSET(0)]
-   stp x2, x3, [sp, #-16]!
+   // Restore guest regs x19-x29, lr
+   restore_callee_saved_regs x18
 
-   // x2-x18
-   ldp x2, x3,   [x1, #CPU_XREG_OFFSET(2)]
-   ldp x4, x5,   [x1, #CPU_XREG_OFFSET(4)]
-   ldp x6, x7,   [x1, #CPU_XREG_OFFSET(6)]
-   ldp x8, x9,   [x1, #CPU_XREG_OFFSET(8)]
-   ldp x10, x11, [x1, #CPU_XREG_OFFSET(10)]
-   ldp x12, x13, [x1, #CPU_XREG_OFFSET(12)]
-   ldp x14, x15, [x1, #CPU_XREG_OFFSET(14)]
-   ldp x16, x17, [x1, #CPU_XREG_OFFSET(16)]
-   ldr x18,  [x1, #CPU_XREG_OFFSET(18)]
-
-   // x19-x29, lr
-   restore_callee_saved_regs x1
-
-   // Last bits of the 64bit state
-   ldp x0, x1, [sp], #16
+   // Restore guest regs x0-x18
+   ldp x0, x1,   [x18, #CPU_XREG_OFFSET(0)]
+   ldp x2, x3,   [x18, #CPU_XREG_OFFSET(2)]
+   ldp x4, x5,   [x18, #CPU_XREG_OFFSET(4)]
+   ldp x6, x7,   [x18, #CPU_XREG_OFFSET(6)]
+   ldp x8, x9,   [x18, #CPU_XREG_OFFSET(8)]
+   ldp x10, x11, [x18, #CPU_XREG_OFFSET(10)]
+   ldp x12, x13, [x18, #CPU_XREG_OFFSET(12)]
+   ldp x14, x15, [x18, #CPU_XREG_OFFSET(14)]
+   ldp x16, x17, [x18, #CPU_XREG_OFFSET(16)]
+   ldr x18,  [x18, #CPU_XREG_OFFSET(18)]
 
// Do not touch any register after this!
eret
@@ -100,6 +95,16 @@ ENTRY(__guest_exit)
 
add x2, x0, #VCPU_CONTEXT
 
+   // Store the guest regs x19-x29, lr
+   save_callee_saved_regs x2
+
+   // Retrieve the guest regs x0-x3 from the stack
+   ldp x21, x22, [sp], #16 // x2, x3
+   ldp x19, x20, [sp], #16 // x0, x1
+
+   // Store the guest regs x0-x18
+   stp x19, x20, [x2, #CPU_XREG_OFFSET(0)]
+   stp x21, x22, [x2, #CPU_XREG_OFFSET(2)]
stp x4, x5,   [x2, #CPU_XREG_OFFSET(4)]
stp x6, x7,   [x2, #CPU_XREG_OFFSET(6)]
stp x8, x9,   [x2, #CPU_XREG_OFFSET(8)]
@@ -109,20 +114,13 @@ ENTRY(__guest_exit)
stp x16, x17, [x2, #CPU_XREG_OFFSET(16)]
str x18,  [x2, #CPU_XREG_OFFSET(18)]
 
-   ldp x6, x7, [sp], #16   // x2, x3
-   ldp x4, x5, [sp], #16   // x0, x1
+   // Restore the host_ctxt from the stack
+   ldr x2, [sp], #16
 
-   stp x4, x5, [x2, #CPU_XREG_OFFSET(0)]
-   stp x6, x7, [x2, #CPU_XREG_OFFSET(2)]
-
-   save_callee_saved_regs x2
-
-   // Restore vcpu & host_ctxt from the stack
-   // (preserving return code in x1)
-   ldp x0, x2, [sp], #16
// Now restore the host regs
restore_callee_saved_regs x2
 
+   // Preserving return code (x1)
mov x0, x1
ret
 ENDPROC(__guest_exit)
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, 
Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.

Re: [PATCH] rcu: Fix soft lockup for rcu_nocb_kthread

2016-08-09 Thread Ding Tianhong

On 2016/6/16 22:19, Paul E. McKenney wrote:
> On Thu, Jun 16, 2016 at 02:09:47PM +0800, Ding Tianhong wrote:
>> On 2016/6/15 23:49, Paul E. McKenney wrote:
>>> On Wed, Jun 15, 2016 at 03:27:36PM +0800, Ding Tianhong wrote:
>>>> I met this problem when using the Testgine to send package to ixgbevf nic
>>>> by this steps:
>>>> 1. Connect to ixgbevf, and set the speed to 10Gb/s, it could work fine.
>>>> 2. Then use ifconfig to down the nic and up again, loop for several times.
>>>> 3. The system panic by soft lockup.
>>>
>>> Good catch, queued for review and testing.  But what .config was your
>>> kernel built with?
>>>
>>
>> I use the redhat7.1 defconfig to build my kernel, and the RCU config is this:
>>  120 #
>>  121 # RCU Subsystem
>>  122 #
>>  123 CONFIG_TREE_RCU=y
>>  124 # CONFIG_PREEMPT_RCU is not set
>>  125 CONFIG_RCU_STALL_COMMON=y
>>  126 CONFIG_CONTEXT_TRACKING=y
>>  127 CONFIG_RCU_USER_QS=y
>>  128 # CONFIG_CONTEXT_TRACKING_FORCE is not set
>>  129 CONFIG_RCU_FANOUT=64
>>  130 CONFIG_RCU_FANOUT_LEAF=16
>>  131 # CONFIG_RCU_FANOUT_EXACT is not set
>>  132 # CONFIG_RCU_FAST_NO_HZ is not set
>>  133 # CONFIG_TREE_RCU_TRACE is not set
>>  134 CONFIG_RCU_NOCB_CPU=y
>>  135 CONFIG_RCU_NOCB_CPU_ALL=y
>>  136 CONFIG_BUILD_BIN2C=y
> 
> Thank you!  You were running with preemption disabled, so your system
> would indeed be very susceptible to this problem.
> 
>>> Also, I did tweak both the commit log and the patch.  Your cond_resched()
>>> would prevent soft lockups, but not RCU stalls, so I substituted
>>> cond_resched_rcu_qs().  Please let me know if either of those changes
>>> causes problems at your end.
>>
>> Looks fine to me, I will apply this to my branch and test it, thanks.
> 
> Please let me know how it goes!
> 
>   Thanx, Paul
> 

Hi Paul:

It has been a long time since I applied this patch, and I haven't found any
problems. I believe this patch is fine, thanks.

Ding

>> Ding
>>
>>>
>>> Thanx, Paul
>>>
>>> 
>>>
>>> commit c317cf19b34c0d2787b787c38bd2c8fe433215da
>>> Author: Ding Tianhong 
>>> Date:   Wed Jun 15 15:27:36 2016 +0800
>>>
>>> rcu: Fix soft lockup for rcu_nocb_kthread
>>> 
>>> Carrying out the following steps results in a softlockup in the
>>> RCU callback-offload (rcuo) kthreads:
>>> 
>>> 1. Connect to ixgbevf, and set the speed to 10Gb/s.
>>> 2. Use ifconfig to bring the nic up and down repeatedly.
>>> 
>>> [  317.005148] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>>> [  368.106005] BUG: soft lockup - CPU#1 stuck for 22s! [rcuos/1:15]
>>> [  368.106005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>>> [  368.106005] task: 88057dd8a220 ti: 88057dd9c000 task.ti: 
>>> 88057dd9c000
>>> [  368.106005] RIP: 0010:[]  [] 
>>> fib_table_lookup+0x14/0x390
>>> [  368.106005] RSP: 0018:88061fc83ce8  EFLAGS: 0286
>>> [  368.106005] RAX: 0001 RBX: 020155c0 RCX: 
>>> 0001
>>> [  368.106005] RDX: 88061fc83d50 RSI: 88061fc83d70 RDI: 
>>> 880036d11a00
>>> [  368.106005] RBP: 88061fc83d08 R08: 0001 R09: 
>>> 
>>> [  368.106005] R10: 880036d11a00 R11: 819e0900 R12: 
>>> 88061fc83c58
>>> [  368.106005] R13: 816154dd R14: 88061fc83d08 R15: 
>>> 020155c0
>>> [  368.106005] FS:  () GS:88061fc8() 
>>> knlGS:
>>> [  368.106005] CS:  0010 DS:  ES:  CR0: 80050033
>>> [  368.106005] CR2: 7f8c2aee9c40 CR3: 00057b222000 CR4: 
>>> 000407e0
>>> [  368.106005] DR0:  DR1:  DR2: 
>>> 
>>> [  368.106005] DR3:  DR6: 0ff0 DR7: 
>>> 0400
>>> [  368.106005] Stack:
>>> [  368.106005]  01c0 88057b766000 8802e380b000 
>>> 88057af03e00
>>> [  368.106005]  88061fc83dc0 815349a6 88061fc83d40 
>>> 814ee146
>>> [  368.106005]  8802e380af00 e380af00 819e0900 
>>> 020155c001c0
>>> [  368.106005] Call Trace:
>>> [  368.106005]  
>>> [  368.106005]
>>> [  368.106005]  [] ip_route_input_noref+0x516/0xbd0
>>> [  368.106005]  [] ? skb_release_data+0xd6/0x110
>>> [  368.106005]  [] ? kfree_skb+0x3a/0xa0
>>> [  368.106005]  [] ip_rcv_finish+0x29f/0x350
>>> [  368.106005]  [] ip_rcv+0x234/0x380
>>> [  368.106005]  [] 
>>> __netif_receive_skb_core+0x676/0x870
>>> [  368.106005]  [] __netif_receive_skb+0x18/0x60
>>> [  368.106005]  [] process_backlog+0xae/0x180
>>> [  368.106005]  [] net_rx_action+0x152/0x240
>>> [  368.106005]  [] __do_softirq+0xef/0x280
>>> [  368.106005]  [] call_softirq+0x1c/0x30
>>> [  368.106005]

Re: [PATCHv4] arm64: Handle el1 synchronous instruction aborts cleanly

2016-08-09 Thread Laura Abbott


On 08/09/2016 06:24 AM, Will Deacon wrote:

On Mon, Aug 08, 2016 at 05:35:34PM -0700, Laura Abbott wrote:

Executing from a non-executable area gives an ugly message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at 084c0e08
lkdtm: attempting bad execution at 08880700
Bad mode in Synchronous Abort handler detected on CPU2, code 0x840e -- IABT 
(current EL)
CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13
Hardware name: linux,dummy-virt (DT)
task: 800077e35780 ti: 80007797 task.ti: 80007797
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

The 'IABT (current EL)' indicates the error but it's a bit cryptic
without knowledge of the ARM ARM. There is also no indication of the
specific address which triggered the fault. The increase in kernel
page permissions makes hitting this case more likely as well.
Handling the case in the vectors gives a much more familiar looking
error message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at 084c0840
lkdtm: attempting bad execution at 08880680
Unable to handle kernel paging request at virtual address 08880680
pgd = 889b2000
[08880680] *pgd=489b4003, *pud=48904003, 
*pmd=
Internal error: Oops: 840e [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24
Hardware name: linux,dummy-virt (DT)
task: 800077f9f080 ti: 88a1c000 task.ti: 88a1c000
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

Signed-off-by: Laura Abbott 
Acked-by: Mark Rutland 
---
v4: Rebased to master, extra error message to indicate execution of userspace
memory
---
 arch/arm64/kernel/entry.S | 18 ++
 arch/arm64/mm/fault.c | 14 --
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 96e4a2b..bdfadef 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -353,6 +353,8 @@ el1_sync:
lsr x24, x1, #ESR_ELx_EC_SHIFT  // exception class
cmp x24, #ESR_ELx_EC_DABT_CUR   // data abort in EL1
b.eq    el1_da
+   cmp x24, #ESR_ELx_EC_IABT_CUR   // instruction abort in EL1
+   b.eq    el1_ia
cmp x24, #ESR_ELx_EC_SYS64  // configurable trap
b.eq    el1_undef
cmp x24, #ESR_ELx_EC_SP_ALIGN   // stack alignment exception
@@ -364,6 +366,22 @@ el1_sync:
cmp x24, #ESR_ELx_EC_BREAKPT_CUR    // debug exception in EL1
b.ge    el1_dbg
b   el1_inv
+el1_ia:
+   /*
+* Instruction abort handling
+*/
+   mrs x0, far_el1
+   enable_dbg
+   // re-enable interrupts if they were enabled in the aborted context
+   tbnz    x23, #7, 1f // PSR_I_BIT
+   enable_irq
+1:
+   mov x2, sp  // struct pt_regs
+   bl  do_mem_abort
+
+   // disable interrupts before pulling preserved data off the stack
+   disable_irq
+   kernel_exit 1


This looks identical to the el1_da code immediately below. Can we not
just have a fallthrough?

Will



Yes, good point. It made sense to have the separate code when there was
a flag.

Thanks,
Laura
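
For readers following the fallthrough suggestion, here is a minimal user-space sketch of the dispatch idea in C rather than assembly. The exception-class values (shift 26, 0x21 for an instruction abort at the current EL, 0x25 for a data abort at the current EL) are the architectural ESR_ELx encodings; the enum, the function names, and the sample syndrome values are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural ESR_ELx exception-class encodings (AArch64). */
#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_IABT_CUR	0x21u	/* instruction abort, current EL */
#define ESR_ELx_EC_DABT_CUR	0x25u	/* data abort, current EL */

/* Hypothetical handler ids, used only by this sketch. */
enum el1_handler { EL1_MEM_ABORT, EL1_INVALID };

/* Route both abort classes to one handler, as a fallthrough would. */
static enum el1_handler el1_sync_dispatch(uint32_t esr)
{
	switch (esr >> ESR_ELx_EC_SHIFT) {
	case ESR_ELx_EC_IABT_CUR:	/* shares the data-abort path */
	case ESR_ELx_EC_DABT_CUR:
		return EL1_MEM_ABORT;
	default:
		return EL1_INVALID;
	}
}

/* Usage: illustrative syndrome values, not taken from the log above. */
void el1_sync_dispatch_example(void)
{
	assert(el1_sync_dispatch(0x8400000eu) == EL1_MEM_ABORT);	/* EC 0x21 */
	assert(el1_sync_dispatch(0x96000045u) == EL1_MEM_ABORT);	/* EC 0x25 */
}
```

Sharing one case body mirrors the point of the review comment: the two abort classes need the same FAR read, interrupt handling, and call into do_mem_abort, so a second copy of the code adds nothing.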

Re: [PATCH] ARM: dts: add rk3288-firefly-reload

2016-08-09 Thread Randy Li




On 08/09/2016 06:58 PM, Heiko Stübner wrote:

On Tuesday, 9 August 2016, 18:06:33, Randy Li wrote:

Sent from my iPad


On 9 August 2016, at 6:02 PM, 陈豪 wrote:

Well, it has already been added.

The root cause is not act8846. Firefly have a bug with sdmmc and it
seems they didn't fix it in firefly-reload.

http://bbs.t-firefly.com/forum.php?mod=viewthread&tid=256

Thanks Jacob for pointing out this issue - I was always wondering why it
wasn't working :-) .



Yes, Jacob is right. Those high speed options need the voltage down to 1.8v.
They should be removed.

Randy, can you provide a follow-up patch that removes these excess properties
please?

I will send a patch this weekend. I need some time to confirm whether
the SDIO Wi-Fi hits the same problem on the Firefly Reload.
I found that those patches have not been merged yet. Should I send
a new version, or just a patch that removes the incorrect excess properties?



Thanks
Heiko



--
Randy Li
The third produce department

Re: [PATCH v1] firmware_class: encapsulate firmware loading status

2016-08-09 Thread Luis R. Rodriguez

On Thu, Aug 04, 2016 at 02:27:16PM +0200, Daniel Wagner wrote:
> From: Daniel Wagner 
> 
> The firmware user helper code tracks the current state of the loading
> process via a member of struct firmware_buf and a completion. Let's
> encapsulate this simple state machine into struct fw_status. The aim is
> to increase readability and reduce the usage of the fw_lock.

Great, emphasis, reduce use of fw_lock, good stuff!

> The fw_lock is not needed to protect the status update anymore. We don't
> do any RMW operations. Instead we just do a write or a read, not both at
> the same time.
> 
> [v1: moved fw_status into !CONFIG_FW_LOADER_USER_HELPER section,
>  reported by 0day kbuild]
> 
> Signed-off-by: Daniel Wagner 
> Cc: Ming Lei 
> Cc: Luis R. Rodriguez 
> Cc: Greg Kroah-Hartman 
> ---
> 
> Hi,
> 
> In [0] we have a discussion on how the firmware_class API might be
> changed to improve the current handling of firmware loading. This
> patch was part of the original RFC which triggered the discussion.
> 
> I think it is worth taking this one anyway. Maybe as I suggested it
> could be part of the series from Luis.

I'm happy to queue it on my end. However, your changes are a bit orthogonal:
you help optimize us away from the usermode helper, while I just compartmentalize
that whole API away into a new one, so this can go in separately. In terms of
coordination, sure, the order will help to get right, so I can queue it in that
sense. But we're not yet sure if sysdata will go in first, and I'm happy
for this to go in first, as it does not conflict and is slightly orthogonal.

So the order here does not interfere with my series. Let's just review this, and
if it's good, let's let it go in.

What you do is strip us further from the user mode helper and that
is a good thing.

My review below.

> It cleans up the code base (okay my opinion) 

You do little to sell this. In fact, if this is OK, it does a good
compartmentalization of a completion and a lock, and confines the
wait stuff to the usermode helper. Indeed, that's a win
worth documenting in the commit log.

> and removes the
> complete_all() call which is problematic for -rt. complete_all() can
> be used in any context including IRQ.

I see. But in this case the code in question should never run in IRQ context?

> That could lead to unbounded
> work in the IRQ context and that is a no go for -rt.

Is the fear that the call is used in IRQ context, or that the waiters
somehow run in IRQ context? The waiters were sleeping, so
I think that leaves only the call site of complete_all() to worry
about, but I can't see that happening in IRQ context. Please
correct me if I'm wrong.

> So here the
> attempt to reduce the number of complete_all() calls where possible.

OK so this is the real motivation.

> I have left this argument out in the commit message because I was told '-rt'
> arguments don't count for inclusion.

Sure, but I appreciate this explanation, thanks for that !

Can you provide a set of commits accepted upstream or on linux-next
where such conversion has been done and accepted as well elsewhere
in the kernel ?

I know it's just pending patches for review, but this has me thinking: is
the use of async functionality in the sysdata patches kosher for RT?

> cheers,
> daniel
> 
> [0] http://www.spinics.net/lists/linux-wireless/msg153005.html
> 
> drivers/base/firmware_class.c | 154 --
>  1 file changed, 89 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 22d1760..33eb372 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -91,19 +92,83 @@ static inline bool fw_is_builtin_firmware(const struct 
> firmware *fw)
>  }
>  #endif
>  
> +static int loading_timeout = 60; /* In seconds */
> +
> +static inline long firmware_loading_timeout(void)
> +{
> + return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
> +}

Seems like we can wrap the above loading_timeout and firmware_loading_timeout
in CONFIG_FW_LOADER_USER_HELPER -- or provide a helper that returns some
static nonsense value that works for !CONFIG_FW_LOADER_USER_HELPER.

The move of the code above also makes this change harder to review.
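
A minimal user-space sketch of that suggestion, assuming the helper stays as quoted when the user helper is built in and collapses to a constant otherwise. HZ is an arbitrary value picked for the sketch, MAX_JIFFY_OFFSET follows the kernel's definition, and returning MAX_JIFFY_OFFSET from the stub is only one possible choice of "static value", not something the patch specifies.

```c
#include <assert.h>
#include <limits.h>

#define HZ			250L	/* assumption: tick rate for this sketch */
#define MAX_JIFFY_OFFSET	((LONG_MAX >> 1) - 1)

/* Defined here so the sketch exercises the user-helper variant. */
#define CONFIG_FW_LOADER_USER_HELPER 1

#ifdef CONFIG_FW_LOADER_USER_HELPER
static int loading_timeout = 60;	/* In seconds */

static long firmware_loading_timeout(void)
{
	return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
}
#else
/* Stub: nobody waits on a user helper, so any fixed value would do. */
static long firmware_loading_timeout(void)
{
	return MAX_JIFFY_OFFSET;
}
#endif

/* Usage: a positive timeout scales by HZ, zero means "wait forever". */
void firmware_loading_timeout_example(void)
{
	loading_timeout = 60;
	assert(firmware_loading_timeout() == 60 * HZ);
	loading_timeout = 0;
	assert(firmware_loading_timeout() == MAX_JIFFY_OFFSET);
	loading_timeout = 60;
}
```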

> +
>  enum {
> + FW_STATUS_UNKNOWN,
>   FW_STATUS_LOADING,
>   FW_STATUS_DONE,
> - FW_STATUS_ABORT,
> + FW_STATUS_ABORTED,
>  };

Come to think of it, even if CONFIG_FW_LOADER_USER_HELPER is enabled
we should only have a need to use this wait crap if an explicit
caller in the kernel requested to use the usermode helper, and as
my patches show there are only 2 of those cases left in the kernel.

To be clear if CONFIG_FW_LOADER_USER_HELPER_FALLBACK (not many distros left) is
set we're stuck and always have to use this, if you only have
CONFIG_FW_LOADER_USER_HELPER (most distros) then o

[PATCH 0/2] cpufreq / sched: Rework of cpufreq_update_util() arguments

2016-08-09 Thread Rafael J. Wysocki

Hi,

There were some comments on the "cpufreq / sched: cpufreq_update_util() flags
and iowait boosting" series I sent some time ago and I wanted to address them,
but for this purpose I had to combine patches [1-2,4/7] from that series
into one and make some changes on top of that.

Then I thought it would be better to send that separately from the iowait
boost part of that series, so here it goes.

[1/2] Removes the util and max args from cpufreq_update_util() and governor
  callbacks and adds a flags argument instead of them.  That argument
  is then used to handle RT and DL in schedutil and the utilization data
  are accessed by it directly (so it is non-modular now to avoid exporting
  the scheduler internals to modules).
[2/2] Replaces the time argument of cpufreq_update_util() with an rq pointer
  which allows some simplifications to be made.

There should be no changes in behavior as a result of this.

Thanks,
Rafael

[PATCH 1/2] cpufreq / sched: Pass flags to cpufreq_update_util()

2016-08-09 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

It is useful to know the reason why cpufreq_update_util() has just
been called and that can be passed as flags to cpufreq_update_util()
and to the ->func() callback in struct update_util_data.  However,
doing that in addition to passing the util and max arguments they
already take would be clumsy, so avoid it.

Instead, use the observation that the schedutil governor is part
of the scheduler proper, so it can access scheduler data directly.
This allows the util and max arguments of cpufreq_update_util()
and the ->func() callback in struct update_util_data to be replaced
with a flags one, but schedutil has to be modified to follow.

Thus make the schedutil governor obtain the CFS utilization
information from the scheduler and use the "RT" and "DL" flags
instead of the special utilization value of ULONG_MAX to track
updates from the RT and DL sched classes.  Make it non-modular
too to avoid having to export scheduler variables to modules at
large.

Next, update all of the other users of cpufreq_update_util()
and the ->func() callback in struct update_util_data accordingly.

Suggested-by: Peter Zijlstra 
Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpufreq/Kconfig|5 
 drivers/cpufreq/cpufreq_governor.c |2 -
 drivers/cpufreq/intel_pstate.c |2 -
 include/linux/sched.h  |   12 +++---
 kernel/sched/cpufreq.c |2 -
 kernel/sched/cpufreq_schedutil.c   |   41 +++--
 kernel/sched/deadline.c|4 +--
 kernel/sched/fair.c|   11 +++--
 kernel/sched/rt.c  |4 +--
 kernel/sched/sched.h   |   31 +--
 10 files changed, 61 insertions(+), 53 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -260,7 +260,7 @@ static void dbs_irq_work(struct irq_work
 }
 
 static void dbs_update_util_handler(struct update_util_data *data, u64 time,
-   unsigned long util, unsigned long max)
+   unsigned int flags)
 {
struct cpu_dbs_info *cdbs = container_of(data, struct cpu_dbs_info, 
update_util);
struct policy_dbs_info *policy_dbs = cdbs->policy_dbs;
Index: linux-pm/drivers/cpufreq/intel_pstate.c
===
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -1329,7 +1329,7 @@ static inline void intel_pstate_adjust_b
 }
 
 static void intel_pstate_update_util(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max)
+unsigned int flags)
 {
struct cpudata *cpu = container_of(data, struct cpudata, update_util);
u64 delta_ns = time - cpu->sample.time;
Index: linux-pm/include/linux/sched.h
===
--- linux-pm.orig/include/linux/sched.h
+++ linux-pm/include/linux/sched.h
@@ -3469,15 +3469,19 @@ static inline unsigned long rlimit_max(u
return task_rlimit_max(current, limit);
 }
 
+#define SCHED_CPUFREQ_RT   (1U << 0)
+#define SCHED_CPUFREQ_DL   (1U << 1)
+
+#define SCHED_CPUFREQ_RT_DL(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
+
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
-   void (*func)(struct update_util_data *data,
-u64 time, unsigned long util, unsigned long max);
+   void (*func)(struct update_util_data *data, u64 time, unsigned int 
flags);
 };
 
 void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
-   void (*func)(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max));
+   void (*func)(struct update_util_data *data, u64 time,
+   unsigned int flags));
 void cpufreq_remove_update_util_hook(int cpu);
 #endif /* CONFIG_CPU_FREQ */
 
Index: linux-pm/kernel/sched/cpufreq.c
===
--- linux-pm.orig/kernel/sched/cpufreq.c
+++ linux-pm/kernel/sched/cpufreq.c
@@ -33,7 +33,7 @@ DEFINE_PER_CPU(struct update_util_data *
  */
 void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
void (*func)(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max))
+unsigned int flags))
 {
if (WARN_ON(!data || !func))
return;
Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===
--- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
+++ linux-pm/kernel/

[PATCH 2/2] cpufreq / sched: Check cpu_of(rq) in cpufreq_update_util()

2016-08-09 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

All of the callers of cpufreq_update_util() check whether or not
cpu_of(rq) is equal to smp_processor_id() before calling it and pass
rq_clock(rq) to it as the time argument, so rework it to take a
runqueue pointer as the argument and move the cpu_of(rq) check and
the rq_clock(rq) evaluation into it.

Signed-off-by: Rafael J. Wysocki 
---
 kernel/sched/deadline.c |3 +--
 kernel/sched/fair.c |5 +
 kernel/sched/rt.c   |3 +--
 kernel/sched/sched.h|   15 +--
 4 files changed, 12 insertions(+), 14 deletions(-)

Index: linux-pm/kernel/sched/deadline.c
===
--- linux-pm.orig/kernel/sched/deadline.c
+++ linux-pm/kernel/sched/deadline.c
@@ -733,8 +733,7 @@ static void update_curr_dl(struct rq *rq
}
 
/* kick cpufreq (see the comment in kernel/sched/sched.h). */
-   if (cpu_of(rq) == smp_processor_id())
-   cpufreq_update_util(rq_clock(rq), SCHED_CPUFREQ_DL);
+   cpufreq_update_util(rq, SCHED_CPUFREQ_DL);
 
schedstat_set(curr->se.statistics.exec_max,
  max(curr->se.statistics.exec_max, delta_exec));
Index: linux-pm/kernel/sched/fair.c
===
--- linux-pm.orig/kernel/sched/fair.c
+++ linux-pm/kernel/sched/fair.c
@@ -2876,8 +2876,6 @@ static inline void update_tg_load_avg(st
 static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
 {
if (&this_rq()->cfs == cfs_rq) {
-   struct rq *rq = rq_of(cfs_rq);
-
/*
 * There are a few boundary cases this might miss but it should
 * get called often enough that that should (hopefully) not be
@@ -2894,8 +2892,7 @@ static inline void cfs_rq_util_change(st
 *
 * See cpu_util().
 */
-   if (cpu_of(rq) == smp_processor_id())
-   cpufreq_update_util(rq_clock(rq), 0);
+   cpufreq_update_util(rq_of(cfs_rq), 0);
}
 }
 
Index: linux-pm/kernel/sched/rt.c
===
--- linux-pm.orig/kernel/sched/rt.c
+++ linux-pm/kernel/sched/rt.c
@@ -958,8 +958,7 @@ static void update_curr_rt(struct rq *rq
return;
 
/* Kick cpufreq (see the comment in kernel/sched/sched.h). */
-   if (cpu_of(rq) == smp_processor_id())
-   cpufreq_update_util(rq_clock(rq), SCHED_CPUFREQ_RT);
+   cpufreq_update_util(rq, SCHED_CPUFREQ_RT);
 
schedstat_set(curr->se.statistics.exec_max,
  max(curr->se.statistics.exec_max, delta_exec));
Index: linux-pm/kernel/sched/sched.h
===
--- linux-pm.orig/kernel/sched/sched.h
+++ linux-pm/kernel/sched/sched.h
@@ -1763,11 +1763,11 @@ DECLARE_PER_CPU(struct update_util_data
 
 /**
  * cpufreq_update_util - Take a note about CPU utilization changes.
- * @time: Current time.
+ * @rq: Runqueue to carry out the update for.
  * @flags: Update reason flags.
  *
- * This function is called by the scheduler on the CPU whose utilization is
- * being updated.
+ * This function is called by the scheduler to invoke the CPU frequency
+ * governor.
  *
  * It can only be called from RCU-sched read-side critical sections.
  *
@@ -1783,16 +1783,19 @@ DECLARE_PER_CPU(struct update_util_data
  * but that really is a band-aid.  Going forward it should be replaced with
  * solutions targeted more specifically at RT and DL tasks.
  */
-static inline void cpufreq_update_util(u64 time, unsigned int flags)
+static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
 {
struct update_util_data *data;
 
+   if (cpu_of(rq) != smp_processor_id())
+   return;
+
data = rcu_dereference_sched(*this_cpu_ptr(&cpufreq_update_util_data));
if (data)
-   data->func(data, time, flags);
+   data->func(data, rq_clock(rq), flags);
 }
 #else
-static inline void cpufreq_update_util(u64 time, unsigned int flags) {}
+static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
 #endif /* CONFIG_CPU_FREQ */
 
 #ifdef arch_scale_freq_capacity
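
The two patches in this series can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel code: struct rq, the sketch_this_cpu variable (standing in for smp_processor_id()), the sketch_hook pointer (standing in for the per-CPU cpufreq_update_util_data), and the sketch_governor type are all simplified or hypothetical. Only the flag values and the callback signature follow the diffs above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCHED_CPUFREQ_RT	(1U << 0)
#define SCHED_CPUFREQ_DL	(1U << 1)
#define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)

/* Simplified stand-ins for kernel types; not the real definitions. */
struct rq {
	int cpu;
	uint64_t clock;
};

struct update_util_data {
	void (*func)(struct update_util_data *data, uint64_t time,
		     unsigned int flags);
};

static int sketch_this_cpu;			/* models smp_processor_id() */
static struct update_util_data *sketch_hook;	/* models the per-CPU hook */

/* The CPU check and the clock read now live in the callee. */
static void cpufreq_update_util(struct rq *rq, unsigned int flags)
{
	if (rq->cpu != sketch_this_cpu)
		return;
	if (sketch_hook)
		sketch_hook->func(sketch_hook, rq->clock, flags);
}

/* Hypothetical governor that only counts what it is told. */
struct sketch_governor {
	struct update_util_data util;
	unsigned int calls;
	unsigned int rt_dl_calls;
	uint64_t last_time;
};

static void sketch_governor_update(struct update_util_data *data,
				   uint64_t time, unsigned int flags)
{
	/* Open-coded container_of(). */
	struct sketch_governor *gov = (struct sketch_governor *)
		((char *)data - offsetof(struct sketch_governor, util));

	gov->calls++;
	gov->last_time = time;
	if (flags & SCHED_CPUFREQ_RT_DL)
		gov->rt_dl_calls++;
}

void cpufreq_update_util_example(void)
{
	struct sketch_governor gov = { .util = { .func = sketch_governor_update } };
	struct rq local = { .cpu = 0, .clock = 1000 };
	struct rq remote = { .cpu = 1, .clock = 2000 };

	sketch_this_cpu = 0;
	sketch_hook = &gov.util;

	cpufreq_update_util(&remote, SCHED_CPUFREQ_RT);	/* other CPU: ignored */
	assert(gov.calls == 0);

	cpufreq_update_util(&local, SCHED_CPUFREQ_DL);
	assert(gov.calls == 1 && gov.rt_dl_calls == 1 && gov.last_time == 1000);

	cpufreq_update_util(&local, 0);			/* CFS update: no flags */
	assert(gov.calls == 2 && gov.rt_dl_calls == 1);

	sketch_hook = NULL;
}
```

Callers shrink to a single line because they no longer repeat the cpu_of(rq) comparison, and the callback learns why it was invoked from the flags instead of a sentinel utilization value.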

[PATCH v5 2/2] Add support for SCT Write Same

2016-08-09 Thread Shaun Tancheff

SATA drives may support write same via SCT. This is useful
for setting the drive contents to a specific pattern (0's).

Translate a SCSI WRITE SAME command to be either a DSM TRIM command or
an SCT Write Same command.

Based on the UNMAP flag:
  - When set translate to DSM TRIM
  - When not set translate to SCT Write Same

Signed-off-by: Shaun Tancheff 
---
v5:
 - Addressed review comments
 - Report support for ZBC only for zoned devices.
 - kmap page during rewrite
 - Fix unmap set to require trim or error, if not unmap then sct write
   same or error.
v4:
 - Added partial MAINTENANCE_IN opcode simulation
 - Dropped all changes in drivers/scsi/*
 - Changed to honor the UNMAP flag -> TRIM, no UNMAP -> SCT.
v3:
 - Demux UNMAP/TRIM from WRITE SAME
v2:
 - Remove fugly ata hacking from sd.c
---
 drivers/ata/libata-scsi.c | 189 +++---
 include/linux/ata.h   |  43 +++
 2 files changed, 205 insertions(+), 27 deletions(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a71067a..99b0e6c 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1159,8 +1159,6 @@ static void ata_scsi_sdev_config(struct scsi_device *sdev)
 {
sdev->use_10_for_rw = 1;
sdev->use_10_for_ms = 1;
-   sdev->no_report_opcodes = 1;
-   sdev->no_write_same = 1;
 
/* Schedule policy is determined by ->qc_defer() callback and
 * it needs to see every deferred qc.  Set dev_blocked to 1 to
@@ -3325,6 +3323,41 @@ static unsigned int ata_format_dsm_trim_descr(struct 
scatterlist *sg, u32 num,
return used_bytes;
 }
 
+/**
+ * ata_format_sct_write_same() - SATL Write Same to ATA SCT Write Same
+ * @sg: Scatter / Gather list attached to command.
+ * @lba: Starting sector
+ * @num: Number of bytes to be zero'd.
+ *
+ * Rewrite the WRITE SAME descriptor to be an SCT Write Same formatted
+ * descriptor.
+ * NOTE: Writes a pattern (0's) in the foreground.
+ *   Large write-same requests can timeout.
+ */
+static void ata_format_sct_write_same(struct scatterlist *sg, u64 lba, u64 num)
+{
+   void *ptr = kmap_atomic(sg_page(sg));
+   u16 *sctpg = ptr + sg->offset;
+
+   put_unaligned_le16(0x0002,  &sctpg[0]); /* SCT_ACT_WRITE_SAME */
+   put_unaligned_le16(0x0101,  &sctpg[1]); /* WRITE PTRN FG */
+   put_unaligned_le64(lba, &sctpg[2]);
+   put_unaligned_le64(num, &sctpg[6]);
+   put_unaligned_le32(0u,  &sctpg[10]);
+
+   kunmap_atomic(ptr);
+}
+
+/**
+ * ata_scsi_write_same_xlat() - SATL Write Same to ATA SCT Write Same
+ * @qc: Command to be translated
+ *
+ * Translate a SCSI WRITE SAME command to be either a DSM TRIM command or
+ * an SCT Write Same command.
+ * Based on whether WRITE SAME has the UNMAP flag set:
+ *   When set translate to DSM TRIM
+ *   When clear translate to SCT Write Same
+ */
 static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
 {
struct ata_taskfile *tf = &qc->tf;
@@ -3338,6 +3371,7 @@ static unsigned int ata_scsi_write_same_xlat(struct 
ata_queued_cmd *qc)
u32 size;
u16 fp;
u8 bp = 0xff;
+   u8 unmap = cdb[1] & 0x8;
 
/* we may not issue DMA commands if no DMA mode is set */
if (unlikely(!dev->dma_mode))
@@ -3350,10 +3384,23 @@ static unsigned int ata_scsi_write_same_xlat(struct 
ata_queued_cmd *qc)
scsi_16_lba_len(cdb, &block, &n_block);
 
/* for now we only support WRITE SAME with the unmap bit set */
-   if (unlikely(!(cdb[1] & 0x8))) {
-   fp = 1;
-   bp = 3;
-   goto invalid_fld;
+   if (unmap) {
+   if ((dev->horkage & ATA_HORKAGE_NOTRIM) ||
+   !ata_id_has_trim(dev->id)) {
+   fp = 1;
+   bp = 3;
+   goto invalid_fld;
+   }
+   if (n_block > 0x * trmax) {
+   fp = 2;
+   goto invalid_fld;
+   }
+   } else {
+   if (!ata_id_sct_write_same(dev->id)) {
+   fp = 1;
+   bp = 3;
+   goto invalid_fld;
+   }
}
 
/*
@@ -3364,30 +3411,42 @@ static unsigned int ata_scsi_write_same_xlat(struct 
ata_queued_cmd *qc)
goto invalid_param_len;
 
sg = scsi_sglist(scmd);
-   if (n_block <= 0x * cmax) {
+   if (unmap) {
size = ata_format_dsm_trim_descr(sg, trmax, block, n_block);
+   if (ata_ncq_enabled(dev) && ata_fpdma_dsm_supported(dev)) {
+   /* Newer devices support queued TRIM commands */
+   tf->protocol = ATA_PROT_NCQ;
+   tf->command = ATA_CMD_FPDMA_SEND;
+   tf->hob_nsect = ATA_SUBCMD_FPDMA_SEND_DSM & 0x1f;
+   tf->nsect = qc->tag << 3;
+   tf->hob_feature = (size / 512) >> 8;
+

[PATCH v5 1/2] Use kmap_atomic when rewriting attached page

2016-08-09 Thread Shaun Tancheff

The current SATL for WRITE_SAME does not protect against misaligned
pages. Additionally, the associated page should also be kmap'd while
being modified.

Signed-off-by: Shaun Tancheff 
---
 v5: Added prep patch to work with non-page aligned scatterlist pages
 and use kmap_atomic() to lock page during modification.

 drivers/ata/libata-scsi.c | 53 ++-
 include/linux/ata.h   | 26 ---
 2 files changed, 48 insertions(+), 31 deletions(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index e207b33..a71067a 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -3282,16 +3282,60 @@ static unsigned int ata_scsi_pass_thru(struct 
ata_queued_cmd *qc)
return 1;
 }
 
+/**
+ * ata_format_dsm_trim_descr() - SATL Write Same to DSM Trim
+ * @sg: Scatter / Gather list attached to command.
+ * @num: Maximum number of entries (nominally 64).
+ * @sector: Starting sector
+ * @count: Total Range of request
+ *
+ * Rewrite the WRITE SAME descriptor to be a DSM TRIM little-endian formatted
+ * descriptor.
+ *
+ * Up to 64 entries of the format:
+ *   63:48 Range Length
+ *   47:0  LBA
+ *
+ *  Range Length of 0 is ignored.
+ *  LBAs should be in sorted order and must not overlap.
+ *
+ * NOTE: this is the same format as ADD LBA(S) TO NV CACHE PINNED SET
+ */
+static unsigned int ata_format_dsm_trim_descr(struct scatterlist *sg, u32 num,
+ u64 sector, u32 count)
+{
+   void *ptr = kmap_atomic(sg_page(sg));
+   __le64 *buffer = ptr + sg->offset;
+   u32 i = 0, used_bytes;
+
+   while (i < num) {
+   u64 entry = sector |
+   ((u64)(count > 0x ? 0x : count) << 48);
+   buffer[i++] = __cpu_to_le64(entry);
+   if (count <= 0x)
+   break;
+   count -= 0x;
+   sector += 0x;
+   }
+
+   used_bytes = ALIGN(i * 8, 512);
+   memset(buffer + i, 0, used_bytes - i * 8);
+
+   kunmap_atomic(ptr);
+   return used_bytes;
+}
+
 static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
 {
struct ata_taskfile *tf = &qc->tf;
struct scsi_cmnd *scmd = qc->scsicmd;
struct ata_device *dev = qc->dev;
const u8 *cdb = scmd->cmnd;
+   struct scatterlist *sg;
u64 block;
u32 n_block;
+   const u32 trmax = ATA_MAX_TRIM_RNUM;
u32 size;
-   void *buf;
u16 fp;
u8 bp = 0xff;
 
@@ -3319,10 +3363,9 @@ static unsigned int ata_scsi_write_same_xlat(struct 
ata_queued_cmd *qc)
if (!scsi_sg_count(scmd))
goto invalid_param_len;
 
-   buf = page_address(sg_page(scsi_sglist(scmd)));
-
-   if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
-   size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, 
n_block);
+   sg = scsi_sglist(scmd);
+   if (n_block <= 0x * cmax) {
+   size = ata_format_dsm_trim_descr(sg, trmax, block, n_block);
} else {
fp = 2;
goto invalid_fld;
diff --git a/include/linux/ata.h b/include/linux/ata.h
index adbc812..45a1d71 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -1071,32 +1071,6 @@ static inline void ata_id_to_hd_driveid(u16 *id)
 #endif
 }
 
-/*
- * Write LBA Range Entries to the buffer that will cover the extent from
- * sector to sector + count.  This is used for TRIM and for ADD LBA(S)
- * TO NV CACHE PINNED SET.
- */
-static inline unsigned ata_set_lba_range_entries(void *_buffer,
-   unsigned num, u64 sector, unsigned long count)
-{
-   __le64 *buffer = _buffer;
-   unsigned i = 0, used_bytes;
-
-   while (i < num) {
-   u64 entry = sector |
-   ((u64)(count > 0x ? 0x : count) << 48);
-   buffer[i++] = __cpu_to_le64(entry);
-   if (count <= 0x)
-   break;
-   count -= 0x;
-   sector += 0x;
-   }
-
-   used_bytes = ALIGN(i * 8, 512);
-   memset(buffer + i, 0, used_bytes - i * 8);
-   return used_bytes;
-}
-
 static inline bool ata_ok(u8 status)
 {
return ((status & (ATA_BUSY | ATA_DRDY | ATA_DF | ATA_DRQ | ATA_ERR))
-- 
2.8.1
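
The range-entry layout described in the kerneldoc (bits 63:48 range length, bits 47:0 LBA, little-endian, padded to a 512-byte block) can be tried out in user space with a sketch like the one below. It assumes a per-entry limit of 0xffff sectors, which is what a 16-bit length field allows and matches the 65535 used by the removed code; format_trim_ranges(), put_le64(), and the macro names are hypothetical, and the kmap/scatterlist handling of the real patch is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TRIM_MAX_ENTRIES	64u	/* "nominally 64" per the kerneldoc */
#define TRIM_MAX_RANGE		0xffffu	/* largest 16-bit range length */

/* Store a 64-bit value in little-endian byte order. */
static void put_le64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Fill buf with range entries covering [sector, sector + count).
 * Returns the number of bytes used, rounded up to 512.
 * Assumes sector fits in 48 bits and buf holds the rounded size.
 */
static size_t format_trim_ranges(uint8_t *buf, uint32_t max_entries,
				 uint64_t sector, uint32_t count)
{
	uint32_t i = 0;
	size_t used;

	while (i < max_entries) {
		uint32_t len = count > TRIM_MAX_RANGE ? TRIM_MAX_RANGE : count;

		put_le64(buf + 8 * (size_t)i, sector | ((uint64_t)len << 48));
		i++;
		if (count <= TRIM_MAX_RANGE)
			break;
		count -= TRIM_MAX_RANGE;
		sector += TRIM_MAX_RANGE;
	}

	used = ((size_t)8 * i + 511) & ~(size_t)511;
	memset(buf + 8 * (size_t)i, 0, used - 8 * (size_t)i);
	return used;
}

/* Usage: 0x10005 sectors from LBA 0x1000 need two entries. */
void format_trim_ranges_example(void)
{
	uint8_t buf[512];
	size_t used;

	memset(buf, 0xaa, sizeof(buf));
	used = format_trim_ranges(buf, TRIM_MAX_ENTRIES, 0x1000, 0x10005);

	assert(used == 512);
	/* entry 0: LBA 0x1000, length 0xffff */
	assert(buf[0] == 0x00 && buf[1] == 0x10 && buf[6] == 0xff && buf[7] == 0xff);
	/* entry 1: LBA 0x10fff, length 6 */
	assert(buf[8] == 0xff && buf[9] == 0x0f && buf[10] == 0x01);
	assert(buf[14] == 0x06 && buf[15] == 0x00);
	/* remainder of the block is zero padding */
	assert(buf[16] == 0x00 && buf[511] == 0x00);
}
```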

[PATCH v5 0/2] Add support for SCT Write Same

2016-08-09 Thread Shaun Tancheff

At some point the method of issuing Write Same for ATA drives changed.
Currently write same is commonly available via SCT so expose the SCT
capabilities and use SCT Write Same when it is available.

This is useful for zoned based media that prefers to support discard
with lbprz set, aka discard zeroes data by mapping discard operations to
reset write pointer operations. Conventional zones that do not support
reset write pointer can still honor the discard zeroes data by issuing
a write same over the zone.

It may also be nice to know if various controllers that currently
disable WRITE SAME will work with the SCT Write Same code path:
  aacraid, arcmsr, megaraid, 3w-9xxx, 3w-sas, 3w-, gdth, hpsa, ips,
  megaraid, pmcraid, storvsc_drv

This patch against v4.8-rc1 is also at 

https://github.com/stancheff/linux/tree/v4.8-rc1+ws.v5

g...@github.com:stancheff/linux.git v4.8-rc1+ws.v5

Shaun Tancheff (2):
  Use kmap_atomic when rewriting attached page
  Add support for SCT Write Same

 drivers/ata/libata-scsi.c | 240 --
 include/linux/ata.h   |  69 -
 2 files changed, 252 insertions(+), 57 deletions(-)

-- 
2.8.1
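
As a rough illustration of the translation rule in this series (UNMAP set selects DSM TRIM, UNMAP clear selects SCT Write Same, and a missing capability rejects the command), here is a small user-space sketch. The UNMAP bit position (byte 1, mask 0x8) is the one tested in the patch; struct dev_caps, enum ws_xlat, and write_same_xlat() are hypothetical names that stand in for the libata device state and return paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WRITE_SAME_UNMAP	0x08u	/* cdb[1] bit tested by the patch */

/* Hypothetical outcome of the translation decision. */
enum ws_xlat {
	WS_XLAT_DSM_TRIM,
	WS_XLAT_SCT_WRITE_SAME,
	WS_XLAT_INVALID_FIELD,
};

/* Hypothetical summary of the device capabilities that matter here. */
struct dev_caps {
	bool has_trim;			/* models ata_id_has_trim() */
	bool notrim_quirk;		/* models ATA_HORKAGE_NOTRIM */
	bool has_sct_write_same;	/* models ata_id_sct_write_same() */
};

static enum ws_xlat write_same_xlat(const uint8_t *cdb,
				    const struct dev_caps *caps)
{
	if (cdb[1] & WRITE_SAME_UNMAP) {
		if (caps->notrim_quirk || !caps->has_trim)
			return WS_XLAT_INVALID_FIELD;
		return WS_XLAT_DSM_TRIM;
	}
	return caps->has_sct_write_same ? WS_XLAT_SCT_WRITE_SAME
					: WS_XLAT_INVALID_FIELD;
}

/* Usage: the same drive handles both forms of WRITE SAME(16). */
void write_same_xlat_example(void)
{
	const struct dev_caps caps = { true, false, true };
	const uint8_t unmap_cdb[16] = { 0x93, WRITE_SAME_UNMAP };
	const uint8_t plain_cdb[16] = { 0x93, 0x00 };

	assert(write_same_xlat(unmap_cdb, &caps) == WS_XLAT_DSM_TRIM);
	assert(write_same_xlat(plain_cdb, &caps) == WS_XLAT_SCT_WRITE_SAME);
}
```

There is deliberately no fallback from one path to the other: an UNMAP request on a drive without TRIM is rejected rather than silently turned into a foreground pattern write, which matches the v5 changelog entry about requiring "trim or error".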

Re: [PATCH v2 3/3] powerpc: Convert fsl_rstcr_restart to a reset handler

2016-08-09 Thread Nicholas Piggin

On Tue, 9 Aug 2016 11:47:37 -0700
Andrey Smirnov  wrote:

> On Sun, Jul 31, 2016 at 9:03 PM, Nicholas Piggin  wrote:
> > On Thu, 28 Jul 2016 16:07:18 -0700
> > Andrey Smirnov  wrote:
> >  
> >> Convert fsl_rstcr_restart into a function to be registered with
> >> register_reset_handler().
> >>
> >> Signed-off-by: Andrey Smirnov 
> >> ---
> >>
> >> Changes since v1:
> >>
> >>   - fsl_rstcr_restart is registered as a reset handler in
> >>   setup_rstcr, replacing per-board arch_initcall approach  
> >
> > Bear in mind I don't know much about the embedded or platform code!
> >
> > The documentation for reset notifiers says that they are expected
> > to be registered from drivers, not arch code. That seems to only be
> > intended to mean that the standard ISA or platform reset would
> > normally be handled directly by the arch, whereas if you have an
> > arch specific driver for a reset hardware that just happens to live
> > under arch/, then fsl_rstcr_restart / mpc85xx_cds_restart would be
> > valid use of reset notifier.  
> 
> Yeah, IMHO there's quite a bit of code in sysdev/ which in ideal world
> would go into drivers/ and I think fsl_rstcr_restart is among it
> (similar example on MIPS is drivers/power/reset/brcmstb-reboot.c).
> 
> >
> > So this change seems reasonable to me. One small question:
> >
> >  
> >> +static int mpc85xx_cds_restart_register(void)
> >> +{
> >> + static struct notifier_block restart_handler;
> >> +
> >> + restart_handler.notifier_call = mpc85xx_cds_restart;
> >> + restart_handler.priority = 192;  
> >
> > Should there be a header with #define's for these priorities?  
> 
> I don't have any strong preference either way, I do however think that
> introducing such #define should go into a separate patch-set, since
> you'd probably want to propagate that change across all of the users
> of the API.

You're probably right. I was thinking because powerpc has not used it
before we could use local defines, but it really does need a global
location.

Thanks,
Nick
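
To make the priority question concrete, here is a small user-space model of a priority-ordered handler chain with named priorities instead of a bare 192. The RESTART_PRIO_* names are hypothetical (nothing in this thread defines them), and the values 0, 128, and 255 follow the convention documented for register_restart_handler() as far as I recall it; the list code only mimics how a notifier chain keeps higher priorities first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the priorities discussed in this thread. */
#define RESTART_PRIO_LAST_RESORT	0
#define RESTART_PRIO_DEFAULT		128
#define RESTART_PRIO_HIGH		192
#define RESTART_PRIO_HIGHEST		255

/* Simplified stand-in for struct notifier_block. */
struct restart_handler {
	int priority;
	const char *name;
	struct restart_handler *next;
};

/* Insert so that higher priorities come first; ties keep arrival order. */
static void restart_chain_add(struct restart_handler **head,
			      struct restart_handler *h)
{
	while (*head && (*head)->priority >= h->priority)
		head = &(*head)->next;
	h->next = *head;
	*head = h;
}

void restart_chain_example(void)
{
	struct restart_handler *chain = NULL;
	struct restart_handler fallback = { RESTART_PRIO_LAST_RESORT, "fallback", NULL };
	struct restart_handler generic = { RESTART_PRIO_DEFAULT, "generic", NULL };
	struct restart_handler rstcr = { RESTART_PRIO_HIGH, "fsl_rstcr", NULL };

	restart_chain_add(&chain, &generic);
	restart_chain_add(&chain, &fallback);
	restart_chain_add(&chain, &rstcr);

	/* Registration order does not matter; priority decides. */
	assert(chain == &rstcr);
	assert(chain->next == &generic);
	assert(chain->next->next == &fallback);
	assert(fallback.next == NULL);
}
```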

[PATCH] mm, hugetlb: switch hugetlbfs to multi-order radix-tree entries

2016-08-09 Thread Naoya Horiguchi

Hi Kirill,

I wrote a patch to switch hugetlbfs to the multi-order radix tree.
Hopefully it can be queued in your series.

Thanks,
Naoya Horiguchi
---
From: Naoya Horiguchi 
Date: Wed, 10 Aug 2016 09:49:09 +0900
Subject: [PATCH] mm, hugetlb: switch hugetlbfs to multi-order radix-tree
 entries

Currently, hugetlb pages are linked to the page cache on the basis of hugepage
offset (derived from vma_hugecache_offset()) for historical reasons, which
doesn't match the generic usage of the page cache and requires some routines
to convert page offset <=> hugepage offset in common paths. This patch
adjusts the code for the multi-order radix-tree to avoid that situation.

The main change is in the behavior of page->index for hugetlbfs. Before this
patch, it represented the hugepage offset, but with this patch it represents
the page offset. So index-related code has to be updated.
Note that hugetlb_fault_mutex_hash() and reservation region handling
still work with the hugepage offset.
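
The difference between the two index conventions can be checked with a few lines of user-space arithmetic. The sketch assumes 4 KiB base pages and 2 MiB huge pages; the SKETCH_* macros and function names are hypothetical, and the real code takes the shifts from the hstate.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12u	/* assumption: 4 KiB base pages */
#define SKETCH_HPAGE_SHIFT	21u	/* assumption: 2 MiB huge pages */
#define SKETCH_HPAGE_ORDER	(SKETCH_HPAGE_SHIFT - SKETCH_PAGE_SHIFT)

/* Before the patch: page->index counts huge pages. */
static uint64_t hugepage_index(uint64_t file_offset)
{
	return file_offset >> SKETCH_HPAGE_SHIFT;
}

/* After the patch: page->index counts base pages, like other files. */
static uint64_t page_index(uint64_t file_offset)
{
	return file_offset >> SKETCH_PAGE_SHIFT;
}

/* Conversions still needed where hugepage offsets remain in use. */
static uint64_t hugepage_to_page_index(uint64_t hidx)
{
	return hidx << SKETCH_HPAGE_ORDER;
}

static uint64_t page_to_hugepage_index(uint64_t pidx)
{
	return pidx >> SKETCH_HPAGE_ORDER;
}

/* Usage: the fourth huge page of a file starts at byte offset 6 MiB. */
void hugetlb_index_example(void)
{
	const uint64_t off = 3ull << SKETCH_HPAGE_SHIFT;

	assert(hugepage_index(off) == 3);
	assert(page_index(off) == 1536);	/* 3 * 512 base pages */
	assert(hugepage_to_page_index(3) == page_index(off));
	assert(page_to_hugepage_index(1536 + 511) == 3);
}
```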

Signed-off-by: Naoya Horiguchi 
---
 fs/hugetlbfs/inode.c| 22 ++
 include/linux/pagemap.h | 10 +-
 mm/filemap.c| 26 +++---
 mm/hugetlb.c| 19 ++-
 4 files changed, 32 insertions(+), 45 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 4ea71eba40a5..fc918c0e33e9 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -388,8 +388,8 @@ static void remove_inode_hugepages(struct inode *inode, 
loff_t lstart,
 {
struct hstate *h = hstate_inode(inode);
struct address_space *mapping = &inode->i_data;
-   const pgoff_t start = lstart >> huge_page_shift(h);
-   const pgoff_t end = lend >> huge_page_shift(h);
+   const pgoff_t start = lstart >> PAGE_SHIFT;
+   const pgoff_t end = lend >> PAGE_SHIFT;
struct vm_area_struct pseudo_vma;
struct pagevec pvec;
pgoff_t next;
@@ -447,8 +447,7 @@ static void remove_inode_hugepages(struct inode *inode, 
loff_t lstart,
 
i_mmap_lock_write(mapping);
hugetlb_vmdelete_list(&mapping->i_mmap,
-   next * pages_per_huge_page(h),
-   (next + 1) * pages_per_huge_page(h));
+   next, next + 1);
i_mmap_unlock_write(mapping);
}
 
@@ -467,7 +466,8 @@ static void remove_inode_hugepages(struct inode *inode, 
loff_t lstart,
freed++;
if (!truncate_op) {
if (unlikely(hugetlb_unreserve_pages(inode,
-   next, next + 1, 1)))
+   (next) << huge_page_order(h),
+   (next + 1) << 
huge_page_order(h), 1)))
hugetlb_fix_reserve_counts(inode,
rsv_on_error);
}
@@ -552,8 +552,6 @@ static long hugetlbfs_fallocate(struct file *file, int 
mode, loff_t offset,
struct hstate *h = hstate_inode(inode);
struct vm_area_struct pseudo_vma;
struct mm_struct *mm = current->mm;
-   loff_t hpage_size = huge_page_size(h);
-   unsigned long hpage_shift = huge_page_shift(h);
pgoff_t start, index, end;
int error;
u32 hash;
@@ -569,8 +567,8 @@ static long hugetlbfs_fallocate(struct file *file, int 
mode, loff_t offset,
 * For this range, start is rounded down and end is rounded up
 * as well as being converted to page offsets.
 */
-   start = offset >> hpage_shift;
-   end = (offset + len + hpage_size - 1) >> hpage_shift;
+   start = offset >> PAGE_SHIFT;
+   end = (offset + len + huge_page_size(h) - 1) >> PAGE_SHIFT;
 
inode_lock(inode);
 
@@ -588,7 +586,7 @@ static long hugetlbfs_fallocate(struct file *file, int 
mode, loff_t offset,
pseudo_vma.vm_flags = (VM_HUGETLB | VM_MAYSHARE | VM_SHARED);
pseudo_vma.vm_file = file;
 
-   for (index = start; index < end; index++) {
+   for (index = start; index < end; index += pages_per_huge_page(h)) {
/*
 * This is supposed to be the vaddr where the page is being
 * faulted in, but we have no vaddr here.
@@ -609,10 +607,10 @@ static long hugetlbfs_fallocate(struct file *file, int 
mode, loff_t offset,
}
 
/* Set numa allocation policy based on index */
-   hugetlb_set_vma_policy(&pseudo_vma, inode, index);
+   hugetlb_set_vma_policy(&pseudo_vma, inode, index >> 
huge_page_order(h));
 
/* addr is the offset within the file (zero based) */
-   addr = index * hpage_size;
+   addr = index << PAGE_SHIFT & ~huge_page_mask(h);
 
/*

Re: [PATCH] i2c: uniphier{-f}: don't print error when adding adapter fails

2016-08-09 Thread Masahiro Yamada

2016-08-10 5:11 GMT+09:00 Wolfram Sang :
> The core will do this for us now.
>
> Signed-off-by: Wolfram Sang 
> ---
>  drivers/i2c/busses/i2c-uniphier-f.c | 5 -
>  drivers/i2c/busses/i2c-uniphier.c   | 5 -
>  2 files changed, 10 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-uniphier-f.c 
> b/drivers/i2c/busses/i2c-uniphier-f.c
> index aeead0d27d1007..35608531fe070d 100644
> --- a/drivers/i2c/busses/i2c-uniphier-f.c
> +++ b/drivers/i2c/busses/i2c-uniphier-f.c
> @@ -550,11 +550,6 @@ static int uniphier_fi2c_probe(struct platform_device 
> *pdev)
> }
>
> ret = i2c_add_adapter(&priv->adap);
> -   if (ret) {
> -   dev_err(dev, "failed to add I2C adapter\n");
> -   goto err;
> -   }
> -
>  err:
> if (ret)
> clk_disable_unprepare(priv->clk);
> diff --git a/drivers/i2c/busses/i2c-uniphier.c 
> b/drivers/i2c/busses/i2c-uniphier.c
> index 475a5eb514e215..d6e612a0e02a9d 100644
> --- a/drivers/i2c/busses/i2c-uniphier.c
> +++ b/drivers/i2c/busses/i2c-uniphier.c
> @@ -407,11 +407,6 @@ static int uniphier_i2c_probe(struct platform_device 
> *pdev)
> }
>
> ret = i2c_add_adapter(&priv->adap);
> -   if (ret) {
> -   dev_err(dev, "failed to add I2C adapter\n");
> -   goto err;
> -   }
> -
>  err:
> if (ret)
> clk_disable_unprepare(priv->clk);


This version looks good to me.  :)


Acked-by: Masahiro Yamada 


(Please make sure to squash this and the other big one into a single patch.)



-- 
Best Regards
Masahiro Yamada

Re: [PATCH v2 1/3] powerpc: Factor out common code in setup-common.c

2016-08-09 Thread Nicholas Piggin

On Tue, 9 Aug 2016 09:30:54 -0700
Andrey Smirnov  wrote:

> On Sun, Jul 31, 2016 at 8:40 PM, Nicholas Piggin  wrote:
> > On Thu, 28 Jul 2016 16:07:16 -0700
> > Andrey Smirnov  wrote:
> >  
> >> Factor out a small bit of common code in machine_restart(),
> >> machine_power_off() and machine_halt().
> >>
> >> Signed-off-by: Andrey Smirnov 
> >> ---
> >>
> >> No changes compared to v1.
> >>
> >>  arch/powerpc/kernel/setup-common.c | 23 ++-
> >>  1 file changed, 14 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/arch/powerpc/kernel/setup-common.c
> >> b/arch/powerpc/kernel/setup-common.c index 714b4ba..5cd3283 100644
> >> --- a/arch/powerpc/kernel/setup-common.c
> >> +++ b/arch/powerpc/kernel/setup-common.c
> >> @@ -130,15 +130,22 @@ void machine_shutdown(void)
> >>   ppc_md.machine_shutdown();
> >>  }
> >>
> >> +static void machine_hang(void)
> >> +{
> >> + pr_emerg("System Halted, OK to turn off power\n");
> >> + local_irq_disable();
> >> + while (1)
> >> + ;
> >> +}  
> >
> > What's the intended semantics of this function? A default power
> > off handler when the platform supplies none?  
> 
> I was mostly trying to avoid code duplication in
> machine_halt/machine_restart/machine_power_off and didn't intend that
> function to be used outside. The semantics is just - to hang the CPU
> when things didn't go as expected and code that was supposed to
> restart/halt/power off the machine failed.
> 
> > Would ppc_power_off()
> > be a good name?  
> 
> Calling it "power_off" seems a bit misleading, the function doesn't
> really try to do anything related to powering off, really.

Okay I don't feel too strongly against it.

Thanks,
Nick

[PATCH] soc: qcom: smd: Request irqs after parsing properties

2016-08-09 Thread Bjorn Andersson

The code executed by the interrupt handler depends on the values parsed
after requesting the irq. To be safe, we should therefore move the
request_irq() call so that it is done after parsing the properties.
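
The ordering concern can be shown with a small stand-alone model. Everything
named ex_* is hypothetical and only mimics the shape of the problem; it is not
the smd driver or the kernel irq API. The point is that a handler becomes
callable as soon as it is registered, so the state it reads must be complete
first.

```c
#include <assert.h>

/* Hypothetical stand-in for the edge state; not the smd driver's struct. */
struct ex_edge {
	unsigned int edge_id;  /* value read from a property */
	int parsed;            /* set once all properties have been read */
};

typedef int (*ex_handler_t)(void *data);

static ex_handler_t ex_registered_handler;
static void *ex_registered_data;

/* Registering makes the handler callable right away, like request_irq(). */
static void ex_request_irq(ex_handler_t handler, void *data)
{
	ex_registered_handler = handler;
	ex_registered_data = data;
}

/* Simulate the interrupt firing; returns -1 if nothing is registered. */
static int ex_fire_irq(void)
{
	if (!ex_registered_handler)
		return -1;
	return ex_registered_handler(ex_registered_data);
}

/* The handler relies on the parsed values already being in place. */
static int ex_edge_intr(void *data)
{
	struct ex_edge *edge = data;

	assert(edge->parsed);
	return (int)edge->edge_id;
}

/* Parse first, request the interrupt last, as the patch does. */
static void ex_parse_edge(struct ex_edge *edge, unsigned int edge_id)
{
	edge->edge_id = edge_id;
	edge->parsed = 1;
	ex_request_irq(ex_edge_intr, edge);
}
```

If ex_request_irq() were called before the two assignments, an interrupt
arriving in between would trip the assertion in the handler.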

Signed-off-by: Bjorn Andersson 
---
 drivers/soc/qcom/smd.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/soc/qcom/smd.c b/drivers/soc/qcom/smd.c
index 63e72eb9baa7..679f7778a4e3 100644
--- a/drivers/soc/qcom/smd.c
+++ b/drivers/soc/qcom/smd.c
@@ -1348,22 +1348,6 @@ static int qcom_smd_parse_edge(struct device *dev,
 
edge->of_node = of_node_get(node);
 
-   irq = irq_of_parse_and_map(node, 0);
-   if (irq < 0) {
-   dev_err(dev, "required smd interrupt missing\n");
-   return -EINVAL;
-   }
-
-   ret = devm_request_irq(dev, irq,
-  qcom_smd_edge_intr, IRQF_TRIGGER_RISING,
-  node->name, edge);
-   if (ret) {
-   dev_err(dev, "failed to request smd irq\n");
-   return ret;
-   }
-
-   edge->irq = irq;
-
key = "qcom,smd-edge";
ret = of_property_read_u32(node, key, &edge->edge_id);
if (ret) {
@@ -1398,6 +1382,22 @@ static int qcom_smd_parse_edge(struct device *dev,
return -EINVAL;
}
 
+   irq = irq_of_parse_and_map(node, 0);
+   if (irq < 0) {
+   dev_err(dev, "required smd interrupt missing\n");
+   return -EINVAL;
+   }
+
+   ret = devm_request_irq(dev, irq,
+  qcom_smd_edge_intr, IRQF_TRIGGER_RISING,
+  node->name, edge);
+   if (ret) {
+   dev_err(dev, "failed to request smd irq\n");
+   return ret;
+   }
+
+   edge->irq = irq;
+
return 0;
 }
 
-- 
2.5.0

[PATCH] soc: qcom: smd: Simplify multi channel handling

2016-08-09 Thread Bjorn Andersson

Multi-channel clients split between several drivers need a way to close
individual channels, as these drivers might be removed individually.
With this in place, the responsibility for closing additionally opened
channels moves to the client as well, leaving smd concerned only with the
primary channel.

With this approach we will only trigger removal of SMD devices based on
the state of the primary channel; however, this gets us in sync with how
rpmsg works.
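
A toy model of the ownership split described above, using hypothetical ex_*
names rather than the real qcom_smd types or functions: the client closes the
channel it opened itself, and the core closes only the primary channel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the qcom_smd types. */
enum ex_state { EX_CLOSED, EX_OPENED };

struct ex_channel {
	enum ex_state state;
};

struct ex_client {
	struct ex_channel *primary;  /* owned by the core */
	struct ex_channel *extra;    /* opened by the client, may be NULL */
};

static void ex_channel_close(struct ex_channel *ch)
{
	ch->state = EX_CLOSED;
}

/* Client side: open one additional channel and remember it. */
static void ex_client_open_extra(struct ex_client *c, struct ex_channel *ch)
{
	ch->state = EX_OPENED;
	c->extra = ch;
}

/* Client remove callback: close whatever the client opened itself. */
static void ex_client_remove(struct ex_client *c)
{
	if (c->extra) {
		ex_channel_close(c->extra);
		c->extra = NULL;
	}
}

/* Core remove path: run the client callback, then close only the primary. */
static void ex_core_remove(struct ex_client *c)
{
	assert(c->primary != NULL);
	ex_client_remove(c);
	ex_channel_close(c->primary);
}
```

In this model the core no longer needs a per-device list of extra channels,
which mirrors the removal of dev_list in the diff.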

Signed-off-by: Bjorn Andersson 
---

We do not have any drivers in the tree that will suffer from the loss of this.

 drivers/soc/qcom/smd.c   | 34 --
 include/linux/soc/qcom/smd.h |  7 +++
 2 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/drivers/soc/qcom/smd.c b/drivers/soc/qcom/smd.c
index ac1957dfdf24..63e72eb9baa7 100644
--- a/drivers/soc/qcom/smd.c
+++ b/drivers/soc/qcom/smd.c
@@ -197,7 +197,6 @@ struct qcom_smd_channel {
void *drvdata;
 
struct list_head list;
-   struct list_head dev_list;
 };
 
 /**
@@ -891,8 +890,6 @@ static int qcom_smd_dev_remove(struct device *dev)
struct qcom_smd_device *qsdev = to_smd_device(dev);
struct qcom_smd_driver *qsdrv = to_smd_driver(dev);
struct qcom_smd_channel *channel = qsdev->channel;
-   struct qcom_smd_channel *tmp;
-   struct qcom_smd_channel *ch;
 
qcom_smd_channel_set_state(channel, SMD_CHANNEL_CLOSING);
 
@@ -911,15 +908,9 @@ static int qcom_smd_dev_remove(struct device *dev)
if (qsdrv->remove)
qsdrv->remove(qsdev);
 
-   /*
-* The client is now gone, close and release all channels associated
-* with this sdev
-*/
-   list_for_each_entry_safe(ch, tmp, &channel->dev_list, dev_list) {
-   qcom_smd_channel_close(ch);
-   list_del(&ch->dev_list);
-   ch->qsdev = NULL;
-   }
+   /* The client is now gone, close the primary channel */
+   qcom_smd_channel_close(channel);
+   channel->qsdev = NULL;
 
return 0;
 }
@@ -1091,6 +1082,8 @@ qcom_smd_find_channel(struct qcom_smd_edge *edge, const 
char *name)
  *
  * Returns a channel handle on success, or -EPROBE_DEFER if the channel isn't
  * ready.
+ *
+ * Any channels returned must be closed with a call to qcom_smd_close_channel()
  */
 struct qcom_smd_channel *qcom_smd_open_channel(struct qcom_smd_channel *parent,
   const char *name,
@@ -1120,15 +1113,21 @@ struct qcom_smd_channel *qcom_smd_open_channel(struct 
qcom_smd_channel *parent,
return ERR_PTR(ret);
}
 
-   /*
-* Append the list of channel to the channels associated with the sdev
-*/
-   list_add_tail(&channel->dev_list, &sdev->channel->dev_list);
-
return channel;
 }
 EXPORT_SYMBOL(qcom_smd_open_channel);
 
+/**
+ * qcom_smd_close_channel() - close an additionally opened channel
+ * @channel:   channel handle, returned by qcom_smd_open_channel()
+ */
+void qcom_smd_close_channel(struct qcom_smd_channel *channel)
+{
+   qcom_smd_channel_close(channel);
+   channel->qsdev = NULL;
+}
+EXPORT_SYMBOL(qcom_smd_close_channel);
+
 /*
  * Allocate the qcom_smd_channel object for a newly found smd channel,
  * retrieving and validating the smem items involved.
@@ -1150,7 +1149,6 @@ static struct qcom_smd_channel 
*qcom_smd_create_channel(struct qcom_smd_edge *ed
if (!channel)
return ERR_PTR(-ENOMEM);
 
-   INIT_LIST_HEAD(&channel->dev_list);
channel->edge = edge;
channel->name = devm_kstrdup(smd->dev, name, GFP_KERNEL);
if (!channel->name)
diff --git a/include/linux/soc/qcom/smd.h b/include/linux/soc/qcom/smd.h
index 910ce1d9ba89..324b1decfffb 100644
--- a/include/linux/soc/qcom/smd.h
+++ b/include/linux/soc/qcom/smd.h
@@ -55,6 +55,7 @@ void qcom_smd_driver_unregister(struct qcom_smd_driver *drv);
 struct qcom_smd_channel *qcom_smd_open_channel(struct qcom_smd_channel 
*channel,
   const char *name,
   qcom_smd_cb_t cb);
+void qcom_smd_close_channel(struct qcom_smd_channel *channel);
 void *qcom_smd_get_drvdata(struct qcom_smd_channel *channel);
 void qcom_smd_set_drvdata(struct qcom_smd_channel *channel, void *data);
 int qcom_smd_send(struct qcom_smd_channel *channel, const void *data, int len);
@@ -83,6 +84,12 @@ qcom_smd_open_channel(struct qcom_smd_channel *channel,
return NULL;
 }
 
+static inline void qcom_smd_close_channel(struct qcom_smd_channel *channel)
+{
+   /* This shouldn't be possible */
+   WARN_ON(1);
+}
+
 static inline void *qcom_smd_get_drvdata(struct qcom_smd_channel *channel)
 {
/* This shouldn't be possible */
-- 
2.5.0

Re: [v10 PATCH 0/5] Rockchip Type-C and DisplayPort driver

2016-08-09 Thread Chanwoo Choi

Hi Chris,

On 2016년 08월 10일 08:32, Chris Zhong wrote:
> Hi all
> 
> This patch series is for the Rockchip Type-C PHY and DisplayPort controller
> driver.
> 
> The USB Type-C PHY is designed to support the USB3 and DP applications.
> The PHY basically has two main components: USB3 and DisplayPort. USB3
> operates in SuperSpeed mode and the DP can operate at RBR, HBR and HBR2
> data rates. The Type-C cable orientation detection and Power Delivery
> (PD) is accomplished using a PD PHY or an external PD chip.
> 
> The DP controller is compliant with the DisplayPort Specification,
> Version 1.3. This IP is compatible with the Rockchip Type-C PHY IP.
> There is a uCPU in the DP controller, which needs firmware to work; please
> put the firmware file[0] at /lib/firmware/rockchip/dptx.bin. The uCPU is
> in charge of aux communication and link training, and the host uses a
> mailbox to communicate with the uCPU.
> 
> The DP controller registers a notification with the extcon API to get the
> alt mode from PD. The PD driver needs to call devm_extcon_dev_allocate
> to create an extcon device and use extcon_set_state to notify the DP
> controller, and call extcon_set_cable_property to set the orientation.
> 
> About the DP audio, cdn-dp registered 2 DAIs: 0 is I2S, 1 is SPDIF.
> We can reference them in simple-card.
> 
> This series is based on Mark Yao's branch[1] and Chanwoo Choi's
> extcon-next branch[2], and the clk patch[3].
> 
> I tested these patches on the rk3399-evb board, with a fusb302 driver.
> This branch has no rk3399.dtsi, so the dts patch is not included
> in this series.
> 
> From V9, the Type-C PHY is split into two PHYs: DP and USB3. The PHY
> will be initialized no matter which PHY is powered on. The DP module will
> enter A2 mode (standby mode) after phy_init; if the DP PHY is powered on,
> the DP module will enter A0 mode (running mode). Then, if the DP PHY is
> powered off, the DP module will go back to A2 mode. If everything is
> unplugged, the PHY will be deinitialized.
> 
> [0]
> https://patchwork.kernel.org/patch/9249693/
> [1]
> https://github.com/markyzq/kernel-drm-rockchip/tree/drm-rockchip-next-2016-05-23
> [2]
> https://git.kernel.org/cgit/linux/kernel/git/chanwoo/extcon.git/log/?h=extcon-test
> - extcon: Add the extcon_type to gather each connector into five category
> - extcon: Add the support for extcon property according to extcon type
> - extcon: Add the support for the capability of each property
> - extcon: Rename the extcon_set/get_state() to maintain the function naming
> pattern
> - extcon: Add the synchronization extcon APIs to support the notification
> - extcon: Add EXTCON_DISP_DP and the property for USB Type-C

The extcon patches have been merged into the extcon-next branch,
so you can check them in both the extcon git and linux-next git repos.

[snip]

Regards,
Chanwoo Choi

[PATCH] soc: qcom: smd: Correct compile stub prototypes

2016-08-09 Thread Bjorn Andersson

The prototypes for the compile stubs were not properly marked as static
inline; this patch corrects this.

Fixes: f79a917e69e1 ("Merge tag 'qcom-soc-for-4.7-2' into net-next")
Signed-off-by: Bjorn Andersson 
---
 include/linux/soc/qcom/smd.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/soc/qcom/smd.h b/include/linux/soc/qcom/smd.h
index cbb0f06c41b2..910ce1d9ba89 100644
--- a/include/linux/soc/qcom/smd.h
+++ b/include/linux/soc/qcom/smd.h
@@ -83,14 +83,14 @@ qcom_smd_open_channel(struct qcom_smd_channel *channel,
return NULL;
 }
 
-void *qcom_smd_get_drvdata(struct qcom_smd_channel *channel)
+static inline void *qcom_smd_get_drvdata(struct qcom_smd_channel *channel)
 {
/* This shouldn't be possible */
WARN_ON(1);
return NULL;
 }
 
-void qcom_smd_set_drvdata(struct qcom_smd_channel *channel, void *data)
+static inline void qcom_smd_set_drvdata(struct qcom_smd_channel *channel, void 
*data)
 {
/* This shouldn't be possible */
WARN_ON(1);
-- 
2.5.0

Re: [PACTH v1] mm, proc: Implement /proc//totmaps

2016-08-09 Thread Sonny Rao

On Tue, Aug 9, 2016 at 12:16 PM, Konstantin Khlebnikov  wrote:
>
> On Tue, Aug 9, 2016 at 7:05 PM,   wrote:
> > From: Sonny Rao 
> >
> > This is based on earlier work by Thiago Goncales. It implements a new
> > per process proc file which summarizes the contents of the smaps file
> > but doesn't display any addresses.  It gives more detailed information
> > than statm like the PSS (proportional set size).  It differs from the
> > original implementation in that it doesn't use the full blown set of
> > seq operations, uses a different termination condition, and doesn't
> > display "Locked" as that was broken on the original implementation.
> >
> > This new proc file provides information faster than parsing the potentially
> > huge smaps file.
>
> What statistics do you really need?

PSS (Proportional Set Size) and related accounting of shared pages
(swap could be shared) is where the existing summaries of memory usage
are cumbersome.

>
>
> I think, performance and flexibility issues could be really solved only by new
> syscall for querying memory statistics for address range in any process:
> process_vm_stat() or some kind of pumped fincore() for /proc/$pid/mem


That would be a good long term solution if people want similarly
complicated statistics without having to iterate through current
interfaces.
I mentioned monitoring before, but I'll add that per-process Proportional
Set Size, Unique Set Size, and Swap are also useful because they
help us make better decisions about which processes need to be
throttled or gracefully killed.
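
For readers unfamiliar with the terms, here is a minimal sketch of how RSS,
PSS and USS relate, assuming a 4 KiB page size and a per-page count of how
many processes map each resident page. The ex_* names are made up for the
example and the integer division truncates, so treat it as an illustration of
the definitions rather than the smaps accounting code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SIZE 4096u  /* assumed page size */

struct ex_mem_stats {
	uint64_t rss;  /* resident set size in bytes */
	uint64_t pss;  /* proportional set size in bytes */
	uint64_t uss;  /* unique set size in bytes */
};

/*
 * mapcounts[i] is the number of processes mapping resident page i
 * (at least 1, since the process being accounted maps it).
 */
static struct ex_mem_stats ex_account(const unsigned int *mapcounts, size_t n)
{
	struct ex_mem_stats s = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		assert(mapcounts[i] >= 1);
		s.rss += EX_PAGE_SIZE;                 /* every resident page counts fully */
		s.pss += EX_PAGE_SIZE / mapcounts[i];  /* shared pages are split evenly */
		if (mapcounts[i] == 1)
			s.uss += EX_PAGE_SIZE;         /* only private pages count */
	}
	return s;
}
```

Summing PSS over all processes gives a figure that does not double-count
shared pages, which is what makes it useful for deciding which process to
throttle or kill.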

>
> >
> > Signed-off-by: Sonny Rao 
> >
> > Tested-by: Robert Foss 
> > Signed-off-by: Robert Foss 
> >
> > ---
> >  fs/proc/base.c |   1 +
> >  fs/proc/internal.h |   4 ++
> >  fs/proc/task_mmu.c | 126 
> > +
> >  3 files changed, 131 insertions(+)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index a11eb71..de3acdf 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
> > REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
> > REG("smaps",  S_IRUGO, proc_pid_smaps_operations),
> > REG("pagemap",S_IRUSR, proc_pagemap_operations),
> > +   REG("totmaps",S_IRUGO, proc_totmaps_operations),
> >  #endif
> >  #ifdef CONFIG_SECURITY
> > DIR("attr",   S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, 
> > proc_attr_dir_operations),
> > diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> > index aa27810..6f3540f 100644
> > --- a/fs/proc/internal.h
> > +++ b/fs/proc/internal.h
> > @@ -58,6 +58,9 @@ union proc_op {
> > struct task_struct *task);
> >  };
> >
> > +
> > +extern const struct file_operations proc_totmaps_operations;
> > +
> >  struct proc_inode {
> > struct pid *pid;
> > int fd;
> > @@ -281,6 +284,7 @@ struct proc_maps_private {
> > struct mm_struct *mm;
> >  #ifdef CONFIG_MMU
> > struct vm_area_struct *tail_vma;
> > +   struct mem_size_stats *mss;
> >  #endif
> >  #ifdef CONFIG_NUMA
> > struct mempolicy *task_mempolicy;
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index 4648c7f..b61873e 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -802,6 +802,81 @@ static int show_smap(struct seq_file *m, void *v, int 
> > is_pid)
> > return 0;
> >  }
> >
> > +static void add_smaps_sum(struct mem_size_stats *mss,
> > +   struct mem_size_stats *mss_sum)
> > +{
> > +   mss_sum->resident += mss->resident;
> > +   mss_sum->pss += mss->pss;
> > +   mss_sum->shared_clean += mss->shared_clean;
> > +   mss_sum->shared_dirty += mss->shared_dirty;
> > +   mss_sum->private_clean += mss->private_clean;
> > +   mss_sum->private_dirty += mss->private_dirty;
> > +   mss_sum->referenced += mss->referenced;
> > +   mss_sum->anonymous += mss->anonymous;
> > +   mss_sum->anonymous_thp += mss->anonymous_thp;
> > +   mss_sum->swap += mss->swap;
> > +}
> > +
> > +static int totmaps_proc_show(struct seq_file *m, void *data)
> > +{
> > +   struct proc_maps_private *priv = m->private;
> > +   struct mm_struct *mm;
> > +   struct vm_area_struct *vma;
> > +   struct mem_size_stats *mss_sum = priv->mss;
> > +
> > +   /* reference to priv->task already taken */
> > +   /* but need to get the mm here because */
> > +   /* task could be in the process of exiting */
> > +   mm = get_task_mm(priv->task);
> > +   if (!mm || IS_ERR(mm))
> > +   return -EINVAL;
> > +
> > +   down_read(&mm->mmap_sem);
> > +   hold_task_mempolicy(priv);
> > +
> > +   for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
> > +   struct mem_size_stats mss;
> > +   struct mm_walk smaps_walk = {
> > +   .pmd_entry = smaps_pte_range,
> > +   .mm = vma->vm_m

Re: [RFCv2][PATCH 2/5] arm: Implement ARCH_HAS_FORCE_CACHE

2016-08-09 Thread Florian Fainelli

On 08/09/2016 05:13 PM, Laura Abbott wrote:
> On 08/09/2016 02:56 PM, Florian Fainelli wrote:
>> On 08/08/2016 10:49 AM, Laura Abbott wrote:
>>> arm may need the kernel_force_cache APIs to guarantee data consistency.
>>> Implement versions of these APIs based on the DMA APIs.
>>>
>>> Signed-off-by: Laura Abbott 
>>> ---
>>>  arch/arm/include/asm/cacheflush.h |   4 ++
>>>  arch/arm/mm/dma-mapping.c | 119
>>> --
>>>  arch/arm/mm/flush.c   | 115
>>> 
>>
>> Why is the code moved between dma-mapping.c and flush.c? It was not
>> obvious while looking at these patches why this is needed.
>>
> 
> I wanted to use the cache flushing routines from dma-mapping.c and
> it seemed better to pull them out vs. trying to put more generic
> cache flushing code in dma-mapping.c. flush.c seemed like an
> appropriate place although I forgot about the dependency on CONFIG_MMU.
> It can certainly remain in dma-mapping.c if deemed appropriate.

My concern is that this is an area of the kernel where you might be
looking for stable backports, so avoiding churn in there is desirable.
Also, if the new cache APIs become accepted and standard, since they are
building directly on top of the DMA-API, keeping them in dma-mapping.c
seems consistent.

My 2 cents.
-- 
Florian

Re: [PATCH v5 5/6] usb: chipidea: let chipidea core device of_node equal's glue layer device of_node

2016-08-09 Thread Stephen Boyd

Quoting Peter Chen (2016-08-08 01:52:10)
> From: Peter Chen 
> 
> In the device tree, we have no device node for the chipidea core;
> the glue layer's node is the parent node for the host and udc
> devices. But in the related drivers, the parent device is the chipidea
> core. So, in order to let the common driver get the parent's node,
> we make the core's device node equal to the glue layer's device node.
> 
> Signed-off-by: Peter Chen 
> Tested-by: Maciej S. Szmigiero 
> Tested-by Joshua Clayton 
> ---
>  drivers/usb/chipidea/core.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/usb/chipidea/core.c b/drivers/usb/chipidea/core.c
> index 69426e6..b189dc7 100644
> --- a/drivers/usb/chipidea/core.c
> +++ b/drivers/usb/chipidea/core.c
> @@ -954,6 +954,15 @@ static int ci_hdrc_probe(struct platform_device *pdev)
> dev_err(dev, "unable to init phy: %d\n", ret);
> return ret;
> }
> +   /*
> +* At device tree, we have no device node for chipidea core,
> +* the glue layer's node is the parent node for host and udc
> +* device. But in related driver, the parent device is chipidea
> +* core. So, in order to let the common driver get parent's node,
> +* we let the core's device node equals glue layer's node.
> +*/
> +   if (dev->parent && dev->parent->of_node)
> +   dev->of_node = dev->parent->of_node;

Can this be done earlier? Perhaps after hw_device_init() in this probe
routine? That would allow me to remove the awkward parent searching in
my ULPI DT awareness patch.

Re: [Resend][PATCH] x86/power/64: Always create temporary identity mapping correctly

2016-08-09 Thread Rafael J. Wysocki

On Tuesday, August 09, 2016 11:23:31 PM Rafael J. Wysocki wrote:
> On Tue, Aug 9, 2016 at 10:02 PM, Jiri Kosina  wrote:
> > On Tue, 9 Aug 2016, Rafael J. Wysocki wrote:
> >
> >> I have a murky suspicion, but it is really weird.  Namely, what if
> >> restore_jump_address in set_up_temporary_text_mapping() happens to be
> >> covered by the restore kernel's identity mapping?  Then, the image
> >> kernel's entry point may get overwritten by something else in
> >> core_restore_code().
> >
> > So this made me actually test a scenario where I'd suspend a kernel
> > that's known-broken (i.e. contains 021182e52fe), and then have it resumed
> > by a kernel that has 021182e52fe reverted. It resumed successfully.
> >
> > Just a datapoint.
> 
> That indicates the problem is somewhere in the restore kernel and no
> surprises there.
> 
> I am able to reproduce the original problem (a triple fault on resume
> with CONFIG_RANDOMIZE_MEMORY set) without the $subject patch, but the
> patch fixes it for me.
> 
> Question is why it is not sufficient for you and Boris and the above
> theory is about the only one I can come up with ATM.
> 
> I'm going to compare the configs etc, but I guess I'll just end up
> writing a patch to test that theory unless someone has any other idea
> in the meantime.

For lack of better ideas, below is a patch to try.

It avoids the possible issue of the restore kernel's identity mapping overlapping
with restore_jump_address by creating special super-simple page tables just
for the final jump to the image kernel.
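
The patch below also retries the allocation of relocated_restore_code until it
lands in a different PMD-sized region than restore_jump_address. A stand-alone
sketch of that comparison, assuming the x86-64 case where one PMD entry maps
2 MiB; the ex_* names are hypothetical and the candidate list stands in for
repeated get_safe_page() calls:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 value: one PMD entry maps a 2 MiB region. */
#define EX_PMD_SHIFT 21u
#define EX_PMD_SIZE  (UINT64_C(1) << EX_PMD_SHIFT)
#define EX_PMD_MASK  (~(EX_PMD_SIZE - 1))

/* True if both addresses are covered by the same 2 MiB PMD mapping. */
static int ex_same_pmd(uint64_t a, uint64_t b)
{
	return (a & EX_PMD_MASK) == (b & EX_PMD_MASK);
}

/*
 * Pick the first non-zero candidate that does not share a PMD region
 * with jump_addr; returns 0 if none qualifies.
 */
static uint64_t ex_pick_separate(const uint64_t *candidates, unsigned int n,
				 uint64_t jump_addr)
{
	unsigned int i;

	assert(candidates || n == 0);
	for (i = 0; i < n; i++)
		if (candidates[i] && !ex_same_pmd(candidates[i], jump_addr))
			return candidates[i];
	return 0;
}
```

Keeping the two addresses in separate PMD regions lets each be mapped by its
own large-page entry without one mapping shadowing the other.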

It is on top of the $subject patch.  My test box still works with this applied,
but then it worked without it as well.

If it doesn't help, the identity mapping created by set_up_temporary_mappings()
is most likely still not adequate for some reason and we'll need to find out
why.

Thanks,
Rafael


---
 arch/x86/power/hibernate_64.c |   40 +++---
 arch/x86/power/hibernate_asm_64.S |   10 +
 2 files changed, 43 insertions(+), 7 deletions(-)

Index: linux-pm/arch/x86/power/hibernate_64.c
===
--- linux-pm.orig/arch/x86/power/hibernate_64.c
+++ linux-pm/arch/x86/power/hibernate_64.c
@@ -38,14 +38,20 @@ unsigned long jump_address_phys;
 unsigned long restore_cr3 __visible;
 
 unsigned long temp_level4_pgt __visible;
+unsigned long jump_level4_pgt __visible;
 
 unsigned long relocated_restore_code __visible;
 
-static int set_up_temporary_text_mapping(pgd_t *pgd)
+static int set_up_temporary_text_mapping(void)
 {
+   pgd_t *pgd;
pmd_t *pmd;
pud_t *pud;
 
+   pgd = (pgd_t *)get_safe_page(GFP_ATOMIC);
+   if (!pgd)
+   return -ENOMEM;
+
/*
 * The new mapping only has to cover the page containing the image
 * kernel's entry point (jump_address_phys), because the switch over to
@@ -74,6 +80,23 @@ static int set_up_temporary_text_mapping
set_pgd(pgd + pgd_index(restore_jump_address),
__pgd(__pa(pud) | _KERNPG_TABLE));
 
+   pud = (pud_t *)get_safe_page(GFP_ATOMIC);
+   if (!pud)
+   return -ENOMEM;
+
+   pmd = (pmd_t *)get_safe_page(GFP_ATOMIC);
+   if (!pmd)
+   return -ENOMEM;
+
+   set_pmd(pmd + pmd_index(relocated_restore_code),
+   __pmd((__pa(relocated_restore_code) & PMD_MASK) | 
__PAGE_KERNEL_LARGE_EXEC));
+   set_pud(pud + pud_index(relocated_restore_code),
+   __pud(__pa(pmd) | _KERNPG_TABLE));
+   set_pgd(pgd + pgd_index(relocated_restore_code),
+   __pgd(__pa(pud) | _KERNPG_TABLE));
+
+   jump_level4_pgt = __pa(pgd);
+
return 0;
 }
 
@@ -98,11 +121,6 @@ static int set_up_temporary_mappings(voi
if (!pgd)
return -ENOMEM;
 
-   /* Prepare a temporary mapping for the kernel text */
-   result = set_up_temporary_text_mapping(pgd);
-   if (result)
-   return result;
-
/* Set up the direct mapping from scratch */
for (i = 0; i < nr_pfn_mapped; i++) {
mstart = pfn_mapped[i].start << PAGE_SHIFT;
@@ -122,7 +140,10 @@ static int relocate_restore_code(void)
pgd_t *pgd;
pud_t *pud;
 
-   relocated_restore_code = get_safe_page(GFP_ATOMIC);
+   do
+   relocated_restore_code = get_safe_page(GFP_ATOMIC);
+   while ((relocated_restore_code & PMD_MASK) == (restore_jump_address & 
PMD_MASK));
+
if (!relocated_restore_code)
return -ENOMEM;
 
@@ -162,6 +183,11 @@ int swsusp_arch_resume(void)
if (error)
return error;
 
+   /* Prepare a temporary mapping for the jump to the image kernel */
+   error = set_up_temporary_text_mapping();
+   if (error)
+   return error;
+
restore_image();
return 0;
 }
Index: linux-pm/arch/x86/power/hibernate_asm_64.S
===
---

Re: [PATCH] mm: fix the incorrect hugepages count

2016-08-09 Thread Naoya Horiguchi

On Tue, Aug 09, 2016 at 06:32:39PM +0800, zhong jiang wrote:
> On 2016/8/9 1:14, Mike Kravetz wrote:
> > On 08/07/2016 07:49 PM, zhongjiang wrote:
> >> From: zhong jiang 
> >>
> >> When memory hotplug is enabled, free hugepages will be freed if a movable
> >> node goes offline. Therefore, /proc/sys/vm/nr_hugepages will be incorrect.

This sounds a bit odd to me because /proc/sys/vm/nr_hugepages returns
h->nr_huge_pages or h->nr_huge_pages_node[nid], which is already
considered in dissolve_free_huge_page (via update_and_free_page).

I think that h->max_huge_pages effectively means the pool size, and
h->nr_huge_pages means the total number of hugepages (which can be greater
than the pool size when there's overcommitting/surplus).

dissolve_free_huge_page intends to break a hugepage into buddy pages, and
the destination hugepage is supposed to be allocated from the pool of
the destination node, so the system-wide pool size is reduced.
So adding h->max_huge_pages-- makes sense to me.
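
A simplified model of the counters involved, to make the bookkeeping concrete.
The struct and function below are hypothetical and ignore the per-node
counters and locking; they only show which fields move when a free huge page
is dissolved with the proposed change applied.

```c
#include <assert.h>

/* Hypothetical model of the per-hstate counters discussed above. */
struct ex_hstate {
	unsigned long max_huge_pages;   /* persistent pool size */
	unsigned long nr_huge_pages;    /* total huge pages, including surplus */
	unsigned long free_huge_pages;  /* huge pages currently unused */
};

/* Dissolve one free huge page back into buddy pages. */
static void ex_dissolve_free_huge_page(struct ex_hstate *h)
{
	assert(h->free_huge_pages > 0 && h->nr_huge_pages > 0);
	h->free_huge_pages--;
	h->nr_huge_pages--;   /* done via update_and_free_page() in the kernel */
	h->max_huge_pages--;  /* the one-line change proposed by the patch */
}
```

Without the last decrement, the pool size would stay above the number of huge
pages that actually exist after the node goes offline.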

Acked-by: Naoya Horiguchi 

> >>
> >> The patch fixes it by reducing max_huge_pages when the node goes offline.
> >>
> >> Signed-off-by: zhong jiang 
> >> ---
> >>  mm/hugetlb.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >> index f904246..3356e3a 100644
> >> --- a/mm/hugetlb.c
> >> +++ b/mm/hugetlb.c
> >> @@ -1448,6 +1448,7 @@ static void dissolve_free_huge_page(struct page 
> >> *page)
> >>list_del(&page->lru);
> >>h->free_huge_pages--;
> >>h->free_huge_pages_node[nid]--;
> >> +  h->max_huge_pages--;
> >>update_and_free_page(h, page);
> >>}
> >>spin_unlock(&hugetlb_lock);
> >>
> > Adding Naoya as he was the original author of this code.
> >
> > From a quick look it appears that the huge page will be migrated (allocated
> > on another node).  If my understanding is correct, then max_huge_pages
> > should not be adjusted here.
> >
>   We need to take free hugetlb pages into account.  Of course, the count of
>   allocated huge pages does not need to be reduced.  The patch just reduces
>   the free hugetlb pages count.

I

Re: [RFCv2][PATCH 2/5] arm: Implement ARCH_HAS_FORCE_CACHE

2016-08-09 Thread Laura Abbott


On 08/09/2016 02:56 PM, Florian Fainelli wrote:
> On 08/08/2016 10:49 AM, Laura Abbott wrote:
>> arm may need the kernel_force_cache APIs to guarantee data consistency.
>> Implement versions of these APIs based on the DMA APIs.
>>
>> Signed-off-by: Laura Abbott 
>> ---
>>  arch/arm/include/asm/cacheflush.h |   4 ++
>>  arch/arm/mm/dma-mapping.c | 119 --
>>  arch/arm/mm/flush.c   | 115 
>
> Why is the code moved between dma-mapping.c and flush.c? It was not
> obvious while looking at these patches why this is needed.


I wanted to use the cache flushing routines from dma-mapping.c and
it seemed better to pull them out vs. trying to put more generic
cache flushing code in dma-mapping.c. flush.c seemed like an
appropriate place although I forgot about the dependency on CONFIG_MMU.
It can certainly remain in dma-mapping.c if deemed appropriate.

Thanks,
Laura

Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access

2016-08-09 Thread Benjamin Herrenschmidt

On Tue, 2016-08-09 at 11:12 -0700, Alexander Duyck wrote:
> 
> The PCI spec is what essentially assumes that there is only one block.
> If I am not mistaken in the case of this device the second block here
> actually contains device configuration data, not actual VPD data.  The
> issue here is that the second block is being accessed as VPD when it
> isn't.

Devices do funny things with config space, film at 11. VFIO trying to
be the middle man and intercept/interpret things is broken, cannot work,
will never work, will just result in lots and lots of useless code, but
I've been singing that song for too long and nobody seems to care...

> > > # Large item 42 bytes; name 0x2 Identifier String
> > #002d Large item 74 bytes; name 0x10
> > #007a Small item 1 bytes; name 0xf End Tag
> > ---
> > #0c00 Large item 16 bytes; name 0x2 Identifier String
> > #0c13 Large item 234 bytes; name 0x10
> > #0d00 Large item 252 bytes; name 0x11
> > #0dff Small item 0 bytes; name 0xf End Tag
> 
> The second block here is driver proprietary setup bits.

Right. They happen to be in VPD on this device. They can be elsewhere on
other devices. In between capabilities on some, in vendor caps on others...

> > > The cxgb3 driver is reading the second bit starting from 0xc00 but since
> > the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
> > guest driver fails to probe.
> >
> > I also cannot find a clause in the PCI 3.0 spec saying that there must be
> > just a single block, is it there?
> 
> The problem is we need to be able to parse it.

We can parse the standard part for generic stuff like inventory tools
or lsvpd, but we shouldn't get in the way of the driver poking at its
device.

>   The spec defines a
> series of tags that can be used starting at offset 0.  That is how we
> are supposed to get around through the VPD data.  The problem is we
> can't have more than one end tag and what appears to be happening here
> is that we are defining a second block of data which uses the same
> formatting as VPD but is not VPD.
> 
> > What would the correct fix be? Scanning all 32k of VPD is not an option I
> > suppose as this is what this patch is trying to avoid. Thanks.
> 
> I am adding the current cxgb3 maintainer and netdev list to the Cc.  This
> is something that can probably be addressed via a PCI quirk as what
> needs to happen is that we need to extend the VPD in the case of this
> part in order to include this second block.  As long as we can read
> the VPD data all the way out to 0xdff odds are we could probably just
> have the size arbitrarily increased to 0xe00 via the quirk and then
> you would be able to access all of the VPD for the device.  We already
> have code making other modifications to drivers/pci/quirks.c for
> several Broadcom devices and probably just need something similar to
> allow extended access in the case of these devices.
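
The tag walk described above can be sketched in user-space C. This is a minimal model, not the kernel's implementation: the function name and constants are hypothetical, and it assumes the usual resource tag layout (large items have bit 7 set and carry a 16-bit little-endian length; small items keep the name in bits 6:3 and the length in bits 2:0). Fed the first block from the listing above, it stops just past the first End Tag at 0x7c, so the second block at 0xc00 is never reached.

```c
#include <stddef.h>
#include <stdint.h>

#define VPD_LARGE_FLAG 0x80 /* bit 7 set: large resource item */
#define VPD_SMALL_END  0x0f /* small item name used by the End Tag */

/*
 * Walk resource tags from offset 0 and return the offset just past the
 * first End Tag. Returns 0 if no End Tag is found within len bytes.
 * Hypothetical helper for illustration only.
 */
static size_t vpd_size_from_tags(const uint8_t *buf, size_t len)
{
	size_t off = 0;

	while (off < len) {
		uint8_t hdr = buf[off];

		if (hdr & VPD_LARGE_FLAG) {
			/* Large item: 1 tag byte + 2 length bytes + payload. */
			if (len - off < 3)
				return 0;
			off += 3 + ((size_t)buf[off + 1] |
				    ((size_t)buf[off + 2] << 8));
		} else {
			/* Small item: 1 tag byte + payload of 0..7 bytes. */
			size_t item_len = hdr & 0x07;
			uint8_t name = (hdr >> 3) & 0x0f;

			off += 1 + item_len;
			if (name == VPD_SMALL_END)
				return off <= len ? off : 0;
		}
	}
	return 0;
}
```

With a buffer laid out like the listing (a 42-byte item at 0x00, a 74-byte item at 0x2d, a 1-byte End Tag at 0x7a), the walk advances 0x00, 0x2d, 0x7a and returns 0x7c, which matches the size reported as wrongly detected.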


> > >
> >
> >
> > This is the device:
> >
> > > [aik@p81-p9 ~]$ sudo lspci -vvnns 0001:03:00.0
> > 0001:03:00.0 Ethernet controller [0200]: Chelsio Communications Inc T310
> > 10GbE Single Port Adapter [1425:0030]
> > Subsystem: IBM Device [1014:038c]
> > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > Stepping- SERR- FastB2B- DisINTx+
> > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> > > SERR-  > Latency: 0
> > Interrupt: pin A routed to IRQ 494
> > Region 0: Memory at 3fe08088 (64-bit, non-prefetchable) 
> >[size=4K]
> > Region 2: Memory at 3fe08000 (64-bit, non-prefetchable) 
> >[size=8M]
> > Region 4: Memory at 3fe080881000 (64-bit, non-prefetchable) 
> >[size=4K]
> > [virtual] Expansion ROM at 3fe08080 [disabled] [size=512K]
> > Capabilities: [40] Power Management version 3
> > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
> >PME(D0+,D1-,D2-,D3hot+,D3cold-)
> > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> > Capabilities: [48] MSI: Enable- Count=1/32 Maskable- 64bit+
> > Address:   Data: 
> > Capabilities: [58] Express (v2) Endpoint, MSI 00
> > DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s 
> ><64ns, L1 <1us
> > ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> > DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ 
> >Unsupported+
> > RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> > MaxPayload 256 bytes, MaxReadReq 512 bytes
> > DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- 
> >TransPend-
> > LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Exit 
> >Latency L0s
> > unlimited, L1 unlimited
> > ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> > LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> > E

Re: spin_lock implicit/explicit memory barrier

2016-08-09 Thread Benjamin Herrenschmidt

On Tue, 2016-08-09 at 20:52 +0200, Manfred Spraul wrote:
> Hi Benjamin, Hi Michael,
> 
> regarding commit 51d7d5205d33 ("powerpc: Add smp_mb() to 
> arch_spin_is_locked()"):
> 
> For the ipc/sem code, I would like to replace the spin_is_locked() with 
> a smp_load_acquire(), see:
> 
> http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/tree/ipc/sem.c#n367
> 
> http://www.ozlabs.org/~akpm/mmots/broken-out/ipc-semc-fix-complex_count-vs-simple-op-race.patch
> 
> To my understanding, I must now add a smp_mb(), otherwise it would be 
> broken on PowerPC:
> 
> The approach that the memory barrier is added into spin_is_locked() 
> doesn't work because the code doesn't use spin_is_locked().
> 
> Correct?

Right, otherwise you aren't properly ordered. The current powerpc locks provide
good protection between what's inside vs. what's outside the lock but not vs.
the lock *value* itself, so if, like you do in the sem code, you use the lock
value as something that is relevant in terms of ordering, you probably need
an explicit full barrier.
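
A user-space sketch of that ordering problem, using C11 atomics in place of kernel primitives. All names here (sem_model, simple_op_trylock, complex_op_begin) are hypothetical and this is not the ipc/sem.c code; it only illustrates the shape of the problem. Each side stores its own flag, issues a full fence, then loads the other side's flag. With a full fence on both sides, at least one of the two should observe the other's store.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sem_model {
	atomic_bool locked;       /* stands in for the per-semaphore lock word */
	atomic_bool complex_mode; /* stands in for "complex op in progress" */
};

static void sem_model_init(struct sem_model *s)
{
	atomic_init(&s->locked, false);
	atomic_init(&s->complex_mode, false);
}

/* Simple operation: take the lock, then check the flag. */
static bool simple_op_trylock(struct sem_model *s)
{
	/*
	 * Acquire ordering keeps the critical section after the lock is
	 * taken, but does not by itself order the store to `locked`
	 * before later loads of other variables.
	 */
	if (atomic_exchange_explicit(&s->locked, true, memory_order_acquire))
		return false; /* already held */

	/* Full barrier between the lock store and the flag load. */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&s->complex_mode, memory_order_acquire)) {
		/* A complex operation is pending: back off. */
		atomic_store_explicit(&s->locked, false, memory_order_release);
		return false;
	}
	return true;
}

static void simple_op_unlock(struct sem_model *s)
{
	atomic_store_explicit(&s->locked, false, memory_order_release);
}

/*
 * Complex operation: publish the flag, then look at the lock word.
 * Returns true if no simple-op holder was observed.
 */
static bool complex_op_begin(struct sem_model *s)
{
	atomic_store_explicit(&s->complex_mode, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return !atomic_load_explicit(&s->locked, memory_order_acquire);
}
```

Dropping the fence in simple_op_trylock() would leave the acquire-only exchange, which is the situation described for the powerpc lock when the lock value itself is used for ordering.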

Adding Paul McKenney.

Cheers,
Ben.

Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access

2016-08-09 Thread Benjamin Herrenschmidt

On Tue, 2016-08-09 at 22:54 +1000, Alexey Kardashevskiy wrote:
> The cxgb3 driver is reading the second bit starting from 0xc00 but since
> the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
> guest driver fails to probe.
> 
> I also cannot find a clause in the PCI 3.0 spec saying that there must be
> just a single block, is it there?
> 
> What would the correct fix be? Scanning all 32k of VPD is not an option I
> suppose as this is what this patch is trying to avoid. Thanks.

Additionally, Hannes, Alex, I argue that for platforms with proper HW isolation
(such as ppc with EEH), we shouldn't have VFIO try to virtualize that stuff.

It's the same problem with the bloody MSIs. Just let the guest config space
accesses go straight through. Its driver knows better what the HW needs and
if it crashes the card, too bad for that guest.

That being said, we don't have fine grained per-device PERST control on
all systems so there may not be recovery from that but on the other hand,
our constant attempts at "filtering" what the guest does to the HW are,
imho, doomed.

Cheers,
Ben.

Re: [PATCH v2 1/1] Add nvd9128 as a simple panel

2016-08-09 Thread Rob Herring

On Fri, Aug 05, 2016 at 11:47:54AM +0200, Fabien Lahoudere wrote:
> Add New Vision Display 7.0" 800 RGB x 480 TFT LCD panel
> 
> Signed-off-by: Fabien Lahoudere 
> ---
>  .../devicetree/bindings/display/panel/nvd,9128.txt |  7 ++
>  .../devicetree/bindings/vendor-prefixes.txt|  1 +
>  drivers/gpu/drm/panel/panel-simple.c   | 26 ++
>  3 files changed, 34 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/display/panel/nvd,9128.txt
 
Please add acks from previous version when posting new versions.

Rob

Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs

2016-08-09 Thread Bart Van Assche

On 08/08/2016 09:20 AM, Oleg Nesterov wrote:
> So far _I think_ that the bug is somewhere else... Say, someone clears
> PG_locked without wake_up(). Then SIGKILL sent to the task sleeping in
> sys_read() "adds" the necessary wakeup...

Hello Oleg,

Something that puzzles me is that removing the "else" keyword from 
abort_exclusive_wait() is sufficient to avoid the hang. If there were 
code that clears PG_locked without calling wake_up(), this hang 
probably would also be triggered by workloads that do not wake up 
lock_page_killable() with a signal. BTW, the 
WARN_ONCE(!list_empty(&wait->task_list) && waitqueue_active(q), "mode = 
%#x\n", mode) statement that I added in abort_exclusive_wait() just 
produced the following call stack:

Aug  9 16:16:38 ion-dev-ib-ini kernel: WARNING: CPU: 0 PID: 14767 at kernel/sched/wait.c:284 abort_exclusive_wait+0xe3/0xf0
Aug  9 16:16:38 ion-dev-ib-ini kernel: mode = 0x82
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [last unloaded: brd]
Aug  9 16:16:38 ion-dev-ib-ini kernel: CPU: 0 PID: 14767 Comm: kpartx Tainted: GW   4.7.0-dbg+ #3
Aug  9 16:16:38 ion-dev-ib-ini kernel: Call Trace:
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] dump_stack+0x68/0xa1
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] __warn+0xc6/0xe0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] warn_slowpath_fmt+0x4a/0x50
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] abort_exclusive_wait+0xe3/0xf0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] __wait_on_bit_lock+0x61/0xa0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] __lock_page_killable+0xb9/0xc0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] generic_file_read_iter+0x1ea/0x770
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] blkdev_read_iter+0x30/0x40
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] __vfs_read+0xbb/0x130
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] vfs_read+0x91/0x130
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] SyS_read+0x44/0xa0
Aug  9 16:16:38 ion-dev-ib-ini kernel:  [] entry_SYSCALL_64_fastpath+0x18/0xa8

(gdb) list *(generic_file_read_iter+0x1ea)
0x8115305a is in do_generic_file_read (mm/filemap.c:1730).
1725                    continue;
1726
1727    page_not_up_to_date:
1728            /* Get exclusive access to the page ... */
1729            error = lock_page_killable(page);
1730            if (unlikely(error))
1731                    goto readpage_error;
1732
1733    page_not_up_to_date_locked:
1734            /* Did it get truncated before we got the lock? */

Apparently the task that hangs is the same task as the one that
received the signal (PID 14767; state "D" = TASK_UNINTERRUPTIBLE):

[ 3718.134118] sysrq: SysRq : Show Blocked State
[ 3718.136234] kpartx  D 8803c7767838 0 14767  1 0x0006
[ 3718.136928] Call Trace:
[ 3718.137089]  [] schedule+0x37/0x90
[ 3718.137142]  [] schedule_timeout+0x27f/0x470
[ 3718.137603]  [] io_schedule_timeout+0x9f/0x110
[ 3718.137662]  [] bit_wait_io+0x16/0x60
[ 3718.137714]  [] __wait_on_bit_lock+0x49/0xa0
[ 3718.137764]  [] __lock_page+0xb9/0xc0
[ 3718.137865]  [] truncate_inode_pages_range+0x3e0/0x760
[ 3718.138175]  [] truncate_inode_pages+0x10/0x20
[ 3718.138477]  [] kill_bdev+0x30/0x40
[ 3718.138529]  [] __blkdev_put+0x71/0x360
[ 3718.138631]  [] blkdev_put+0x49/0x170
[ 3718.138681]  [] blkdev_close+0x20/0x30
[ 3718.138732]  [] __fput+0xe8/0x1f0
[ 3718.138782]  [] fput+0x9/0x10
[ 3718.138834]  [] task_work_run+0x83/0xb0
[ 3718.138886]  [] do_exit+0x3ee/0xc40
[ 3718.138987]  [] do_group_exit+0x4b/0xc0
[ 3718.139038]  [] get_signal+0x2ca/0x940
[ 3718.139142]  [] do_signal+0x23/0x660
[ 3718.139247]  [] exit_to_usermode_loop+0x73/0xb0
[ 3718.139297]  [] syscall_return_slowpath+0xb0/0xc0
[ 3718.139349]  [] entry_SYSCALL_64_fastpath+0xa6/0xa8

I'll try to see whether this behavior is reproducible.

Bart.

[PATCH] proc: Fix timerslack_ns CAP_SYS_NICE check when adjusting self

2016-08-09 Thread John Stultz

In changing from checking ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)
to capable(CAP_SYS_NICE), I missed that ptrace_may_access succeeds
when p == current, but the CAP_SYS_NICE check doesn't.

Thus while the previous commit was intended to loosen the needed
privileges to modify a process's timerslack, it needlessly restricted
a task modifying its own timerslack via the proc//timerslack_ns
(which is permitted also via the PR_SET_TIMERSLACK method).

This patch corrects this by checking if p == current before checking
the CAP_SYS_NICE value.

This patch applies on top of my two previous patches currently in -mm

Cc: Kees Cook 
Cc: "Serge E. Hallyn" 
Cc: Andrew Morton 
Cc: Thomas Gleixner 
CC: Arjan van de Ven 
Cc: Oren Laadan 
Cc: Ruchi Kandoi 
Cc: Rom Lemarchand 
Cc: Todd Kjos 
Cc: Colin Cross 
Cc: Nick Kralevich 
Cc: Dmitry Shmidt 
Cc: Elliott Hughes 
Cc: Android Kernel Team 
Signed-off-by: John Stultz 
---
 fs/proc/base.c | 34 +++---
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 02f8389..01c3c2d 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2281,15 +2281,17 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
if (!p)
return -ESRCH;
 
-   if (!capable(CAP_SYS_NICE)) {
-   count = -EPERM;
-   goto out;
-   }
+   if (p != current) {
+   if (!capable(CAP_SYS_NICE)) {
+   count = -EPERM;
+   goto out;
+   }
 
-   err = security_task_setscheduler(p);
-   if (err) {
-   count = err;
-   goto out;
+   err = security_task_setscheduler(p);
+   if (err) {
+   count = err;
+   goto out;
+   }
}
 
task_lock(p);
@@ -2315,14 +2317,16 @@ static int timerslack_ns_show(struct seq_file *m, void *v)
if (!p)
return -ESRCH;
 
-   if (!capable(CAP_SYS_NICE)) {
-   err = -EPERM;
-   goto out;
-   }
+   if (p != current) {
 
-   err = security_task_getscheduler(p);
-   if (err)
-   goto out;
+   if (!capable(CAP_SYS_NICE)) {
+   err = -EPERM;
+   goto out;
+   }
+   err = security_task_getscheduler(p);
+   if (err)
+   goto out;
+   }
 
task_lock(p);
seq_printf(m, "%llu\n", p->timer_slack_ns);
-- 
1.9.1
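
The permission logic after this patch can be modeled as a pure function. This is a sketch of the control flow in the diff above, not kernel code: the function name is hypothetical, and the results of capable(CAP_SYS_NICE) and of the security hook are passed in as plain arguments so the model can be exercised in user space.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the check in timerslack_ns_write()/_show()
 * after the patch.
 *
 * is_self:          true when the target task is the caller (p == current)
 * has_cap_sys_nice: what capable(CAP_SYS_NICE) would return
 * lsm_result:       what the security hook would return (0 or -errno)
 *
 * Returns 0 if access is allowed, a negative errno otherwise.
 */
static int timerslack_access_check(bool is_self, bool has_cap_sys_nice,
				   int lsm_result)
{
	if (!is_self) {
		/* Only cross-task access needs the capability... */
		if (!has_cap_sys_nice)
			return -EPERM;
		/* ...and the security hook's approval. */
		if (lsm_result)
			return lsm_result;
	}
	/* A task may always adjust its own timerslack. */
	return 0;
}
```

For example, timerslack_access_check(true, false, 0) returns 0, which is the self-adjustment case the previous commit had blocked, while timerslack_access_check(false, false, 0) still returns -EPERM.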

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-09 Thread John Stultz

On Tue, Aug 9, 2016 at 2:51 AM, Peter Zijlstra  wrote:
>
> Currently the percpu-rwsem switches to (global) atomic ops while a
> writer is waiting; which could be quite a while and slows down
> releasing the readers.
>
> This patch cures this problem by ordering the reader-state vs
> reader-count (see the comments in __percpu_down_read() and
> percpu_down_write()). This changes a global atomic op into a full
> memory barrier, which doesn't have the global cacheline contention.
>
> This also enables using the percpu-rwsem with rcu_sync disabled in order
> to bias the implementation differently, reducing the writer latency by
> adding some cost to readers.

So this by itself doesn't help us much, but including the following
from Oleg does help quite a bit:

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index db27804..9e9200b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5394,6 +5394,8 @@ int __init cgroup_init(void)
BUG_ON(cgroup_init_cftypes(NULL, cgroup_dfl_base_files));
BUG_ON(cgroup_init_cftypes(NULL, cgroup_legacy_base_files));

+   rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
+
mutex_lock(&cgroup_mutex);

/* Add init_css_set to the hash table */


thanks
-john
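
The reader-state versus reader-count ordering from the quoted changelog can be modeled in user space. This is only a sketch under stated assumptions: it uses C11 atomics and a small array in place of real per-CPU counters, every name is hypothetical, and the slow path (actually waiting for the writer) is left to the caller. The reader bumps its counter, issues a full barrier, then checks whether readers are blocked; the writer sets the block flag, issues a full barrier, then sums the counters.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4 /* hypothetical stand-in for the number of CPUs */

struct rwsem_model {
	atomic_int read_count[NR_SLOTS]; /* models the per-CPU reader counts */
	atomic_bool readers_block;       /* models the reader-state flag */
};

static void rwsem_model_init(struct rwsem_model *s)
{
	for (int i = 0; i < NR_SLOTS; i++)
		atomic_init(&s->read_count[i], 0);
	atomic_init(&s->readers_block, false);
}

/* Reader fast path: true if the read side was taken without fallback. */
static bool model_down_read_fast(struct rwsem_model *s, int slot)
{
	atomic_fetch_add_explicit(&s->read_count[slot], 1,
				  memory_order_relaxed);
	/* Order the count increment before the flag load. */
	atomic_thread_fence(memory_order_seq_cst);
	if (!atomic_load_explicit(&s->readers_block, memory_order_acquire))
		return true;
	/* A writer is pending: undo the increment, caller takes slow path. */
	atomic_fetch_sub_explicit(&s->read_count[slot], 1,
				  memory_order_release);
	return false;
}

static void model_up_read(struct rwsem_model *s, int slot)
{
	atomic_fetch_sub_explicit(&s->read_count[slot], 1,
				  memory_order_release);
}

/* Writer: block new readers, then report whether no readers remain. */
static bool model_writer_block_readers(struct rwsem_model *s)
{
	int sum = 0;

	atomic_store_explicit(&s->readers_block, true, memory_order_relaxed);
	/* Order the flag store before reading the counters. */
	atomic_thread_fence(memory_order_seq_cst);
	for (int i = 0; i < NR_SLOTS; i++)
		sum += atomic_load_explicit(&s->read_count[i],
					    memory_order_acquire);
	return sum == 0;
}
```

The model keeps each counter touched by one slot only, so the reader path pays for a full barrier rather than for contention on a shared cacheline, which is the trade the changelog describes.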

Re: [PATCH 1/5] sched,time: Count actually elapsed irq & softirq time

2016-08-09 Thread Wanpeng Li

2016-08-10 7:25 GMT+08:00 Wanpeng Li :
> 2016-08-09 22:06 GMT+08:00 Rik van Riel :
>> On Tue, 2016-08-09 at 11:59 +0800, Wanpeng Li wrote:
>>> Hi Rik,
>>> 2016-07-13 22:50 GMT+08:00 Frederic Weisbecker :
>>> > From: Rik van Riel 
>>> >
>>> > Currently, if there was any irq or softirq time during 'ticks'
>>> > jiffies, the entire period will be accounted as irq or softirq
>>> > time.
>>> >
>>> > This is inaccurate if only a subset of the time was actually spent
>>> > handling irqs, and could conceivably mis-count all of the ticks
>>> > during a period as irq time, when there was some irq and some
>>> > softirq time.
>>> >
>>> > This can actually happen when irqtime_account_process_tick is
>>> > called from account_idle_ticks, which can pass a larger number of
>>> > ticks down all at once.
>>> >
>>> > Fix this by changing irqtime_account_hi_update,
>>> > irqtime_account_si_update, and steal_account_process_ticks to work
>>> > with cputime_t time units, and return the amount of time spent in
>>> > each mode.
>>>
>>> Do we need to subtract st cputime from idle cputime in
>>> account_idle_ticks() when noirqtime is true? I tried adding this logic
>>> w/ the noirqtime and idle=poll boot parameters for a full dynticks guest;
>>> however, there is no difference. What am I missing?
>>
>> Yes, you are right. The code in account_idle_ticks()
>> could use the same treatment.
>>
>> I am not sure why it would not work, though...
>
> Actually I observed a regression caused by this patch. I use an i5

The regression is caused by your commit "sched,time: Count actually
elapsed irq & softirq time".

> laptop, 4 pCPUs, 4 vCPUs for one full dynticks guest. There are four
> cpu hog processes (for loop) running in the guest. I hot-unplug the
> pCPUs on the host one by one until there is only one left, then observe
> top in the guest: there is 100% st for cpu0 (housekeeping) and 75% st
> for the other cpus (nohz full). However, w/o this patch, it is 75% for
> all four cpus.
>
> I have been trying to figure this out recently; any tips would be
> greatly appreciated. :)
>
> Regards,
> Wanpeng Li
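
The accounting change described in the quoted changelog amounts to capping each category at the time that is still unaccounted, instead of charging the whole period to whichever category saw any activity. A minimal sketch, assuming plain integer time units and hypothetical names (tick_split, split_elapsed); it is not the kernel's irqtime_account_process_tick().

```c
#include <stdint.h>

struct tick_split {
	uint64_t hardirq; /* time charged to hard interrupts */
	uint64_t softirq; /* time charged to soft interrupts */
	uint64_t other;   /* remainder, left for user/system/idle */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Split `elapsed` time units between hardirq, softirq and everything
 * else. Each category gets at most what it actually accumulated and at
 * most what is still unaccounted.
 */
static struct tick_split split_elapsed(uint64_t elapsed,
				       uint64_t pending_hardirq,
				       uint64_t pending_softirq)
{
	struct tick_split out;

	out.hardirq = min_u64(pending_hardirq, elapsed);
	elapsed -= out.hardirq;

	out.softirq = min_u64(pending_softirq, elapsed);
	elapsed -= out.softirq;

	out.other = elapsed;
	return out;
}
```

With 10 units elapsed, 3 of hardirq and 2 of softirq pending, the split is 3, 2 and 5, whereas the old behaviour described in the changelog would have charged all 10 to irq time.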
