Re: [PATCH v11 0/4] set VSESR_EL2 by user space and support NOTIFY_SEI notification

2018-04-11 Thread gengdongjiu
Hi James,
   thanks for this mail.

On 2018/4/10 22:15, James Morse wrote:
> Hi Dongjiu Geng,
> 
> On 09/04/18 22:36, Dongjiu Geng wrote:
>> 1. Detect whether KVM can set set guest SError syndrome
>> 2. Support to Set VSESR_EL2 and inject SError by user space.
>> 3. Support live migration to keep SError pending state and VSESR_EL2 value.
>> 4. ACPI 6.1 adds support for NOTIFY_SEI as a GHES notification mechanism, so 
>> support this
>>notification in software, KVM or kernel ARCH code call handle_guest_sei() 
>> to let ACP driver
>>to handle this notification.
> 
> Please don't post code during the merge-window, will this apply to v4.17-rc1? 
> We
> can't know until its tagged.
I do not know when it is merge-window. About the apply version, it does not 
have limited.

> 
> 
> This series is doing two separate things, please split it into two series.
OK, thanks!

> 
> But on the ACPI front: I don't see how any OS can support your NOTIFY_SEI when
> firmware is ignoring the normal world's PSTATE.A.
> 
> The latest lobe of that discussion was on the list here:
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1611496.html
I have replied the mail.
I still have some questions that need to clarify with you.
After clarification, we will follow that.
The question is in the reply of this mail 
"https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1611496.html";

> 
> 
> As it is, we would need to spot SError being delivered while SError is masked,
> spray nasty messages about firmware being horrifically buggy, then panic(). 
> For
> a corrected error, this looks bad, but its preferable to letting firmware
> silently overwrite the exception registers, causing linux to spin through the
> vectors 'eret' with all exceptions masked.
> I still think its best to wait for firmware that does the right thing.
Let us  discuss that in another mail.
In a summary, I think firmware follow below rule can be OK, right?
1. The exception came from the EL that SError should be routed to(according to 
hcr_EL2.{AMO, TGE}),but PSTATE.A was set, EL3 firmware can't deliver SError;
2. The exception came from the EL that SError should not be routed to(according 
to hcr_EL2.{AMO, TGE}),even though the PSTATE.A was set,EL3 firmware still 
deliver SError

> 
> 
> Thanks,
> 
> James
> 
> .
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

2018-04-11 Thread gengdongjiu
Dear James,
Thanks for this mail and sorry for my late response.


2018-02-16 1:55 GMT+08:00 James Morse :
> Hi gengdongjiu, liu jun
>
> On 05/02/18 11:24, gengdongjiu wrote:
[]
>>
>>> Is the emulated SError routed following the routing rules for HCR_EL2.{AMO,
>>> TGE}?
>>
>> Yes, it is.
>
> ... and yet ...
>
>
>>> What does your firmware do when it wants to emulate SError but its masked?
>>> (e.g.1: The physical-SError interrupted EL2 and the SPSR shows EL2 had
>>> PSTATE.A  set.
>>>  e.g.2: The physical-SError interrupted EL2 but HCR_EL2 indicates the
>>> emulated  SError should go to EL1. This effectively masks SError.)
>>
>> Currently we does not consider much about the mask status(SPSR).
>
> .. this is a problem.
>
> If you ignore SPSR_EL3 you may deliver an SError to EL1 when the exception
> interrupted EL2. Even if you setup the EL1 register correctly, EL1 can't eret 
> to
> EL2. This should never happen, SError is effectively masked if you are running
> at an EL higher than the one its routed to.
>
> More obviously: if the exception came from the EL that SError should be routed
> to, but PSTATE.A was set, you can't deliver SError. Masking SError is the only

James, I  summarized the masking and routing rules for SError to
confirm with you for the firmware first solution,

1. If the HCR_EL2.{AMO,TGE} is set, which means the SError should route to EL2,
When system happens SError and trap to EL3,   If EL3 find
HCR_EL2.{AMO,TGE} and SPSR_EL3.A are both set,
and find this SError come from EL2, it will not deliver an SError:
store the RAS error in the BERT and 'reboot'; but if
it find that this SError come from EL1 or EL0, it also need to deliver
an SError, right?


2. If the HCR_EL2.{AMO,TGE} is not set, which means the SError should
route to EL1,
When system happens SError and trap to EL3, If EL3 find
HCR_EL2.{AMO,TGE} and SPSR_EL3.A are both not set,
and find this SError come from EL1, it will not deliver an SError:
store the RAS error in the BERT and 'reboot'; but if
it find that this SError come from EL0, it also need to deliver an
SError, right?


> way the OS has to indicate it can't take an exception right now. VBAR_EL1 may 
> be
> 'wrong' if we're doing some power-management, the FAR/ELR/ESR/SPSR registers 
> may
> contain live values that the OS would lose if you deliver another exception 
> over
> the top.
>
> If you deliver an emulated-SError as the OS eret's, your new ELR will point at
> the eret instruction and the CPU will spin on this instruction forever.
>
> You have to honour the masking and routing rules for SError, otherwise no OS 
> can
> run safely with this firmware.
>
>
>> I remember that you ever suggested firmware should reboot if the mask status
>> is set(SPSR), right?
>
> Yes, this is my suggestion of what to do if you can't deliver an SError: store
> the RAS error in the BERT and 'reboot'.
>
>
>> I CC "liu jun"  who is our UEFI firmware Architect,
>> if you have firmware requirements, you can raise again.
>
> (UEFI? I didn't think there was any of that at EL3, but I'm not familiar with
> all the 'PI' bits).
>
> The requirement is your emulated-SError from EL3 looks exactly like a
> physical-SError as if EL3 wasn't implemented.
> Your CPU has to handle cases where it can't deliver an SError, your emulation
> has to do the same.
>
> This is not something any OS can work around.
>
>
>>> Answers to these let us determine whether a bug is in the firmware or the
>>> kernel. If firmware is expecting the OS to do something special, I'd like 
>>> to know
>>> about it from the beginning!
>>
>> I know your meaning, thanks for raising it again.
>
>
> Happy new year,
>
> James
> ___
> kvmarm mailing list
> kvm...@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers

2018-04-11 Thread Guenter Roeck

On 04/11/2018 07:51 PM, Jae Hyun Yoo wrote:

On 4/11/2018 5:34 PM, Guenter Roeck wrote:

On 04/11/2018 02:59 PM, Jae Hyun Yoo wrote:

Hi Guenter,

Thanks a lot for sharing your time. Please see my inline answers.

On 4/10/2018 3:28 PM, Guenter Roeck wrote:

On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:

This commit adds PECI cputemp and dimmtemp hwmon drivers.

Signed-off-by: Jae Hyun Yoo 
Reviewed-by: Haiyue Wang 
Reviewed-by: James Feist 
Reviewed-by: Vernon Mauery 
Cc: Alan Cox 
Cc: Andrew Jeffery 
Cc: Andrew Lunn 
Cc: Andy Shevchenko 
Cc: Arnd Bergmann 
Cc: Benjamin Herrenschmidt 
Cc: Fengguang Wu 
Cc: Greg KH 
Cc: Guenter Roeck 
Cc: Jason M Biils 
Cc: Jean Delvare 
Cc: Joel Stanley 
Cc: Julia Cartwright 
Cc: Miguel Ojeda 
Cc: Milton Miller II 
Cc: Pavel Machek 
Cc: Randy Dunlap 
Cc: Stef van Os 
Cc: Sumeet R Pawnikar 
---
  drivers/hwmon/Kconfig |  28 ++
  drivers/hwmon/Makefile    |   2 +
  drivers/hwmon/peci-cputemp.c  | 783 ++
  drivers/hwmon/peci-dimmtemp.c | 432 +++
  4 files changed, 1245 insertions(+)
  create mode 100644 drivers/hwmon/peci-cputemp.c
  create mode 100644 drivers/hwmon/peci-dimmtemp.c

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index f249a4428458..c52f610f81d0 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
    This driver can also be built as a module.  If so, the module
    will be called nct7904.
+config SENSORS_PECI_CPUTEMP
+    tristate "PECI CPU temperature monitoring support"
+    depends on OF
+    depends on PECI
+    help
+  If you say yes here you get support for the generic Intel PECI
+  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
+  readings of the CPU package and CPU cores that are accessible using
+  the PECI Client Command Suite via the processor PECI client.
+  Check Documentation/hwmon/peci-cputemp for details.
+
+  This driver can also be built as a module.  If so, the module
+  will be called peci-cputemp.
+
+config SENSORS_PECI_DIMMTEMP
+    tristate "PECI DIMM temperature monitoring support"
+    depends on OF
+    depends on PECI
+    help
+  If you say yes here you get support for the generic Intel PECI hwmon
+  driver which provides Digital Thermal Sensor (DTS) thermal readings of
+  DIMM components that are accessible using the PECI Client Command
+  Suite via the processor PECI client.
+  Check Documentation/hwmon/peci-dimmtemp for details.
+
+  This driver can also be built as a module.  If so, the module
+  will be called peci-dimmtemp.
+
  config SENSORS_NSA320
  tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
  depends on GPIOLIB && OF
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index e7d52a36e6c4..48d9598fcd3a 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)    += nct7802.o
  obj-$(CONFIG_SENSORS_NCT7904)    += nct7904.o
  obj-$(CONFIG_SENSORS_NSA320)    += nsa320-hwmon.o
  obj-$(CONFIG_SENSORS_NTC_THERMISTOR)    += ntc_thermistor.o
+obj-$(CONFIG_SENSORS_PECI_CPUTEMP)    += peci-cputemp.o
+obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
  obj-$(CONFIG_SENSORS_PC87360)    += pc87360.o
  obj-$(CONFIG_SENSORS_PC87427)    += pc87427.o
  obj-$(CONFIG_SENSORS_PCF8591)    += pcf8591.o
diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
new file mode 100644
index ..f0bc92687512
--- /dev/null
+++ b/drivers/hwmon/peci-cputemp.c
@@ -0,0 +1,783 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include 
+#include 
+#include 


Is this include needed ?



No it isn't. Will drop the line.


+#include 
+#include 
+#include 
+#include 
+
+#define TEMP_TYPE_PECI    6  /* Sensor type 6: Intel PECI */
+
+#define CORE_MAX_ON_HSX   18 /* Max number of cores on Haswell */
+#define CORE_MAX_ON_BDX   24 /* Max number of cores on Broadwell */
+#define CORE_MAX_ON_SKX   28 /* Max number of cores on Skylake */
+
+#define DEFAULT_CHANNEL_NUMS  5
+#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
+#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
+
+#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
+
+#define UPDATE_INTERVAL_MIN   HZ
+
+enum cpu_gens {
+    CPU_GEN_HSX, /* Haswell Xeon */
+    CPU_GEN_BRX, /* Broadwell Xeon */
+    CPU_GEN_SKX, /* Skylake Xeon */
+    CPU_GEN_MAX
+};
+
+struct cpu_gen_info {
+    u32 type;
+    u32 cpu_id;
+    u32 core_max;
+};
+
+struct temp_data {
+    bool valid;
+    s32  value;
+    unsigned long last_updated;
+};
+
+struct temp_group {
+    struct temp_data die;
+    struct temp_data dts_margin;
+    struct temp_data tcontrol;
+    struct temp_data tthrottle;
+    struct temp_data tjmax;
+    struct temp_data core[CORETEMP_CHANNEL_NUMS]

Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers

2018-04-11 Thread Jae Hyun Yoo

On 4/11/2018 5:34 PM, Guenter Roeck wrote:

On 04/11/2018 02:59 PM, Jae Hyun Yoo wrote:

Hi Guenter,

Thanks a lot for sharing your time. Please see my inline answers.

On 4/10/2018 3:28 PM, Guenter Roeck wrote:

On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:

This commit adds PECI cputemp and dimmtemp hwmon drivers.

Signed-off-by: Jae Hyun Yoo 
Reviewed-by: Haiyue Wang 
Reviewed-by: James Feist 
Reviewed-by: Vernon Mauery 
Cc: Alan Cox 
Cc: Andrew Jeffery 
Cc: Andrew Lunn 
Cc: Andy Shevchenko 
Cc: Arnd Bergmann 
Cc: Benjamin Herrenschmidt 
Cc: Fengguang Wu 
Cc: Greg KH 
Cc: Guenter Roeck 
Cc: Jason M Biils 
Cc: Jean Delvare 
Cc: Joel Stanley 
Cc: Julia Cartwright 
Cc: Miguel Ojeda 
Cc: Milton Miller II 
Cc: Pavel Machek 
Cc: Randy Dunlap 
Cc: Stef van Os 
Cc: Sumeet R Pawnikar 
---
  drivers/hwmon/Kconfig |  28 ++
  drivers/hwmon/Makefile    |   2 +
  drivers/hwmon/peci-cputemp.c  | 783 
++

  drivers/hwmon/peci-dimmtemp.c | 432 +++
  4 files changed, 1245 insertions(+)
  create mode 100644 drivers/hwmon/peci-cputemp.c
  create mode 100644 drivers/hwmon/peci-dimmtemp.c

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index f249a4428458..c52f610f81d0 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
    This driver can also be built as a module.  If so, the module
    will be called nct7904.
+config SENSORS_PECI_CPUTEMP
+    tristate "PECI CPU temperature monitoring support"
+    depends on OF
+    depends on PECI
+    help
+  If you say yes here you get support for the generic Intel PECI
+  cputemp driver which provides Digital Thermal Sensor (DTS) 
thermal
+  readings of the CPU package and CPU cores that are accessible 
using

+  the PECI Client Command Suite via the processor PECI client.
+  Check Documentation/hwmon/peci-cputemp for details.
+
+  This driver can also be built as a module.  If so, the module
+  will be called peci-cputemp.
+
+config SENSORS_PECI_DIMMTEMP
+    tristate "PECI DIMM temperature monitoring support"
+    depends on OF
+    depends on PECI
+    help
+  If you say yes here you get support for the generic Intel 
PECI hwmon
+  driver which provides Digital Thermal Sensor (DTS) thermal 
readings of
+  DIMM components that are accessible using the PECI Client 
Command

+  Suite via the processor PECI client.
+  Check Documentation/hwmon/peci-dimmtemp for details.
+
+  This driver can also be built as a module.  If so, the module
+  will be called peci-dimmtemp.
+
  config SENSORS_NSA320
  tristate "ZyXEL NSA320 and compatible fan speed and 
temperature sensors"

  depends on GPIOLIB && OF
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index e7d52a36e6c4..48d9598fcd3a 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)    += nct7802.o
  obj-$(CONFIG_SENSORS_NCT7904)    += nct7904.o
  obj-$(CONFIG_SENSORS_NSA320)    += nsa320-hwmon.o
  obj-$(CONFIG_SENSORS_NTC_THERMISTOR)    += ntc_thermistor.o
+obj-$(CONFIG_SENSORS_PECI_CPUTEMP)    += peci-cputemp.o
+obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
  obj-$(CONFIG_SENSORS_PC87360)    += pc87360.o
  obj-$(CONFIG_SENSORS_PC87427)    += pc87427.o
  obj-$(CONFIG_SENSORS_PCF8591)    += pcf8591.o
diff --git a/drivers/hwmon/peci-cputemp.c 
b/drivers/hwmon/peci-cputemp.c

new file mode 100644
index ..f0bc92687512
--- /dev/null
+++ b/drivers/hwmon/peci-cputemp.c
@@ -0,0 +1,783 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include 
+#include 
+#include 


Is this include needed ?



No it isn't. Will drop the line.


+#include 
+#include 
+#include 
+#include 
+
+#define TEMP_TYPE_PECI    6  /* Sensor type 6: Intel PECI */
+
+#define CORE_MAX_ON_HSX   18 /* Max number of cores on Haswell */
+#define CORE_MAX_ON_BDX   24 /* Max number of cores on 
Broadwell */

+#define CORE_MAX_ON_SKX   28 /* Max number of cores on Skylake */
+
+#define DEFAULT_CHANNEL_NUMS  5
+#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
+#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + 
CORETEMP_CHANNEL_NUMS)

+
+#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model 
info */

+
+#define UPDATE_INTERVAL_MIN   HZ
+
+enum cpu_gens {
+    CPU_GEN_HSX, /* Haswell Xeon */
+    CPU_GEN_BRX, /* Broadwell Xeon */
+    CPU_GEN_SKX, /* Skylake Xeon */
+    CPU_GEN_MAX
+};
+
+struct cpu_gen_info {
+    u32 type;
+    u32 cpu_id;
+    u32 core_max;
+};
+
+struct temp_data {
+    bool valid;
+    s32  value;
+    unsigned long last_updated;
+};
+
+struct temp_group {
+    struct temp_data die;
+    struct temp_data dts_margin;
+    struct temp_data tcontrol;
+    struct temp_data tthrottle;
+    struct temp_data tjmax;
+    struct temp_data core[CORETEMP_CHANNEL_NUMS];
+};
+
+struct peci_cpute

Re: [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node

2018-04-11 Thread Jae Hyun Yoo

On 4/11/2018 4:52 AM, Joel Stanley wrote:

On 11 April 2018 at 04:02, Jae Hyun Yoo  wrote:

This commit adds PECI bus/adapter node of AST24xx/AST25xx into
aspeed-g4 and aspeed-g5.



The patches to the device trees get merged by the ASPEED maintainer
(me). Once you have the bindings reviewed you can send the patches to
me and the linux-aspeed list (I've got a pending patch to maintainers
that will ensure get_maintainers.pl does the right thing as far as
email addresses go).

I'd suggest dropping it from your series and re-sending once the
bindings and driver are reviewed.

Cheers,

Joel



Do you mean that bindings and driver of ASPEED peci adapter driver 
including documents?


Thanks,
-Jae
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs

2018-04-11 Thread Jae Hyun Yoo

Hi Joel,

On 4/11/2018 4:52 AM, Joel Stanley wrote:

On 11 April 2018 at 04:02, Jae Hyun Yoo  wrote:

This commit adds a dt-bindings document of PECI adapter driver for Aspeed


We try to capitalise ASPEED.



Got it. Will capitalize all Aspeed words.


AST24xx/25xx SoCs.
---
  .../devicetree/bindings/peci/peci-aspeed.txt   | 60 ++
  1 file changed, 60 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt

diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt 
b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
new file mode 100644
index ..4598bb8c20fa
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
@@ -0,0 +1,60 @@
+Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
+
+Required properties:
+- compatible: Should be "aspeed,ast2400-peci" or "aspeed,ast2500-peci"
+ - aspeed,ast2400-peci: Aspeed AST2400 family PECI
+controller
+ - aspeed,ast2500-peci: Aspeed AST2500 family PECI
+controller
+- reg   : Should contain PECI controller registers location and
+ length.
+- #address-cells: Should be <1>.
+- #size-cells   : Should be <0>.
+- interrupts: Should contain PECI controller interrupt.
+- clocks: Should contain clock source for PECI controller.
+ Should reference clkin.


Are you sure that this is driven by clkin? Most peripherals on the
Aspeed are attached to the apb, so should reference that clock.



According to the datasheet, PECI controller module is attached to apb 
but its clock source is the 24MHz external clock.



+- clock_frequency   : Should contain the operation frequency of PECI controller
+ in units of Hz.
+ 187500 ~ 2400


Can you explain why you need both the parent clock and this frequency
to be specified?



Based on this setting, driver code makes clock divisor value to set 
operation clock of PECI controller which is adjustable.



+
+Optional properties:
+- msg-timing-nego   : Message timing negotiation period. This value will


Perhaps msg-timing-period? Or just msg-timing?



Will use msg-timing instead.


+ determine the period of message timing negotiation to be
+ issued by PECI controller. The unit of the programmed
+ value is four times of PECI clock period.
+ 0 ~ 255 (default: 1)
+- addr-timing-nego  : Address timing negotiation period. This value will
+ determine the period of address timing negotiation to be
+ issued by PECI controller. The unit of the programmed
+ value is four times of PECI clock period.
+ 0 ~ 255 (default: 1)
+- rd-sampling-point : Read sampling point selection. The whole period of a bit
+ time will be divided into 16 time frames. This value will
+ determine the time frame in which the controller will
+ sample PECI signal for data read back. Usually in the
+ middle of a bit time is the best.
+ 0 ~ 15 (default: 8)
+- cmd_timeout_ms: Command timeout in units of ms.
+ 1 ~ 6 (default: 1000)
+
+Example:
+   peci: peci@1e78b000 {
+   compatible = "simple-bus";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges = <0x0 0x1e78b000 0x60>;
+
+   peci0: peci-bus@0 {
+   compatible = "aspeed,ast2500-peci";
+   reg = <0x0 0x60>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   interrupts = <15>;
+   clocks = <&clk_clkin>;
+   clock-frequency = <2400>;
+   msg-timing-nego = <1>;
+   addr-timing-nego = <1>;
+   rd-sampling-point = <8>;
+   cmd-timeout-ms = <1000>;
+   };
+   };
--
2.16.2


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers

2018-04-11 Thread Jae Hyun Yoo

Hi Joel,

On 4/11/2018 4:52 AM, Joel Stanley wrote:

Hi Jae,

On 11 April 2018 at 04:02, Jae Hyun Yoo  wrote:

This commit adds documents of generic PECI bus, adapter and client drivers.

Signed-off-by: Jae Hyun Yoo 
Reviewed-by: Haiyue Wang 
Reviewed-by: James Feist 
Reviewed-by: Vernon Mauery 
Cc: Alan Cox 
Cc: Andrew Jeffery 
Cc: Andrew Lunn 
Cc: Andy Shevchenko 
Cc: Arnd Bergmann 
Cc: Benjamin Herrenschmidt 
Cc: Fengguang Wu 
Cc: Greg KH 
Cc: Guenter Roeck 
Cc: Jason M Biils 
Cc: Jean Delvare 
Cc: Joel Stanley 
Cc: Julia Cartwright 
Cc: Miguel Ojeda 
Cc: Milton Miller II 
Cc: Pavel Machek 
Cc: Randy Dunlap 
Cc: Stef van Os 
Cc: Sumeet R Pawnikar 


That's a hefty cc list. I can't see Rob Herring though, and he's
usually the person who you need to convince to get your bindings
accepted.

I recommend using ./scripts/get_maintainers.pl to build your CC list,
and then add others you think are relevant.

I'm not sure what the guidelines are for generic bindings, so I'll
defer to Rob for this patch.

Cheers,

Joel



Thanks a lot for letting me know that. I'll do as you suggested.

-Jae


---
  .../devicetree/bindings/peci/peci-adapter.txt  | 23 
  .../devicetree/bindings/peci/peci-bus.txt  | 15 +
  .../devicetree/bindings/peci/peci-client.txt   | 25 ++
  3 files changed, 63 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
  create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
  create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt

diff --git a/Documentation/devicetree/bindings/peci/peci-adapter.txt 
b/Documentation/devicetree/bindings/peci/peci-adapter.txt
new file mode 100644
index ..9221374f6b11
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-adapter.txt
@@ -0,0 +1,23 @@
+Generic device tree configuration for PECI adapters.
+
+Required properties:
+- compatible : Should contain hardware specific definition strings that can
+  match an adapter driver implementation.
+- reg: Should contain PECI controller registers location and 
length.
+- #address-cells : Should be <1>.
+- #size-cells: Should be <0>.
+
+Example:
+   peci: peci@1000 {
+   compatible = "simple-bus";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges = <0x0 0x1000 0x1000>;
+
+   peci0: peci-bus@0 {
+   compatible = "soc,soc-peci";
+   reg = <0x0 0x1000>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+   };
diff --git a/Documentation/devicetree/bindings/peci/peci-bus.txt 
b/Documentation/devicetree/bindings/peci/peci-bus.txt
new file mode 100644
index ..90bcc791ccb0
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-bus.txt
@@ -0,0 +1,15 @@
+Generic device tree configuration for PECI buses.
+
+Required properties:
+- compatible : Should be "simple-bus".
+- #address-cells : Should be <1>.
+- #size-cells: Should be <1>.
+- ranges : Should contain PECI controller registers ranges.
+
+Example:
+   peci: peci@1000 {
+   compatible = "simple-bus";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges = <0x0 0x1000 0x1000>;
+   };
diff --git a/Documentation/devicetree/bindings/peci/peci-client.txt 
b/Documentation/devicetree/bindings/peci/peci-client.txt
new file mode 100644
index ..8e2bfd8532f6
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-client.txt
@@ -0,0 +1,25 @@
+Generic device tree configuration for PECI clients.
+
+Required properties:
+- compatible : Should contain target device specific definition strings that 
can
+  match a client driver implementation.
+- reg: Should contain address of a client CPU. Address range of CPU
+  clients is starting from 0x30 based on PECI specification.
+  <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
+
+Example:
+   peci-bus@0 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   < more properties >
+
+   function@cpu0 {
+   compatible = "device,function";
+   reg = <0x30>;
+   };
+
+   function@cpu1 {
+   compatible = "device,function";
+   reg = <0x31>;
+   };
+   };
--
2.16.2


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx

2018-04-11 Thread Jae Hyun Yoo

Hello Joel,

Thanks for sharing your time. Please see my answers inline.

On 4/11/2018 4:51 AM, Joel Stanley wrote:

Hello Jae,

On 11 April 2018 at 04:02, Jae Hyun Yoo  wrote:

This commit adds PECI adapter driver implementation for Aspeed
AST24xx/AST25xx.


The driver is looking good!

It looks like you've done some kind of review that we weren't allowed
to see, which is a double edged sword - I might be asking about things
that you've already spoken about with someone else.

I'm only just learning about PECI, but I do have some general comments below.



Yes, it took a hidden review process between v2 and v3. I know it's an 
unusual process but it was requested. Hopefully, change logs in cover 
letter could roughly provide the details. Thanks for your comments.



---
  drivers/peci/Kconfig   |  28 +++
  drivers/peci/Makefile  |   3 +
  drivers/peci/peci-aspeed.c | 504 +
  3 files changed, 535 insertions(+)
  create mode 100644 drivers/peci/peci-aspeed.c

diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index 1fbc13f9e6c2..0e33420365de 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -14,4 +14,32 @@ config PECI
   processors and chipset components to external monitoring or control
   devices.

+ If you want PECI support, you should say Y here and also to the
+ specific driver for your bus adapter(s) below.
+
+if PECI
+
+#
+# PECI hardware bus configuration
+#
+
+menu "PECI Hardware Bus support"
+
+config PECI_ASPEED
+   tristate "Aspeed AST24xx/AST25xx PECI support"


I think just saying ASPEED PECI support is enough. That way if the
next ASPEED SoC happens to have PECI we don't need to update all of
the help text :)



Agreed. I'll change the description.


+   select REGMAP_MMIO
+   depends on OF
+   depends on ARCH_ASPEED || COMPILE_TEST
+   help
+ Say Y here if you want support for the Platform Environment Control
+ Interface (PECI) bus adapter driver on the Aspeed AST24XX and AST25XX
+ SoCs.
+
+ This support is also available as a module.  If so, the module
+ will be called peci-aspeed.
+
+endmenu
+
+endif # PECI
+
  endmenu
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index 9e8615e0d3ff..886285e69765 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -4,3 +4,6 @@

  # Core functionality
  obj-$(CONFIG_PECI) += peci-core.o
+
+# Hardware specific bus drivers
+obj-$(CONFIG_PECI_ASPEED)  += peci-aspeed.o
diff --git a/drivers/peci/peci-aspeed.c b/drivers/peci/peci-aspeed.c
new file mode 100644
index ..be2a1f327eb1
--- /dev/null
+++ b/drivers/peci/peci-aspeed.c
@@ -0,0 +1,504 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2012-2017 ASPEED Technology Inc.
+// Copyright (c) 2018 Intel Corporation
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DUMP_DEBUG 0
+
+/* Aspeed PECI Registers */
+#define AST_PECI_CTRL 0x00


Nit: we use ASPEED instead of AST in the upstream kernel to distingush
from the aspeed sdk drivers. If you feel strongly about this then I
won't insist you change.



Okay then, better change it now than later. Will change all defines.


+#define AST_PECI_TIMING   0x04
+#define AST_PECI_CMD  0x08
+#define AST_PECI_CMD_CTRL 0x0c
+#define AST_PECI_EXP_FCS  0x10
+#define AST_PECI_CAP_FCS  0x14
+#define AST_PECI_INT_CTRL 0x18
+#define AST_PECI_INT_STS  0x1c
+#define AST_PECI_W_DATA0  0x20
+#define AST_PECI_W_DATA1  0x24
+#define AST_PECI_W_DATA2  0x28
+#define AST_PECI_W_DATA3  0x2c
+#define AST_PECI_R_DATA0  0x30
+#define AST_PECI_R_DATA1  0x34
+#define AST_PECI_R_DATA2  0x38
+#define AST_PECI_R_DATA3  0x3c
+#define AST_PECI_W_DATA4  0x40
+#define AST_PECI_W_DATA5  0x44
+#define AST_PECI_W_DATA6  0x48
+#define AST_PECI_W_DATA7  0x4c
+#define AST_PECI_R_DATA4  0x50
+#define AST_PECI_R_DATA5  0x54
+#define AST_PECI_R_DATA6  0x58
+#define AST_PECI_R_DATA7  0x5c
+
+/* AST_PECI_CTRL - 0x00 : Control Register */
+#define PECI_CTRL_SAMPLING_MASK GENMASK(19, 16)
+#define PECI_CTRL_SAMPLING(x)   (((x) << 16) & PECI_CTRL_SAMPLING_MASK)
+#define PECI_CTRL_SAMPLING_GET(x)   (((x) & PECI_CTRL_SAMPLING_MASK) >> 16)
+#define PECI_CTRL_READ_MODE_MASKGENMASK(13, 12)
+#define PECI_CTRL_READ_MODE(x)  (((x) << 12) & PECI_CTRL_READ_MODE_MASK)
+#define PECI_CTRL_READ_MODE_GET(x)  (((x) & PECI_CTRL_READ_MODE_MASK) >> 12)
+#define PECI_CTRL_READ_MODE_COUNT   BIT(12)
+#define PECI_CTRL_READ_MODE_DBG BIT(13)
+#define PECI_CTRL_CLK_SOURCE_MASK   BIT(11)
+#define PECI_CTRL_CLK_SOURCE(x) (((x) << 11) & PECI_CTRL_CLK_SOURCE_MASK)
+#define PECI_CTRL_CLK_SOURCE_GET(x) (((x) & PECI_CTRL_CLK_SOURCE_MASK) >> 11)
+#define PECI_CTRL_CLK_DIV_MASK  GENMASK(10, 8)
+#define PECI_CTRL_CLK_DIV(x)(((x) << 8) & PECI_CTRL_CLK_DIV_MASK)
+#define PECI_CTRL_CLK_DIV_GET(x)(((x) & PECI_CTRL_CLK_DIV_MASK) 

Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers

2018-04-11 Thread Guenter Roeck

On 04/11/2018 02:59 PM, Jae Hyun Yoo wrote:

Hi Guenter,

Thanks a lot for sharing your time. Please see my inline answers.

On 4/10/2018 3:28 PM, Guenter Roeck wrote:

On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:

This commit adds PECI cputemp and dimmtemp hwmon drivers.

Signed-off-by: Jae Hyun Yoo 
Reviewed-by: Haiyue Wang 
Reviewed-by: James Feist 
Reviewed-by: Vernon Mauery 
Cc: Alan Cox 
Cc: Andrew Jeffery 
Cc: Andrew Lunn 
Cc: Andy Shevchenko 
Cc: Arnd Bergmann 
Cc: Benjamin Herrenschmidt 
Cc: Fengguang Wu 
Cc: Greg KH 
Cc: Guenter Roeck 
Cc: Jason M Biils 
Cc: Jean Delvare 
Cc: Joel Stanley 
Cc: Julia Cartwright 
Cc: Miguel Ojeda 
Cc: Milton Miller II 
Cc: Pavel Machek 
Cc: Randy Dunlap 
Cc: Stef van Os 
Cc: Sumeet R Pawnikar 
---
  drivers/hwmon/Kconfig |  28 ++
  drivers/hwmon/Makefile    |   2 +
  drivers/hwmon/peci-cputemp.c  | 783 ++
  drivers/hwmon/peci-dimmtemp.c | 432 +++
  4 files changed, 1245 insertions(+)
  create mode 100644 drivers/hwmon/peci-cputemp.c
  create mode 100644 drivers/hwmon/peci-dimmtemp.c

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index f249a4428458..c52f610f81d0 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
    This driver can also be built as a module.  If so, the module
    will be called nct7904.
+config SENSORS_PECI_CPUTEMP
+    tristate "PECI CPU temperature monitoring support"
+    depends on OF
+    depends on PECI
+    help
+  If you say yes here you get support for the generic Intel PECI
+  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
+  readings of the CPU package and CPU cores that are accessible using
+  the PECI Client Command Suite via the processor PECI client.
+  Check Documentation/hwmon/peci-cputemp for details.
+
+  This driver can also be built as a module.  If so, the module
+  will be called peci-cputemp.
+
+config SENSORS_PECI_DIMMTEMP
+    tristate "PECI DIMM temperature monitoring support"
+    depends on OF
+    depends on PECI
+    help
+  If you say yes here you get support for the generic Intel PECI hwmon
+  driver which provides Digital Thermal Sensor (DTS) thermal readings of
+  DIMM components that are accessible using the PECI Client Command
+  Suite via the processor PECI client.
+  Check Documentation/hwmon/peci-dimmtemp for details.
+
+  This driver can also be built as a module.  If so, the module
+  will be called peci-dimmtemp.
+
  config SENSORS_NSA320
  tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
  depends on GPIOLIB && OF
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index e7d52a36e6c4..48d9598fcd3a 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)    += nct7802.o
  obj-$(CONFIG_SENSORS_NCT7904)    += nct7904.o
  obj-$(CONFIG_SENSORS_NSA320)    += nsa320-hwmon.o
  obj-$(CONFIG_SENSORS_NTC_THERMISTOR)    += ntc_thermistor.o
+obj-$(CONFIG_SENSORS_PECI_CPUTEMP)    += peci-cputemp.o
+obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
  obj-$(CONFIG_SENSORS_PC87360)    += pc87360.o
  obj-$(CONFIG_SENSORS_PC87427)    += pc87427.o
  obj-$(CONFIG_SENSORS_PCF8591)    += pcf8591.o
diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
new file mode 100644
index ..f0bc92687512
--- /dev/null
+++ b/drivers/hwmon/peci-cputemp.c
@@ -0,0 +1,783 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include 
+#include 
+#include 


Is this include needed ?



No it isn't. Will drop the line.


+#include 
+#include 
+#include 
+#include 
+
+#define TEMP_TYPE_PECI    6  /* Sensor type 6: Intel PECI */
+
+#define CORE_MAX_ON_HSX   18 /* Max number of cores on Haswell */
+#define CORE_MAX_ON_BDX   24 /* Max number of cores on Broadwell */
+#define CORE_MAX_ON_SKX   28 /* Max number of cores on Skylake */
+
+#define DEFAULT_CHANNEL_NUMS  5
+#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
+#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
+
+#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
+
+#define UPDATE_INTERVAL_MIN   HZ
+
+enum cpu_gens {
+    CPU_GEN_HSX, /* Haswell Xeon */
+    CPU_GEN_BRX, /* Broadwell Xeon */
+    CPU_GEN_SKX, /* Skylake Xeon */
+    CPU_GEN_MAX
+};
+
+struct cpu_gen_info {
+    u32 type;
+    u32 cpu_id;
+    u32 core_max;
+};
+
+struct temp_data {
+    bool valid;
+    s32  value;
+    unsigned long last_updated;
+};
+
+struct temp_group {
+    struct temp_data die;
+    struct temp_data dts_margin;
+    struct temp_data tcontrol;
+    struct temp_data tthrottle;
+    struct temp_data tjmax;
+    struct temp_data core[CORETEMP_CHANNEL_NUMS];
+};
+
+struct peci_cputemp {
+    struct peci_client *client;
+    struct device *dev;

Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers

2018-04-11 Thread Jae Hyun Yoo

Hi Guenter,

Thanks a lot for sharing your time. Please see my inline answers.

On 4/10/2018 3:28 PM, Guenter Roeck wrote:

On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:

This commit adds PECI cputemp and dimmtemp hwmon drivers.

Signed-off-by: Jae Hyun Yoo 
Reviewed-by: Haiyue Wang 
Reviewed-by: James Feist 
Reviewed-by: Vernon Mauery 
Cc: Alan Cox 
Cc: Andrew Jeffery 
Cc: Andrew Lunn 
Cc: Andy Shevchenko 
Cc: Arnd Bergmann 
Cc: Benjamin Herrenschmidt 
Cc: Fengguang Wu 
Cc: Greg KH 
Cc: Guenter Roeck 
Cc: Jason M Biils 
Cc: Jean Delvare 
Cc: Joel Stanley 
Cc: Julia Cartwright 
Cc: Miguel Ojeda 
Cc: Milton Miller II 
Cc: Pavel Machek 
Cc: Randy Dunlap 
Cc: Stef van Os 
Cc: Sumeet R Pawnikar 
---
  drivers/hwmon/Kconfig |  28 ++
  drivers/hwmon/Makefile|   2 +
  drivers/hwmon/peci-cputemp.c  | 783 ++
  drivers/hwmon/peci-dimmtemp.c | 432 +++
  4 files changed, 1245 insertions(+)
  create mode 100644 drivers/hwmon/peci-cputemp.c
  create mode 100644 drivers/hwmon/peci-dimmtemp.c

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index f249a4428458..c52f610f81d0 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
  This driver can also be built as a module.  If so, the module
  will be called nct7904.
  
+config SENSORS_PECI_CPUTEMP

+   tristate "PECI CPU temperature monitoring support"
+   depends on OF
+   depends on PECI
+   help
+ If you say yes here you get support for the generic Intel PECI
+ cputemp driver which provides Digital Thermal Sensor (DTS) thermal
+ readings of the CPU package and CPU cores that are accessible using
+ the PECI Client Command Suite via the processor PECI client.
+ Check Documentation/hwmon/peci-cputemp for details.
+
+ This driver can also be built as a module.  If so, the module
+ will be called peci-cputemp.
+
+config SENSORS_PECI_DIMMTEMP
+   tristate "PECI DIMM temperature monitoring support"
+   depends on OF
+   depends on PECI
+   help
+ If you say yes here you get support for the generic Intel PECI hwmon
+ driver which provides Digital Thermal Sensor (DTS) thermal readings of
+ DIMM components that are accessible using the PECI Client Command
+ Suite via the processor PECI client.
+ Check Documentation/hwmon/peci-dimmtemp for details.
+
+ This driver can also be built as a module.  If so, the module
+ will be called peci-dimmtemp.
+
  config SENSORS_NSA320
tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
depends on GPIOLIB && OF
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index e7d52a36e6c4..48d9598fcd3a 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)   += nct7802.o
  obj-$(CONFIG_SENSORS_NCT7904) += nct7904.o
  obj-$(CONFIG_SENSORS_NSA320)  += nsa320-hwmon.o
  obj-$(CONFIG_SENSORS_NTC_THERMISTOR)  += ntc_thermistor.o
+obj-$(CONFIG_SENSORS_PECI_CPUTEMP) += peci-cputemp.o
+obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)+= peci-dimmtemp.o
  obj-$(CONFIG_SENSORS_PC87360) += pc87360.o
  obj-$(CONFIG_SENSORS_PC87427) += pc87427.o
  obj-$(CONFIG_SENSORS_PCF8591) += pcf8591.o
diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
new file mode 100644
index ..f0bc92687512
--- /dev/null
+++ b/drivers/hwmon/peci-cputemp.c
@@ -0,0 +1,783 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include 
+#include 
+#include 


Is this include needed ?



No it isn't. Will drop the line.


+#include 
+#include 
+#include 
+#include 
+
+#define TEMP_TYPE_PECI6  /* Sensor type 6: Intel PECI */
+
+#define CORE_MAX_ON_HSX   18 /* Max number of cores on Haswell */
+#define CORE_MAX_ON_BDX   24 /* Max number of cores on Broadwell */
+#define CORE_MAX_ON_SKX   28 /* Max number of cores on Skylake */
+
+#define DEFAULT_CHANNEL_NUMS  5
+#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
+#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
+
+#define CLIENT_CPU_ID_MASK0xf0ff0  /* Mask for Family / Model info */
+
+#define UPDATE_INTERVAL_MIN   HZ
+
+enum cpu_gens {
+   CPU_GEN_HSX, /* Haswell Xeon */
+   CPU_GEN_BRX, /* Broadwell Xeon */
+   CPU_GEN_SKX, /* Skylake Xeon */
+   CPU_GEN_MAX
+};
+
+struct cpu_gen_info {
+   u32 type;
+   u32 cpu_id;
+   u32 core_max;
+};
+
+struct temp_data {
+   bool valid;
+   s32  value;
+   unsigned long last_updated;
+};
+
+struct temp_group {
+   struct temp_data die;
+   struct temp_data dts_margin;
+   struct temp_data tcontrol;
+   struct temp_data tthrottle;
+   struct temp_data tjmax;
+   struct temp_data core[CORETEMP_CHANNEL_NUMS];
+};
+
+struct peci

Re: [RFC bpf-next v2 7/8] bpf: add documentation for eBPF helpers (51-57)

2018-04-11 Thread Quentin Monnet
2018-04-11 16:44 UTC+0100 ~ Quentin Monnet 
> 2018-04-10 09:58 UTC-0700 ~ Yonghong Song 
>> On 4/10/18 7:41 AM, Quentin Monnet wrote:
>>> Add documentation for eBPF helper functions to bpf.h user header file.
>>> This documentation can be parsed with the Python script provided in
>>> another commit of the patch series, in order to provide a RST document
>>> that can later be converted into a man page.
>>>
>>> The objective is to make the documentation easily understandable and
>>> accessible to all eBPF developers, including beginners.
>>>
>>> This patch contains descriptions for the following helper functions:
>>>
>>> Helpers from Lawrence:
>>> - bpf_setsockopt()
>>> - bpf_getsockopt()
>>> - bpf_sock_ops_cb_flags_set()
>>>
>>> Helpers from Yonghong:
>>> - bpf_perf_event_read_value()
>>> - bpf_perf_prog_read_value()
>>>
>>> Helper from Josef:
>>> - bpf_override_return()
>>>
>>> Helper from Andrey:
>>> - bpf_bind()
>>>
>>> Cc: Lawrence Brakmo 
>>> Cc: Yonghong Song 
>>> Cc: Josef Bacik 
>>> Cc: Andrey Ignatov 
>>> Signed-off-by: Quentin Monnet 
>>> ---

> [...]
> 
> Thanks Yonghong for the review!
> 
> I have a favor to ask of you. I got a bounce for Kaixu Xia's email
> address, and I don't know what alternative email address I could use. I
> CC-ed to have a review for helper bpf_perf_event_read() (in patch 6 of
> this series), which is rather close to bpf_perf_event_read_value().
> Would you mind having a look at that one too, please? The description is
> not long.

Well I read again the description I wrote, and actually the one for
bpf_perf_evnet_read() is nearly a subset of the one for
perf_event_read_value(). So the same comments that you raised earlier
apply, there's probably nothing more to review. But if you notice that
some important info is missing for bpf_perf_event_read(), I'm interested
too!

Quentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC bpf-next v2 8/8] bpf: add documentation for eBPF helpers (58-64)

2018-04-11 Thread Quentin Monnet
2018-04-11 12:17 UTC+0200 ~ Jesper Dangaard Brouer 
> On Tue, 10 Apr 2018 15:41:57 +0100
> Quentin Monnet  wrote:
> 
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 7343af4196c8..db090ad03626 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -1250,6 +1250,51 @@ union bpf_attr {
>>   *  Return
>>   *  0 on success, or a negative error in case of failure.
>>   *
>> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
>> + *  Description
>> + *  Redirect the packet to the endpoint referenced by *map* at
>> + *  index *key*. Depending on its type, his *map* can contain
>> + *  references to net devices (for forwarding packets through other
>> + *  ports), or to CPUs (for redirecting XDP frames to another CPU;
>> + *  but this is not fully implemented as of this writing).
> 
> Stating that CPUMAP redirect "is not fully implemented" is confusing.
> The issue is that CPUMAP only works for "native" XDP.
> 
> What about saying:
> 
> "[...] or to CPUs (for redirecting XDP frames to another CPU;
>  but this is only implemented for native XDP as of this writing)"
> 

Fine by me, I will change it. Thank you Jesper for the review!

Quentin


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC bpf-next v2 7/8] bpf: add documentation for eBPF helpers (51-57)

2018-04-11 Thread Quentin Monnet
2018-04-10 10:50 UTC-0700 ~ Andrey Ignatov 
> Quentin Monnet  [Tue, 2018-04-10 07:43 -0700]:
>> + * int bpf_bind(struct bpf_sock_addr_kern *ctx, struct sockaddr *addr, int 
>> addr_len)
>> + *  Description
>> + *  Bind the socket associated to *ctx* to the address pointed by
>> + *  *addr*, of length *addr_len*. This allows for making outgoing
>> + *  connection from the desired IP address, which can be useful for
>> + *  example when all processes inside a cgroup should use one
>> + *  single IP address on a host that has multiple IP configured.
>> + *
>> + *  This helper works for IPv4 and IPv6, TCP and UDP sockets. The
>> + *  domain (*addr*\ **->sa_family**) must be **AF_INET** (or
>> + *  **AF_INET6**). Looking for a free port to bind to can be
>> + *  expensive, therefore binding to port is not permitted by the
>> + *  helper: *addr*\ **->sin_port** (or **sin6_port**, respectively)
>> + *  must be set to zero.
>> + *
>> + *  As for the remote end, both parts of it can be overridden,
>> + *  remote IP and remote port. This can be useful if an application
>> + *  inside a cgroup wants to connect to another application inside
>> + *  the same cgroup or to itself, but knows nothing about the IP
>> + *  address assigned to the cgroup.
> 
> The last paragraph ("As for the remote end ...") is not relevant to
> bpf_bind() and should be removed. It's about sys_connect hook itself
> that can call to bpf_bind() but also has other functionality (and that
> other functionality is described by this paragraph).

Thanks Andrey, I will remove this paragraph.

Quentin



--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC bpf-next v2 7/8] bpf: add documentation for eBPF helpers (51-57)

2018-04-11 Thread Quentin Monnet
2018-04-10 09:58 UTC-0700 ~ Yonghong Song 
> On 4/10/18 7:41 AM, Quentin Monnet wrote:
>> Add documentation for eBPF helper functions to bpf.h user header file.
>> This documentation can be parsed with the Python script provided in
>> another commit of the patch series, in order to provide a RST document
>> that can later be converted into a man page.
>>
>> The objective is to make the documentation easily understandable and
>> accessible to all eBPF developers, including beginners.
>>
>> This patch contains descriptions for the following helper functions:
>>
>> Helpers from Lawrence:
>> - bpf_setsockopt()
>> - bpf_getsockopt()
>> - bpf_sock_ops_cb_flags_set()
>>
>> Helpers from Yonghong:
>> - bpf_perf_event_read_value()
>> - bpf_perf_prog_read_value()
>>
>> Helper from Josef:
>> - bpf_override_return()
>>
>> Helper from Andrey:
>> - bpf_bind()
>>
>> Cc: Lawrence Brakmo 
>> Cc: Yonghong Song 
>> Cc: Josef Bacik 
>> Cc: Andrey Ignatov 
>> Signed-off-by: Quentin Monnet 
>> ---
>>   include/uapi/linux/bpf.h | 184
>> +++
>>   1 file changed, 184 insertions(+)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 15d9ccafebbe..7343af4196c8 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h

[...]

>> @@ -1255,6 +1277,168 @@ union bpf_attr {
>>    * performed again.
>>    * Return
>>    * 0 on success, or a negative error in case of failure.
>> + *
>> + * int bpf_perf_event_read_value(struct bpf_map *map, u64 flags,
>> struct bpf_perf_event_value *buf, u32 buf_size)
>> + * Description
>> + * Read the value of a perf event counter, and store it into
>> *buf*
>> + * of size *buf_size*. This helper relies on a *map* of type
>> + * **BPF_MAP_TYPE_PERF_EVENT_ARRAY**. The nature of the perf
>> + * event counter is selected at the creation of the *map*. The
> 
> The nature of the perf event counter is selected when *map* is updated
> with perf_event fd's.
> 

Thanks, I will fix it.

>> + * *map* is an array whose size is the number of available CPU
>> + * cores, and each cell contains a value relative to one
>> core. The
> 
> It is confusing to mix core/cpu here. Maybe just use perf_event
> convention, always using cpu?
> 

Right, I'll remove occurrences of "core".

>> + * value to retrieve is indicated by *flags*, that contains the
>> + * index of the core to look up, masked with
>> + * **BPF_F_INDEX_MASK**. Alternatively, *flags* can be set to
>> + * **BPF_F_CURRENT_CPU** to indicate that the value for the
>> + * current CPU core should be retrieved.
>> + *
>> + * This helper behaves in a way close to
>> + * **bpf_perf_event_read**\ () helper, save that instead of
>> + * just returning the value observed, it fills the *buf*
>> + * structure. This allows for additional data to be
>> retrieved: in
>> + * particular, the enabled and running times (in *buf*\
>> + * **->enabled** and *buf*\ **->running**, respectively) are
>> + * copied.
>> + *
>> + * These values are interesting, because hardware PMU
>> (Performance
>> + * Monitoring Unit) counters are limited resources. When
>> there are
>> + * more PMU based perf events opened than available counters,
>> + * kernel will multiplex these events so each event gets certain
>> + * percentage (but not all) of the PMU time. In case that
>> + * multiplexing happens, the number of samples or counter value
>> + * will not reflect the case compared to when no multiplexing
>> + * occurs. This makes comparison between different runs
>> difficult.
>> + * Typically, the counter value should be normalized before
>> + * comparing to other experiments. The usual normalization is
>> done
>> + * as follows.
>> + *
>> + * ::
>> + *
>> + * normalized_counter = counter * t_enabled / t_running
>> + *
>> + * Where t_enabled is the time enabled for event and
>> t_running is
>> + * the time running for event since last normalization. The
>> + * enabled and running times are accumulated since the perf
>> event
>> + * open. To achieve scaling factor between two invocations of an
>> + * eBPF program, users can can use CPU id as the key (which is
>> + * typical for perf array usage model) to remember the previous
>> + * value and do the calculation inside the eBPF program.
>> + * Return
>> + * 0 on success, or a negative error in case of failure.
>> + *

[...]

Thanks Yonghong for the review!

I have a favor to ask of you. I got a bounce for Kaixu Xia's email
address, and I don't know what alternative email address I could use. I
CC-ed to have a review for helper bpf_perf_event_read() (in patch 6 of
this series), which is rather close to bpf_perf_event_read_value().
Wou

[RFC bpf-next v2 3/8] bpf: add documentation for eBPF helpers (12-22)

2018-04-11 Thread Quentin Monnet
2018-04-10 15:43 UTC-0700 ~ Alexei Starovoitov

> On Tue, Apr 10, 2018 at 03:41:52PM +0100, Quentin Monnet wrote:
>> Add documentation for eBPF helper functions to bpf.h user header file.
>> This documentation can be parsed with the Python script provided in
>> another commit of the patch series, in order to provide a RST document
>> that can later be converted into a man page.
>>
>> The objective is to make the documentation easily understandable and
>> accessible to all eBPF developers, including beginners.
>>
>> This patch contains descriptions for the following helper functions, all
>> writter by Alexei:
>>
>> - bpf_get_current_pid_tgid()
>> - bpf_get_current_uid_gid()
>> - bpf_get_current_comm()
>> - bpf_skb_vlan_push()
>> - bpf_skb_vlan_pop()
>> - bpf_skb_get_tunnel_key()
>> - bpf_skb_set_tunnel_key()
>> - bpf_redirect()
>> - bpf_perf_event_output()
>> - bpf_get_stackid()
>> - bpf_get_current_task()
>>
>> Cc: Alexei Starovoitov 
>> Signed-off-by: Quentin Monnet 
>> ---
>>  include/uapi/linux/bpf.h | 237 
>> +++
>>  1 file changed, 237 insertions(+)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 2bc653a3a20f..f3ea8824efbc 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -580,6 +580,243 @@ union bpf_attr {
>>   *  performed again.
>>   *  Return
>>   *  0 on success, or a negative error in case of failure.
>> + *
>> + * u64 bpf_get_current_pid_tgid(void)
>> + *  Return
>> + *  A 64-bit integer containing the current tgid and pid, and
>> + *  created as such:
>> + *  *current_task*\ **->tgid << 32 \|**
>> + *  *current_task*\ **->pid**.
>> + *
>> + * u64 bpf_get_current_uid_gid(void)
>> + *  Return
>> + *  A 64-bit integer containing the current GID and UID, and
>> + *  created as such: *current_gid* **<< 32 \|** *current_uid*.
>> + *
>> + * int bpf_get_current_comm(char *buf, u32 size_of_buf)
>> + *  Description
>> + *  Copy the **comm** attribute of the current task into *buf* of
>> + *  *size_of_buf*. The **comm** attribute contains the name of
>> + *  the executable (excluding the path) for the current task. The
>> + *  *size_of_buf* must be strictly positive. On success, the
> 
> that reminds me that we probably should relax it to ARG_CONST_SIZE_OR_ZERO.
> The programs won't be passing an actual zero into it, but it helps
> a lot to tell verifier that zero is also valid, since programs
> become much simpler.
> 

Ok. No change to helper description for now, we will update here when
your patch lands.

>> + *  helper makes sure that the *buf* is NUL-terminated. On failure,
>> + *  it is filled with zeroes.
>> + *  Return
>> + *  0 on success, or a negative error in case of failure.
>> + *
>> + * int bpf_skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 
>> vlan_tci)
>> + *  Description
>> + *  Push a *vlan_tci* (VLAN tag control information) of protocol
>> + *  *vlan_proto* to the packet associated to *skb*, then update
>> + *  the checksum. Note that if *vlan_proto* is different from
>> + *  **ETH_P_8021Q** and **ETH_P_8021AD**, it is considered to
>> + *  be **ETH_P_8021Q**.
>> + *
>> + *  A call to this helper is susceptible to change data from the
>> + *  packet. Therefore, at load time, all checks on pointers
>> + *  previously done by the verifier are invalidated and must be
>> + *  performed again.
>> + *  Return
>> + *  0 on success, or a negative error in case of failure.
>> + *
>> + * int bpf_skb_vlan_pop(struct sk_buff *skb)
>> + *  Description
>> + *  Pop a VLAN header from the packet associated to *skb*.
>> + *
>> + *  A call to this helper is susceptible to change data from the
>> + *  packet. Therefore, at load time, all checks on pointers
>> + *  previously done by the verifier are invalidated and must be
>> + *  performed again.
>> + *  Return
>> + *  0 on success, or a negative error in case of failure.
>> + *
>> + * int bpf_skb_get_tunnel_key(struct sk_buff *skb, struct bpf_tunnel_key 
>> *key, u32 size, u64 flags)
>> + *  Description
>> + *  Get tunnel metadata. This helper takes a pointer *key* to an
>> + *  empty **struct bpf_tunnel_key** of **size**, that will be
>> + *  filled with tunnel metadata for the packet associated to *skb*.
>> + *  The *flags* can be set to **BPF_F_TUNINFO_IPV6**, which
>> + *  indicates that the tunnel is based on IPv6 protocol instead of
>> + *  IPv4.
>> + *
>> + *  This is typically used on the receive path to perform a lookup
>> + *  or a packet redirection based on the value of *key*:
> 
> above is correct, but feels a bit cryptic.
> May be give more concrete example for particular tunneling protocol like gre
> and say that tu

[RFC bpf-next v2 2/8] bpf: add documentation for eBPF helpers (01-11)

2018-04-11 Thread Quentin Monnet
2018-04-10 10:56 UTC-0700 ~ Alexei Starovoitov

> On Tue, Apr 10, 2018 at 03:41:51PM +0100, Quentin Monnet wrote:
>> Add documentation for eBPF helper functions to bpf.h user header file.
>> This documentation can be parsed with the Python script provided in
>> another commit of the patch series, in order to provide a RST document
>> that can later be converted into a man page.
>>
>> The objective is to make the documentation easily understandable and
>> accessible to all eBPF developers, including beginners.
>>
>> This patch contains descriptions for the following helper functions, all
>> written by Alexei:
>>
>> - bpf_map_lookup_elem()
>> - bpf_map_update_elem()
>> - bpf_map_delete_elem()
>> - bpf_probe_read()
>> - bpf_ktime_get_ns()
>> - bpf_trace_printk()
>> - bpf_skb_store_bytes()
>> - bpf_l3_csum_replace()
>> - bpf_l4_csum_replace()
>> - bpf_tail_call()
>> - bpf_clone_redirect()
>>
>> Cc: Alexei Starovoitov 
>> Signed-off-by: Quentin Monnet 
>> ---
>>  include/uapi/linux/bpf.h | 199 
>> +++
>>  1 file changed, 199 insertions(+)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 45f77f01e672..2bc653a3a20f 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -381,6 +381,205 @@ union bpf_attr {
>>   * intentional, removing them would break paragraphs for rst2man.
>>   *
>>   * Start of BPF helper function descriptions:
>> + *
>> + * void *bpf_map_lookup_elem(struct bpf_map *map, void *key)
>> + *  Description
>> + *  Perform a lookup in *map* for an entry associated to *key*.
>> + *  Return
>> + *  Map value associated to *key*, or **NULL** if no entry was
>> + *  found.
>> + *
>> + * int bpf_map_update_elem(struct bpf_map *map, void *key, void *value, u64 
>> flags)
>> + *  Description
>> + *  Add or update the value of the entry associated to *key* in
>> + *  *map* with *value*. *flags* is one of:
>> + *
>> + *  **BPF_NOEXIST**
>> + *  The entry for *key* must not exist in the map.
>> + *  **BPF_EXIST**
>> + *  The entry for *key* must already exist in the map.
>> + *  **BPF_ANY**
>> + *  No condition on the existence of the entry for *key*.
>> + *
>> + *  These flags are only useful for maps of type
>> + *  **BPF_MAP_TYPE_HASH**. For all other map types, **BPF_ANY**
>> + *  should be used.
> 
> I think that's not entirely accurate.
> The flags work as expected for all other map types as well
> and for lru map, sockmap, map in map the flags have practical use cases.
> 

Ok, I missed that. I have to go back and check how the flags are used
for those maps. I will cook up something cleaner for the next version of
the set.

>> + *  Return
>> + *  0 on success, or a negative error in case of failure.
>> + *

[...]

>> + *
>> + * int bpf_trace_printk(const char *fmt, u32 fmt_size, ...)
>> + *  Description
>> + *  This helper is a "printk()-like" facility for debugging. It
>> + *  prints a message defined by format *fmt* (of size *fmt_size*)
>> + *  to file *\/sys/kernel/debug/tracing/trace* from DebugFS, if
>> + *  available. It can take up to three additional **u64**
>> + *  arguments (as an eBPF helpers, the total number of arguments is
>> + *  limited to five). Each time the helper is called, it appends a
>> + *  line that looks like the following:
>> + *
>> + *  ::
>> + *
>> + *  telnet-470   [001] .N.. 419421.045894: 0x0001: BPF 
>> command: 2
>> + *
>> + *  In the above:
>> + *
>> + *  * ``telnet`` is the name of the current task.
>> + *  * ``470`` is the PID of the current task.
>> + *  * ``001`` is the CPU number on which the task is
>> + *running.
>> + *  * In ``.N..``, each character refers to a set of
>> + *options (whether irqs are enabled, scheduling
>> + *options, whether hard/softirqs are running, level of
>> + *preempt_disabled respectively). **N** means that
>> + ***TIF_NEED_RESCHED** and **PREEMPT_NEED_RESCHED**
>> + *are set.
>> + *  * ``419421.045894`` is a timestamp.
>> + *  * ``0x0001`` is a fake value used by BPF for the
>> + *instruction pointer register.
>> + *  * ``BPF command: 2`` is the message formatted with
>> + **fmt*.
> 
> the above depends on how trace_pipe was configured. It's a default
> configuration for many, but would be good to explain this a bit better.
> 

I did not know about that. Would you have a pointer about how to
configure trace_pipe, please?

>> + *
>> + *  The conversion specifiers supported by *fmt* are similar, but
>> + *  more limited tha

[RFC bpf-next v2 1/8] bpf: add script and prepare bpf.h for new helpers documentation

2018-04-11 Thread Quentin Monnet
2018-04-10 11:16 UTC-0700 ~ Alexei Starovoitov

> On Tue, Apr 10, 2018 at 03:41:50PM +0100, Quentin Monnet wrote:
>> Remove previous "overview" of eBPF helpers from user bpf.h header.
>> Replace it by a comment explaining how to process the new documentation
>> (to come in following patches) with a Python script to produce RST, then
>> man page documentation.
>>
>> Also add the aforementioned Python script under scripts/. It is used to
>> process include/uapi/linux/bpf.h and to extract helper descriptions, to
>> turn it into a RST document that can further be processed with rst2man
>> to produce a man page. The script takes one "--filename "
>> option. If the script is launched from scripts/ in the kernel root
>> directory, it should be able to find the location of the header to
>> parse, and "--filename " is then optional. If it cannot
>> find the file, then the option becomes mandatory. RST-formatted
>> documentation is printed to standard output.
>>
>> Typical workflow for producing the final man page would be:
>>
>> $ ./scripts/bpf_helpers_doc.py \
>> --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst
>> $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7
>> $ man /tmp/bpf-helpers.7
>>
>> Note that the tool kernel-doc cannot be used to document eBPF helpers,
>> whose signatures are not available directly in the header files
>> (pre-processor directives are used to produce them at the beginning of
>> the compilation process).
>>
>> Signed-off-by: Quentin Monnet 
>> ---
>>  include/uapi/linux/bpf.h   | 406 
>> ++--
>>  scripts/bpf_helpers_doc.py | 414 
>> +
>>  2 files changed, 430 insertions(+), 390 deletions(-)
>>  create mode 100755 scripts/bpf_helpers_doc.py
>>

[...]

>> diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py
>> new file mode 100755
>> index ..3a15ba3f0a83
>> --- /dev/null
>> +++ b/scripts/bpf_helpers_doc.py
>> @@ -0,0 +1,414 @@
>> +#!/usr/bin/python3
>> +#
>> +# Copyright (C) 2018 Netronome Systems, Inc.
>> +#
>> +# This software is licensed under the GNU General License Version 2,
>> +# June 1991 as shown in the file COPYING in the top-level directory of this
>> +# source tree.
> 
> please use SPDX instead.
> 

Same as for bpftool, our layers remain a bit cautious about it. I'd be
happy to change it for SPDX as a follow-up when I get the green light.

>> +#
>> +# THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
>> +# WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
>> +# BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
>> +# FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND 
>> PERFORMANCE
>> +# OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
>> +# THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
>> +

[...]

>> +class PrinterRST(Printer):
>> +"""
>> +A printer for dumping collected information about helpers as a 
>> ReStructured
>> +Text page compatible with the rst2man program, which can be used to
>> +generate a manual page for the helpers.
>> +@helpers: array of Helper objects to print to standard output
>> +"""
>> +def print_header(self):
>> +header = '''\
>> +.. Copyright (C) 2018 Netronome Systems, Inc.
> 
> I think would be good to capture copyrights of all authors that added
> the helpers being documented. Since a lot of text was copied from commit
> logs it's only fair to preserve the copyrights.
> Such man page file is automatically generated by the python script
> and script itself is copyrighted by Netronome. That's fine, but the text
> of man page is not netronome only.
> I'm not sure what would be the solution. May be something like:
> "
> Copyright (C) All BPF authors and contributors from 2011 to present
> See git log include/uapi/linux/bpf.h for details
> "
> ?

Seems fair indeed. I do not have a better suggestion myself, so I will
stick to yours.

Out of curiosity, why 2011 for the year? I thought you introduced eBPF
in the kernel in 2014 (bd4cf0ed331a), and I do not believe helpers have
any link with cBPF?

>> +.. 
>> +.. %%%LICENSE_START(VERBATIM)
>> +.. Permission is granted to make and distribute verbatim copies of this
>> +.. manual provided the copyright notice and this permission notice are
>> +.. preserved on all copies.
>> +.. 
>> +.. Permission is granted to copy and distribute modified versions of this
>> +.. manual under the conditions for verbatim copying, provided that the
>> +.. entire resulting derived work is distributed under the terms of a
>> +.. permission notice identical to this one.
>> +.. 
>> +.. Since the Linux kernel and libraries are constantly changing, this
>> +.. manual page may be incorrect or out-of-date.  The author(s) assume no
>> +.. responsibility for errors or omissions, or for damages resulting from
>> +.. the use of the information c

[RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning

2018-04-11 Thread Boqun Feng
This patch add the documentation piece for the reasoning of deadlock
detection related to recursive read lock. The following sections are
added:

*   Explain what is a recursive read lock, and what deadlock cases
they could introduce.

*   Introduce the notations for different types of dependencies, and
the definition of strong paths.

*   Proof for a closed strong path is both sufficient and necessary
for deadlock detections with recursive read locks involved. The
proof could also explain why we call the path "strong"

Signed-off-by: Boqun Feng 
---
 Documentation/locking/lockdep-design.txt | 178 +++
 1 file changed, 178 insertions(+)

diff --git a/Documentation/locking/lockdep-design.txt 
b/Documentation/locking/lockdep-design.txt
index 9de1c158d44c..6bb9e90e2c4f 100644
--- a/Documentation/locking/lockdep-design.txt
+++ b/Documentation/locking/lockdep-design.txt
@@ -284,3 +284,181 @@ Run the command and save the output, then compare against 
the output from
 a later run of this command to identify the leakers.  This same output
 can also help you find situations where runtime lock initialization has
 been omitted.
+
+Recursive read locks:
+-
+
+Lockdep now is equipped with deadlock detection for recursive read locks.
+
+Recursive read locks, as their name indicates, are the locks able to be
+acquired recursively. Unlike non-recursive read locks, recursive read locks
+only get blocked by current write lock *holders* other than write lock
+*waiters*, for example:
+
+   TASK A: TASK B:
+
+   read_lock(X);
+
+   write_lock(X);
+
+   read_lock(X);
+
+is not a deadlock for recursive read locks, as while the task B is waiting for
+the lock X, the second read_lock() doesn't need to wait because it's a 
recursive
+read lock. However if the read_lock() is non-recursive read lock, then the 
above
+case is a deadlock, because even if the write_lock() in TASK B can not get the
+lock, but it can block the second read_lock() in TASK A.
+
+Note that a lock can be a write lock (exclusive lock), a non-recursive read
+lock (non-recursive shared lock) or a recursive read lock (recursive shared
+lock), depending on the lock operations used to acquire it (more specifically,
+the value of the 'read' parameter for lock_acquire()). In other words, a single
+lock instance has three types of acquisition depending on the acquisition
+functions: exclusive, non-recursive read, and recursive read.
+
+To be concise, we call that write locks and non-recursive read locks as
+"non-recursive" locks and recursive read locks as "recursive" locks.
+
+Recursive locks don't block each other, while non-recursive locks do (this is
+even true for two non-recursive read locks). A non-recursive lock can block the
+corresponding recursive lock, and vice versa.
+
+A deadlock case with recursive locks involved is as follow:
+
+   TASK A: TASK B:
+
+   read_lock(X);
+   read_lock(Y);
+   write_lock(Y);
+   write_lock(X);
+
+Task A is waiting for task B to read_unlock() Y and task B is waiting for task
+A to read_unlock() X.
+
+Dependency types and strong dependency paths:
+-
+In order to detect deadlocks as above, lockdep needs to track different 
dependencies.
+There are 4 categories for dependency edges in the lockdep graph:
+
+1) -(NN)->: non-recursive to non-recursive dependency. "X -(NN)-> Y" means
+X -> Y and both X and Y are non-recursive locks.
+
+2) -(RN)->: recursive to non-recursive dependency. "X -(RN)-> Y" means
+X -> Y and X is recursive read lock and Y is non-recursive lock.
+
+3) -(NR)->: non-recursive to recursive dependency, "X -(NR)-> Y" means
+X -> Y and X is non-recursive lock and Y is recursive lock.
+
+4) -(RR)->: recursive to recursive dependency, "X -(RR)-> Y" means
+X -> Y and both X and Y are recursive locks.
+
+Note that given two locks, they may have multiple dependencies between them, 
for example:
+
+   TASK A:
+
+   read_lock(X);
+   write_lock(Y);
+   ...
+
+   TASK B:
+
+   write_lock(X);
+   write_lock(Y);
+
+, we have both X -(RN)-> Y and X -(NN)-> Y in the dependency graph.
+
+We use -(*N)-> for edges that is either -(RN)-> or -(NN)->, the similar for 
-(N*)->,
+-(*R)-> and -(R*)->
+
+A "path" is a series of conjunct dependency edges in the graph. And we define a
+"strong" path, which indicates the strong dependency throughout each dependency
+in the path, as the path that doesn't have two conjunct edges (dependencies) as
+-(*R)-> and -(R*)->. In other words, a "strong" path is a path from a lock
+walking to another through the lock dependencies, and if X -> Y -> Z in the
+path (where X, Y, Z are locks), if the walk from X to Y is through a -(NR)-> or
+-(RR)-> dependency,

[RFC tip/locking/lockdep v6 04/20] lockdep: Redefine LOCK_*_STATE* bits

2018-04-11 Thread Boqun Feng
There are three types of lock acquisitions: write, non-recursive read
and recursive read, among which write locks and non-recursive read locks
have no difference from a viewpoint for deadlock detections, because a
write acquisition of the corresponding lock on an independent CPU or
task makes a non-recursive read lock act as a write lock in the sense of
deadlock. So we could treat them as the same type (named as
"non-recursive lock") in lockdep.

As in the irq lock inversion detection (safe->unsafe deadlock
detection), we used to differ write lock with read lock (non-recursive
and recursive ones), such a classification could be improved as
non-recursive read lock behaves the same as write lock, so this patch
redefines the meanings of LOCK_{USED_IN, ENABLED}_STATE*.

old:
LOCK_* : stands for write lock
LOCK_*_READ: stands for read lock(non-recursive and recursive)
new:
LOCK_* : stands for non-recursive(write lock and non-recursive
read lock)
LOCK_*_RR: stands for recursive read lock

Such a change is needed for a future improvement on recursive read
related irq inversion deadlock detection.

Signed-off-by: Boqun Feng 
---
 Documentation/locking/lockdep-design.txt |  6 +++---
 kernel/locking/lockdep.c | 28 ++--
 kernel/locking/lockdep_internals.h   | 16 
 kernel/locking/lockdep_proc.c| 12 ++--
 4 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/Documentation/locking/lockdep-design.txt 
b/Documentation/locking/lockdep-design.txt
index 6bb9e90e2c4f..53ede30ce16d 100644
--- a/Documentation/locking/lockdep-design.txt
+++ b/Documentation/locking/lockdep-design.txt
@@ -30,9 +30,9 @@ State
 The validator tracks lock-class usage history into 4n + 1 separate state bits:
 
 - 'ever held in STATE context'
-- 'ever held as readlock in STATE context'
+- 'ever held as recursive readlock in STATE context'
 - 'ever held with STATE enabled'
-- 'ever held as readlock with STATE enabled'
+- 'ever held as recurisve readlock with STATE enabled'
 
 Where STATE can be either one of (kernel/locking/lockdep_states.h)
  - hardirq
@@ -51,7 +51,7 @@ locking error messages, inside curlies. A contrived example:
 (&sio_locks[i].lock){-.-...}, at: [] mutex_lock+0x21/0x24
 
 
-The bit position indicates STATE, STATE-read, for each of the states listed
+The bit position indicates STATE, STATE-RR, for each of the states listed
 above, and the character displayed in each indicates:
 
'.'  acquired while irqs disabled and not in irq context
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index f39a071ef0a8..14af2327b52a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -448,10 +448,10 @@ DEFINE_PER_CPU(struct lockdep_stats, lockdep_stats);
  */
 
 #define __USAGE(__STATE)   \
-   [LOCK_USED_IN_##__STATE] = "IN-"__stringify(__STATE)"-W",   \
-   [LOCK_ENABLED_##__STATE] = __stringify(__STATE)"-ON-W", \
-   [LOCK_USED_IN_##__STATE##_READ] = "IN-"__stringify(__STATE)"-R",\
-   [LOCK_ENABLED_##__STATE##_READ] = __stringify(__STATE)"-ON-R",
+   [LOCK_USED_IN_##__STATE] = "IN-"__stringify(__STATE),   \
+   [LOCK_ENABLED_##__STATE] = __stringify(__STATE)"-ON",   \
+   [LOCK_USED_IN_##__STATE##_RR] = "IN-"__stringify(__STATE)"-RR", \
+   [LOCK_ENABLED_##__STATE##_RR] = __stringify(__STATE)"-ON-RR",
 
 static const char *usage_str[] =
 {
@@ -492,7 +492,7 @@ void get_usage_chars(struct lock_class *class, char 
usage[LOCK_USAGE_CHARS])
 
 #define LOCKDEP_STATE(__STATE) 
\
usage[i++] = get_usage_char(class, LOCK_USED_IN_##__STATE); \
-   usage[i++] = get_usage_char(class, LOCK_USED_IN_##__STATE##_READ);
+   usage[i++] = get_usage_char(class, LOCK_USED_IN_##__STATE##_RR);
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 
@@ -1645,7 +1645,7 @@ static const char *state_names[] = {
 
 static const char *state_rnames[] = {
 #define LOCKDEP_STATE(__STATE) \
-   __stringify(__STATE)"-READ",
+   __stringify(__STATE)"-RR",
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 };
@@ -3039,14 +3039,14 @@ static int mark_irqflags(struct task_struct *curr, 
struct held_lock *hlock)
 * mark the lock as used in these contexts:
 */
if (!hlock->trylock) {
-   if (hlock->read) {
+   if (hlock->read == 2) {
if (curr->hardirq_context)
if (!mark_lock(curr, hlock,
-   LOCK_USED_IN_HARDIRQ_READ))
+   LOCK_USED_IN_HARDIRQ_RR))
return 0;
if (curr->softirq_context)
if (!mark_lock(curr, hlock,
-   LOCK

Re: [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx

2018-04-11 Thread Joel Stanley
Hello Jae,

On 11 April 2018 at 04:02, Jae Hyun Yoo  wrote:
> This commit adds PECI adapter driver implementation for Aspeed
> AST24xx/AST25xx.

The driver is looking good!

It looks like you've done some kind of review that we weren't allowed
to see, which is a double edged sword - I might be asking about things
that you've already spoken about with someone else.

I'm only just learning about PECI, but I do have some general comments below.

> ---
>  drivers/peci/Kconfig   |  28 +++
>  drivers/peci/Makefile  |   3 +
>  drivers/peci/peci-aspeed.c | 504 
> +
>  3 files changed, 535 insertions(+)
>  create mode 100644 drivers/peci/peci-aspeed.c
>
> diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
> index 1fbc13f9e6c2..0e33420365de 100644
> --- a/drivers/peci/Kconfig
> +++ b/drivers/peci/Kconfig
> @@ -14,4 +14,32 @@ config PECI
>   processors and chipset components to external monitoring or control
>   devices.
>
> + If you want PECI support, you should say Y here and also to the
> + specific driver for your bus adapter(s) below.
> +
> +if PECI
> +
> +#
> +# PECI hardware bus configuration
> +#
> +
> +menu "PECI Hardware Bus support"
> +
> +config PECI_ASPEED
> +   tristate "Aspeed AST24xx/AST25xx PECI support"

I think just saying ASPEED PECI support is enough. That way if the
next ASPEED SoC happens to have PECI we don't need to update all of
the help text :)

> +   select REGMAP_MMIO
> +   depends on OF
> +   depends on ARCH_ASPEED || COMPILE_TEST
> +   help
> + Say Y here if you want support for the Platform Environment Control
> + Interface (PECI) bus adapter driver on the Aspeed AST24XX and 
> AST25XX
> + SoCs.
> +
> + This support is also available as a module.  If so, the module
> + will be called peci-aspeed.
> +
> +endmenu
> +
> +endif # PECI
> +
>  endmenu
> diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> index 9e8615e0d3ff..886285e69765 100644
> --- a/drivers/peci/Makefile
> +++ b/drivers/peci/Makefile
> @@ -4,3 +4,6 @@
>
>  # Core functionality
>  obj-$(CONFIG_PECI) += peci-core.o
> +
> +# Hardware specific bus drivers
> +obj-$(CONFIG_PECI_ASPEED)  += peci-aspeed.o
> diff --git a/drivers/peci/peci-aspeed.c b/drivers/peci/peci-aspeed.c
> new file mode 100644
> index ..be2a1f327eb1
> --- /dev/null
> +++ b/drivers/peci/peci-aspeed.c
> @@ -0,0 +1,504 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (C) 2012-2017 ASPEED Technology Inc.
> +// Copyright (c) 2018 Intel Corporation
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DUMP_DEBUG 0
> +
> +/* Aspeed PECI Registers */
> +#define AST_PECI_CTRL 0x00

Nit: we use ASPEED instead of AST in the upstream kernel to distingush
from the aspeed sdk drivers. If you feel strongly about this then I
won't insist you change.

> +#define AST_PECI_TIMING   0x04
> +#define AST_PECI_CMD  0x08
> +#define AST_PECI_CMD_CTRL 0x0c
> +#define AST_PECI_EXP_FCS  0x10
> +#define AST_PECI_CAP_FCS  0x14
> +#define AST_PECI_INT_CTRL 0x18
> +#define AST_PECI_INT_STS  0x1c
> +#define AST_PECI_W_DATA0  0x20
> +#define AST_PECI_W_DATA1  0x24
> +#define AST_PECI_W_DATA2  0x28
> +#define AST_PECI_W_DATA3  0x2c
> +#define AST_PECI_R_DATA0  0x30
> +#define AST_PECI_R_DATA1  0x34
> +#define AST_PECI_R_DATA2  0x38
> +#define AST_PECI_R_DATA3  0x3c
> +#define AST_PECI_W_DATA4  0x40
> +#define AST_PECI_W_DATA5  0x44
> +#define AST_PECI_W_DATA6  0x48
> +#define AST_PECI_W_DATA7  0x4c
> +#define AST_PECI_R_DATA4  0x50
> +#define AST_PECI_R_DATA5  0x54
> +#define AST_PECI_R_DATA6  0x58
> +#define AST_PECI_R_DATA7  0x5c
> +
> +/* AST_PECI_CTRL - 0x00 : Control Register */
> +#define PECI_CTRL_SAMPLING_MASK GENMASK(19, 16)
> +#define PECI_CTRL_SAMPLING(x)   (((x) << 16) & PECI_CTRL_SAMPLING_MASK)
> +#define PECI_CTRL_SAMPLING_GET(x)   (((x) & PECI_CTRL_SAMPLING_MASK) >> 16)
> +#define PECI_CTRL_READ_MODE_MASKGENMASK(13, 12)
> +#define PECI_CTRL_READ_MODE(x)  (((x) << 12) & PECI_CTRL_READ_MODE_MASK)
> +#define PECI_CTRL_READ_MODE_GET(x)  (((x) & PECI_CTRL_READ_MODE_MASK) >> 12)
> +#define PECI_CTRL_READ_MODE_COUNT   BIT(12)
> +#define PECI_CTRL_READ_MODE_DBG BIT(13)
> +#define PECI_CTRL_CLK_SOURCE_MASK   BIT(11)
> +#define PECI_CTRL_CLK_SOURCE(x) (((x) << 11) & PECI_CTRL_CLK_SOURCE_MASK)
> +#define PECI_CTRL_CLK_SOURCE_GET(x) (((x) & PECI_CTRL_CLK_SOURCE_MASK) >> 11)
> +#define PECI_CTRL_CLK_DIV_MASK  GENMASK(10, 8)
> +#define PECI_CTRL_CLK_DIV(x)(((x) << 8) & PECI_CTRL_CLK_DIV_MASK)
> +#define PECI_CTRL_CLK_DIV_GET(x)(((x) & PECI_CTRL_CLK_DIV_MASK) >> 8)
> +#define PECI_CTRL_INVERT_OUTBIT(7)
> +#define PECI_CTRL_INVERT_IN BIT(6)
> +#define PECI_CTRL_BUS_CONTENT_ENBIT(5)
> +#define PECI_CTRL_PECI_EN   BIT(4)
> +#define PECI_CTRL_PECI_CLK

Re: [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers

2018-04-11 Thread Joel Stanley
Hi Jae,

On 11 April 2018 at 04:02, Jae Hyun Yoo  wrote:
> This commit adds documents of generic PECI bus, adapter and client drivers.
>
> Signed-off-by: Jae Hyun Yoo 
> Reviewed-by: Haiyue Wang 
> Reviewed-by: James Feist 
> Reviewed-by: Vernon Mauery 
> Cc: Alan Cox 
> Cc: Andrew Jeffery 
> Cc: Andrew Lunn 
> Cc: Andy Shevchenko 
> Cc: Arnd Bergmann 
> Cc: Benjamin Herrenschmidt 
> Cc: Fengguang Wu 
> Cc: Greg KH 
> Cc: Guenter Roeck 
> Cc: Jason M Biils 
> Cc: Jean Delvare 
> Cc: Joel Stanley 
> Cc: Julia Cartwright 
> Cc: Miguel Ojeda 
> Cc: Milton Miller II 
> Cc: Pavel Machek 
> Cc: Randy Dunlap 
> Cc: Stef van Os 
> Cc: Sumeet R Pawnikar 

That's a hefty cc list. I can't see Rob Herring though, and he's
usually the person who you need to convince to get your bindings
accepted.

I recommend using ./scripts/get_maintainers.pl to build your CC list,
and then add others you think are relevant.

I'm not sure what the guidelines are for generic bindings, so I'll
defer to Rob for this patch.

Cheers,

Joel

> ---
>  .../devicetree/bindings/peci/peci-adapter.txt  | 23 
>  .../devicetree/bindings/peci/peci-bus.txt  | 15 +
>  .../devicetree/bindings/peci/peci-client.txt   | 25 
> ++
>  3 files changed, 63 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt
>
> diff --git a/Documentation/devicetree/bindings/peci/peci-adapter.txt 
> b/Documentation/devicetree/bindings/peci/peci-adapter.txt
> new file mode 100644
> index ..9221374f6b11
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-adapter.txt
> @@ -0,0 +1,23 @@
> +Generic device tree configuration for PECI adapters.
> +
> +Required properties:
> +- compatible : Should contain hardware specific definition strings that 
> can
> +  match an adapter driver implementation.
> +- reg: Should contain PECI controller registers location and 
> length.
> +- #address-cells : Should be <1>.
> +- #size-cells: Should be <0>.
> +
> +Example:
> +   peci: peci@1000 {
> +   compatible = "simple-bus";
> +   #address-cells = <1>;
> +   #size-cells = <1>;
> +   ranges = <0x0 0x1000 0x1000>;
> +
> +   peci0: peci-bus@0 {
> +   compatible = "soc,soc-peci";
> +   reg = <0x0 0x1000>;
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +   };
> +   };
> diff --git a/Documentation/devicetree/bindings/peci/peci-bus.txt 
> b/Documentation/devicetree/bindings/peci/peci-bus.txt
> new file mode 100644
> index ..90bcc791ccb0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-bus.txt
> @@ -0,0 +1,15 @@
> +Generic device tree configuration for PECI buses.
> +
> +Required properties:
> +- compatible : Should be "simple-bus".
> +- #address-cells : Should be <1>.
> +- #size-cells: Should be <1>.
> +- ranges : Should contain PECI controller registers ranges.
> +
> +Example:
> +   peci: peci@1000 {
> +   compatible = "simple-bus";
> +   #address-cells = <1>;
> +   #size-cells = <1>;
> +   ranges = <0x0 0x1000 0x1000>;
> +   };
> diff --git a/Documentation/devicetree/bindings/peci/peci-client.txt 
> b/Documentation/devicetree/bindings/peci/peci-client.txt
> new file mode 100644
> index ..8e2bfd8532f6
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-client.txt
> @@ -0,0 +1,25 @@
> +Generic device tree configuration for PECI clients.
> +
> +Required properties:
> +- compatible : Should contain target device specific definition strings that 
> can
> +  match a client driver implementation.
> +- reg: Should contain address of a client CPU. Address range of CPU
> +  clients is starting from 0x30 based on PECI specification.
> +  <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
> +
> +Example:
> +   peci-bus@0 {
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +   < more properties >
> +
> +   function@cpu0 {
> +   compatible = "device,function";
> +   reg = <0x30>;
> +   };
> +
> +   function@cpu1 {
> +   compatible = "device,function";
> +   reg = <0x31>;
> +   };
> +   };
> --
> 2.16.2
>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs

2018-04-11 Thread Joel Stanley
On 11 April 2018 at 04:02, Jae Hyun Yoo  wrote:
> This commit adds a dt-bindings document of PECI adapter driver for Aspeed

We try to capitalise ASPEED.

> AST24xx/25xx SoCs.
> ---
>  .../devicetree/bindings/peci/peci-aspeed.txt   | 60 
> ++
>  1 file changed, 60 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt
>
> diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt 
> b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
> new file mode 100644
> index ..4598bb8c20fa
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
> @@ -0,0 +1,60 @@
> +Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
> +
> +Required properties:
> +- compatible: Should be "aspeed,ast2400-peci" or 
> "aspeed,ast2500-peci"
> + - aspeed,ast2400-peci: Aspeed AST2400 family PECI
> +controller
> + - aspeed,ast2500-peci: Aspeed AST2500 family PECI
> +controller
> +- reg   : Should contain PECI controller registers location and
> + length.
> +- #address-cells: Should be <1>.
> +- #size-cells   : Should be <0>.
> +- interrupts: Should contain PECI controller interrupt.
> +- clocks: Should contain clock source for PECI controller.
> + Should reference clkin.

Are you sure that this is driven by clkin? Most peripherals on the
Aspeed are attached to the apb, so should reference that clock.

> +- clock_frequency   : Should contain the operation frequency of PECI 
> controller
> + in units of Hz.
> + 187500 ~ 2400

Can you explain why you need both the parent clock and this frequency
to be specified?

> +
> +Optional properties:
> +- msg-timing-nego   : Message timing negotiation period. This value will

Perhaps msg-timing-period? Or just msg-timing?

> + determine the period of message timing negotiation to be
> + issued by PECI controller. The unit of the programmed
> + value is four times of PECI clock period.
> + 0 ~ 255 (default: 1)
> +- addr-timing-nego  : Address timing negotiation period. This value will
> + determine the period of address timing negotiation to be
> + issued by PECI controller. The unit of the programmed
> + value is four times of PECI clock period.
> + 0 ~ 255 (default: 1)
> +- rd-sampling-point : Read sampling point selection. The whole period of a 
> bit
> + time will be divided into 16 time frames. This value 
> will
> + determine the time frame in which the controller will
> + sample PECI signal for data read back. Usually in the
> + middle of a bit time is the best.
> + 0 ~ 15 (default: 8)
> +- cmd_timeout_ms: Command timeout in units of ms.
> + 1 ~ 6 (default: 1000)
> +
> +Example:
> +   peci: peci@1e78b000 {
> +   compatible = "simple-bus";
> +   #address-cells = <1>;
> +   #size-cells = <1>;
> +   ranges = <0x0 0x1e78b000 0x60>;
> +
> +   peci0: peci-bus@0 {
> +   compatible = "aspeed,ast2500-peci";
> +   reg = <0x0 0x60>;
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +   interrupts = <15>;
> +   clocks = <&clk_clkin>;
> +   clock-frequency = <2400>;
> +   msg-timing-nego = <1>;
> +   addr-timing-nego = <1>;
> +   rd-sampling-point = <8>;
> +   cmd-timeout-ms = <1000>;
> +   };
> +   };
> --
> 2.16.2
>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node

2018-04-11 Thread Joel Stanley
On 11 April 2018 at 04:02, Jae Hyun Yoo  wrote:
> This commit adds PECI bus/adapter node of AST24xx/AST25xx into
> aspeed-g4 and aspeed-g5.
>

The patches to the device trees get merged by the ASPEED maintainer
(me). Once you have the bindings reviewed you can send the patches to
me and the linux-aspeed list (I've got a pending patch to maintainers
that will ensure get_maintainers.pl does the right thing as far as
email addresses go).

I'd suggest dropping it from your series and re-sending once the
bindings and driver are reviewed.

Cheers,

Joel
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Michal Hocko
On Wed 11-04-18 12:32:07, Laurent Dufour wrote:
[...]
> Andrew, should I send a v4 or could you wipe the 2 __maybe_unsued when 
> applying
> the patch ?

A follow $patch-fix should be better rather than post this again and
spam people with more emails.
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Laurent Dufour
On 11/04/2018 11:09, Christophe LEROY wrote:
> 
> 
> Le 11/04/2018 à 11:03, Laurent Dufour a écrit :
>>
>>
>> On 11/04/2018 10:58, Christophe LEROY wrote:
>>>
>>>
>>> Le 11/04/2018 à 10:03, Laurent Dufour a écrit :
 Remove the additional define HAVE_PTE_SPECIAL and rely directly on
 CONFIG_ARCH_HAS_PTE_SPECIAL.

 There is no functional change introduced by this patch

 Signed-off-by: Laurent Dufour 
 ---
    mm/memory.c | 19 ---
    1 file changed, 8 insertions(+), 11 deletions(-)

 diff --git a/mm/memory.c b/mm/memory.c
 index 96910c625daa..7f7dc7b2a341 100644
 --- a/mm/memory.c
 +++ b/mm/memory.c
 @@ -817,17 +817,12 @@ static void print_bad_pte(struct vm_area_struct *vma,
 unsigned long addr,
     * PFNMAP mappings in order to support COWable mappings.
     *
     */
 -#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
 -# define HAVE_PTE_SPECIAL 1
 -#else
 -# define HAVE_PTE_SPECIAL 0
 -#endif
    struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long
 addr,
     pte_t pte, bool with_public_device)
    {
    unsigned long pfn = pte_pfn(pte);
    -    if (HAVE_PTE_SPECIAL) {
 +    if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
    if (likely(!pte_special(pte)))
    goto check_pfn;
    if (vma->vm_ops && vma->vm_ops->find_special_page)
 @@ -862,7 +857,7 @@ struct page *_vm_normal_page(struct vm_area_struct 
 *vma,
 unsigned long addr,
    return NULL;
    }
    -    /* !HAVE_PTE_SPECIAL case follows: */
 +    /* !CONFIG_ARCH_HAS_PTE_SPECIAL case follows: */
      if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
    if (vma->vm_flags & VM_MIXEDMAP) {
 @@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct 
 *vma,
 unsigned long addr,
      if (is_zero_pfn(pfn))
    return NULL;
 -check_pfn:
 +
 +check_pfn: __maybe_unused
>>>
>>> See below
>>>
    if (unlikely(pfn > highest_memmap_pfn)) {
    print_bad_pte(vma, addr, pte, NULL);
    return NULL;
 @@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct 
 *vma,
 unsigned long addr,
     * NOTE! We still have PageReserved() pages in the page tables.
     * eg. VDSO mappings can cause them to exist.
     */
 -out:
 +out: __maybe_unused
>>>
>>> Why do you need that change ?
>>>
>>> There is no reason for the compiler to complain. It would complain if the 
>>> goto
>>> was within a #ifdef, but all the purpose of using IS_ENABLED() is to allow 
>>> the
>>> compiler to properly handle all possible cases. That's all the force of
>>> IS_ENABLED() compared to ifdefs, and that the reason why they are 
>>> plebicited,
>>> ref Linux Codying style for a detailed explanation.
>>
>> Fair enough.
>>
>> Should I submit a v4 just to remove these so ugly __maybe_unused ?
>>
> 
> Most likely, unless the mm maintainer agrees to remove them by himself when
> applying your patch ?

That was my point.

Andrew, should I send a v4 or could you wipe the 2 __maybe_unsued when applying
the patch ?

Thanks,
Laurent.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC bpf-next v2 8/8] bpf: add documentation for eBPF helpers (58-64)

2018-04-11 Thread Jesper Dangaard Brouer
On Tue, 10 Apr 2018 15:41:57 +0100
Quentin Monnet  wrote:

> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 7343af4196c8..db090ad03626 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1250,6 +1250,51 @@ union bpf_attr {
>   *   Return
>   *   0 on success, or a negative error in case of failure.
>   *
> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
> + *   Description
> + *   Redirect the packet to the endpoint referenced by *map* at
> + *   index *key*. Depending on its type, his *map* can contain
> + *   references to net devices (for forwarding packets through other
> + *   ports), or to CPUs (for redirecting XDP frames to another CPU;
> + *   but this is not fully implemented as of this writing).

Stating that CPUMAP redirect "is not fully implemented" is confusing.
The issue is that CPUMAP only works for "native" XDP.

What about saying:

"[...] or to CPUs (for redirecting XDP frames to another CPU;
 but this is only implemented for native XDP as of this writing)"

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Christophe LEROY



Le 11/04/2018 à 11:03, Laurent Dufour a écrit :



On 11/04/2018 10:58, Christophe LEROY wrote:



Le 11/04/2018 à 10:03, Laurent Dufour a écrit :

Remove the additional define HAVE_PTE_SPECIAL and rely directly on
CONFIG_ARCH_HAS_PTE_SPECIAL.

There is no functional change introduced by this patch

Signed-off-by: Laurent Dufour 
---
   mm/memory.c | 19 ---
   1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 96910c625daa..7f7dc7b2a341 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -817,17 +817,12 @@ static void print_bad_pte(struct vm_area_struct *vma,
unsigned long addr,
    * PFNMAP mappings in order to support COWable mappings.
    *
    */
-#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-# define HAVE_PTE_SPECIAL 1
-#else
-# define HAVE_PTE_SPECIAL 0
-#endif
   struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
    pte_t pte, bool with_public_device)
   {
   unsigned long pfn = pte_pfn(pte);
   -    if (HAVE_PTE_SPECIAL) {
+    if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
   if (likely(!pte_special(pte)))
   goto check_pfn;
   if (vma->vm_ops && vma->vm_ops->find_special_page)
@@ -862,7 +857,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma,
unsigned long addr,
   return NULL;
   }
   -    /* !HAVE_PTE_SPECIAL case follows: */
+    /* !CONFIG_ARCH_HAS_PTE_SPECIAL case follows: */
     if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
   if (vma->vm_flags & VM_MIXEDMAP) {
@@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma,
unsigned long addr,
     if (is_zero_pfn(pfn))
   return NULL;
-check_pfn:
+
+check_pfn: __maybe_unused


See below


   if (unlikely(pfn > highest_memmap_pfn)) {
   print_bad_pte(vma, addr, pte, NULL);
   return NULL;
@@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma,
unsigned long addr,
    * NOTE! We still have PageReserved() pages in the page tables.
    * eg. VDSO mappings can cause them to exist.
    */
-out:
+out: __maybe_unused


Why do you need that change ?

There is no reason for the compiler to complain. It would complain if the goto
was within a #ifdef, but all the purpose of using IS_ENABLED() is to allow the
compiler to properly handle all possible cases. That's all the force of
IS_ENABLED() compared to ifdefs, and that the reason why they are plebicited,
ref Linux Codying style for a detailed explanation.


Fair enough.

Should I submit a v4 just to remove these so ugly __maybe_unused ?



Most likely, unless the mm maintainer agrees to remove them by himself 
when applying your patch ?


Christophe
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Laurent Dufour


On 11/04/2018 10:58, Christophe LEROY wrote:
> 
> 
> Le 11/04/2018 à 10:03, Laurent Dufour a écrit :
>> Remove the additional define HAVE_PTE_SPECIAL and rely directly on
>> CONFIG_ARCH_HAS_PTE_SPECIAL.
>>
>> There is no functional change introduced by this patch
>>
>> Signed-off-by: Laurent Dufour 
>> ---
>>   mm/memory.c | 19 ---
>>   1 file changed, 8 insertions(+), 11 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 96910c625daa..7f7dc7b2a341 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -817,17 +817,12 @@ static void print_bad_pte(struct vm_area_struct *vma,
>> unsigned long addr,
>>    * PFNMAP mappings in order to support COWable mappings.
>>    *
>>    */
>> -#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
>> -# define HAVE_PTE_SPECIAL 1
>> -#else
>> -# define HAVE_PTE_SPECIAL 0
>> -#endif
>>   struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long 
>> addr,
>>    pte_t pte, bool with_public_device)
>>   {
>>   unsigned long pfn = pte_pfn(pte);
>>   -    if (HAVE_PTE_SPECIAL) {
>> +    if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
>>   if (likely(!pte_special(pte)))
>>   goto check_pfn;
>>   if (vma->vm_ops && vma->vm_ops->find_special_page)
>> @@ -862,7 +857,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma,
>> unsigned long addr,
>>   return NULL;
>>   }
>>   -    /* !HAVE_PTE_SPECIAL case follows: */
>> +    /* !CONFIG_ARCH_HAS_PTE_SPECIAL case follows: */
>>     if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
>>   if (vma->vm_flags & VM_MIXEDMAP) {
>> @@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma,
>> unsigned long addr,
>>     if (is_zero_pfn(pfn))
>>   return NULL;
>> -check_pfn:
>> +
>> +check_pfn: __maybe_unused
> 
> See below
> 
>>   if (unlikely(pfn > highest_memmap_pfn)) {
>>   print_bad_pte(vma, addr, pte, NULL);
>>   return NULL;
>> @@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma,
>> unsigned long addr,
>>    * NOTE! We still have PageReserved() pages in the page tables.
>>    * eg. VDSO mappings can cause them to exist.
>>    */
>> -out:
>> +out: __maybe_unused
> 
> Why do you need that change ?
> 
> There is no reason for the compiler to complain. It would complain if the goto
> was within a #ifdef, but all the purpose of using IS_ENABLED() is to allow the
> compiler to properly handle all possible cases. That's all the force of
> IS_ENABLED() compared to ifdefs, and that the reason why they are plebicited,
> ref Linux Codying style for a detailed explanation.

Fair enough.

Should I submit a v4 just to remove these so ugly __maybe_unused ?

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Christophe LEROY



Le 11/04/2018 à 10:41, Laurent Dufour a écrit :

On 11/04/2018 10:33, Michal Hocko wrote:

On Wed 11-04-18 10:03:36, Laurent Dufour wrote:

@@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
unsigned long addr,
  
  	if (is_zero_pfn(pfn))

return NULL;
-check_pfn:
+
+check_pfn: __maybe_unused
if (unlikely(pfn > highest_memmap_pfn)) {
print_bad_pte(vma, addr, pte, NULL);
return NULL;
@@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
unsigned long addr,
 * NOTE! We still have PageReserved() pages in the page tables.
 * eg. VDSO mappings can cause them to exist.
 */
-out:
+out: __maybe_unused
return pfn_to_page(pfn);


Why do we need this ugliness all of the sudden?

Indeed the compiler doesn't complaint but in theory it should since these
labels are not used depending on CONFIG_ARCH_HAS_PTE_SPECIAL.


Why should it complain ?

Regards
Christophe




--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Christophe LEROY



Le 11/04/2018 à 10:03, Laurent Dufour a écrit :

Remove the additional define HAVE_PTE_SPECIAL and rely directly on
CONFIG_ARCH_HAS_PTE_SPECIAL.

There is no functional change introduced by this patch

Signed-off-by: Laurent Dufour 
---
  mm/memory.c | 19 ---
  1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 96910c625daa..7f7dc7b2a341 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -817,17 +817,12 @@ static void print_bad_pte(struct vm_area_struct *vma, 
unsigned long addr,
   * PFNMAP mappings in order to support COWable mappings.
   *
   */
-#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-# define HAVE_PTE_SPECIAL 1
-#else
-# define HAVE_PTE_SPECIAL 0
-#endif
  struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 pte_t pte, bool with_public_device)
  {
unsigned long pfn = pte_pfn(pte);
  
-	if (HAVE_PTE_SPECIAL) {

+   if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
if (likely(!pte_special(pte)))
goto check_pfn;
if (vma->vm_ops && vma->vm_ops->find_special_page)
@@ -862,7 +857,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
unsigned long addr,
return NULL;
}
  
-	/* !HAVE_PTE_SPECIAL case follows: */

+   /* !CONFIG_ARCH_HAS_PTE_SPECIAL case follows: */
  
  	if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {

if (vma->vm_flags & VM_MIXEDMAP) {
@@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
unsigned long addr,
  
  	if (is_zero_pfn(pfn))

return NULL;
-check_pfn:
+
+check_pfn: __maybe_unused


See below


if (unlikely(pfn > highest_memmap_pfn)) {
print_bad_pte(vma, addr, pte, NULL);
return NULL;
@@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
unsigned long addr,
 * NOTE! We still have PageReserved() pages in the page tables.
 * eg. VDSO mappings can cause them to exist.
 */
-out:
+out: __maybe_unused


Why do you need that change ?

There is no reason for the compiler to complain. It would complain if 
the goto was within a #ifdef, but all the purpose of using IS_ENABLED() 
is to allow the compiler to properly handle all possible cases. That's 
all the force of IS_ENABLED() compared to ifdefs, and that the reason 
why they are plebicited, ref Linux Codying style for a detailed explanation.


Christophe



return pfn_to_page(pfn);
  }
  
@@ -904,7 +900,7 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,

/*
 * There is no pmd_special() but there may be special pmds, e.g.
 * in a direct-access (dax) mapping, so let's just replicate the
-* !HAVE_PTE_SPECIAL case from vm_normal_page() here.
+* !CONFIG_ARCH_HAS_PTE_SPECIAL case from vm_normal_page() here.
 */
if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
if (vma->vm_flags & VM_MIXEDMAP) {
@@ -1933,7 +1929,8 @@ static int __vm_insert_mixed(struct vm_area_struct *vma, 
unsigned long addr,
 * than insert_pfn).  If a zero_pfn were inserted into a VM_MIXEDMAP
 * without pte special, it would there be refcounted as a normal page.
 */
-   if (!HAVE_PTE_SPECIAL && !pfn_t_devmap(pfn) && pfn_t_valid(pfn)) {
+   if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) &&
+   !pfn_t_devmap(pfn) && pfn_t_valid(pfn)) {
struct page *page;
  
  		/*



--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Michal Hocko
On Wed 11-04-18 10:41:23, Laurent Dufour wrote:
> On 11/04/2018 10:33, Michal Hocko wrote:
> > On Wed 11-04-18 10:03:36, Laurent Dufour wrote:
> >> @@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct 
> >> *vma, unsigned long addr,
> >>  
> >>if (is_zero_pfn(pfn))
> >>return NULL;
> >> -check_pfn:
> >> +
> >> +check_pfn: __maybe_unused
> >>if (unlikely(pfn > highest_memmap_pfn)) {
> >>print_bad_pte(vma, addr, pte, NULL);
> >>return NULL;
> >> @@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct 
> >> *vma, unsigned long addr,
> >> * NOTE! We still have PageReserved() pages in the page tables.
> >> * eg. VDSO mappings can cause them to exist.
> >> */
> >> -out:
> >> +out: __maybe_unused
> >>return pfn_to_page(pfn);
> > 
> > Why do we need this ugliness all of the sudden?
> Indeed the compiler doesn't complaint but in theory it should since these
> labels are not used depending on CONFIG_ARCH_HAS_PTE_SPECIAL.

Well, such a warning would be quite pointless so I would rather not make
the code ugly. The value of unused label is quite questionable to start
with...

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Laurent Dufour
On 11/04/2018 10:33, Michal Hocko wrote:
> On Wed 11-04-18 10:03:36, Laurent Dufour wrote:
>> @@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
>> unsigned long addr,
>>  
>>  if (is_zero_pfn(pfn))
>>  return NULL;
>> -check_pfn:
>> +
>> +check_pfn: __maybe_unused
>>  if (unlikely(pfn > highest_memmap_pfn)) {
>>  print_bad_pte(vma, addr, pte, NULL);
>>  return NULL;
>> @@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
>> unsigned long addr,
>>   * NOTE! We still have PageReserved() pages in the page tables.
>>   * eg. VDSO mappings can cause them to exist.
>>   */
>> -out:
>> +out: __maybe_unused
>>  return pfn_to_page(pfn);
> 
> Why do we need this ugliness all of the sudden?
Indeed the compiler doesn't complaint but in theory it should since these
labels are not used depending on CONFIG_ARCH_HAS_PTE_SPECIAL.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/2] mm: introduce ARCH_HAS_PTE_SPECIAL

2018-04-11 Thread Michal Hocko
On Wed 11-04-18 10:03:35, Laurent Dufour wrote:
> Currently the PTE special supports is turned on in per architecture header
> files. Most of the time, it is defined in arch/*/include/asm/pgtable.h
> depending or not on some other per architecture static definition.
> 
> This patch introduce a new configuration variable to manage this directly
> in the Kconfig files. It would later replace __HAVE_ARCH_PTE_SPECIAL.
> 
> Here notes for some architecture where the definition of
> __HAVE_ARCH_PTE_SPECIAL is not obvious:
> 
> arm
>  __HAVE_ARCH_PTE_SPECIAL which is currently defined in
> arch/arm/include/asm/pgtable-3level.h which is included by
> arch/arm/include/asm/pgtable.h when CONFIG_ARM_LPAE is set.
> So select ARCH_HAS_PTE_SPECIAL if ARM_LPAE.
> 
> powerpc
> __HAVE_ARCH_PTE_SPECIAL is defined in 2 files:
>  - arch/powerpc/include/asm/book3s/64/pgtable.h
>  - arch/powerpc/include/asm/pte-common.h
> The first one is included if (PPC_BOOK3S & PPC64) while the second is
> included in all the other cases.
> So select ARCH_HAS_PTE_SPECIAL all the time.
> 
> sparc:
> __HAVE_ARCH_PTE_SPECIAL is defined if defined(__sparc__) &&
> defined(__arch64__) which are defined through the compiler in
> sparc/Makefile if !SPARC32 which I assume to be if SPARC64.
> So select ARCH_HAS_PTE_SPECIAL if SPARC64
> 
> There is no functional change introduced by this patch.
> 
> Suggested-by: Jerome Glisse 
> Reviewed-by: Jerome Glisse 
> Acked-by: David Rientjes 
> Signed-off-by: Laurent Dufour 

Looks good to me. I have checked x86 and the generic code and it looks
good to me. Anyway arch maintainers really have to double check this.
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Michal Hocko
On Wed 11-04-18 10:03:36, Laurent Dufour wrote:
> @@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
> unsigned long addr,
>  
>   if (is_zero_pfn(pfn))
>   return NULL;
> -check_pfn:
> +
> +check_pfn: __maybe_unused
>   if (unlikely(pfn > highest_memmap_pfn)) {
>   print_bad_pte(vma, addr, pte, NULL);
>   return NULL;
> @@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
> unsigned long addr,
>* NOTE! We still have PageReserved() pages in the page tables.
>* eg. VDSO mappings can cause them to exist.
>*/
> -out:
> +out: __maybe_unused
>   return pfn_to_page(pfn);

Why do we need this ugliness all of the sudden?

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 0/2] move __HAVE_ARCH_PTE_SPECIAL in Kconfig

2018-04-11 Thread Laurent Dufour
The per architecture __HAVE_ARCH_PTE_SPECIAL is defined statically in the
per architecture header files. This doesn't allow to make other
configuration dependent on it.

The first patch of this series is replacing __HAVE_ARCH_PTE_SPECIAL by
CONFIG_ARCH_HAS_PTE_SPECIAL defined into the Kconfig files,
setting it automatically when architectures was already setting it in
header file.

The second patch is removing the odd define HAVE_PTE_SPECIAL which is a
duplicate of CONFIG_ARCH_HAS_PTE_SPECIAL.

There is no functional change introduced by this series.

--
Changes since v2:
 * remove __HAVE_ARCH_PTE_SPECIAL in arch/riscv/include/asm/pgtable-bits.h
 * use IS_ENABLED() instead of #ifdef blocks in patch 2

Laurent Dufour (2):
  mm: introduce ARCH_HAS_PTE_SPECIAL
  mm: remove odd HAVE_PTE_SPECIAL

 .../features/vm/pte_special/arch-support.txt  |  2 +-
 arch/arc/Kconfig  |  1 +
 arch/arc/include/asm/pgtable.h|  2 --
 arch/arm/Kconfig  |  1 +
 arch/arm/include/asm/pgtable-3level.h |  1 -
 arch/arm64/Kconfig|  1 +
 arch/arm64/include/asm/pgtable.h  |  2 --
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  3 ---
 arch/powerpc/include/asm/pte-common.h |  3 ---
 arch/riscv/Kconfig|  1 +
 arch/riscv/include/asm/pgtable-bits.h |  3 ---
 arch/s390/Kconfig |  1 +
 arch/s390/include/asm/pgtable.h   |  1 -
 arch/sh/Kconfig   |  1 +
 arch/sh/include/asm/pgtable.h |  2 --
 arch/sparc/Kconfig|  1 +
 arch/sparc/include/asm/pgtable_64.h   |  3 ---
 arch/x86/Kconfig  |  1 +
 arch/x86/include/asm/pgtable_types.h  |  1 -
 include/linux/pfn_t.h |  4 ++--
 mm/Kconfig|  3 +++
 mm/gup.c  |  4 ++--
 mm/memory.c   | 19 ---
 24 files changed, 25 insertions(+), 37 deletions(-)

-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/2] mm: introduce ARCH_HAS_PTE_SPECIAL

2018-04-11 Thread Laurent Dufour
Currently the PTE special supports is turned on in per architecture header
files. Most of the time, it is defined in arch/*/include/asm/pgtable.h
depending or not on some other per architecture static definition.

This patch introduce a new configuration variable to manage this directly
in the Kconfig files. It would later replace __HAVE_ARCH_PTE_SPECIAL.

Here notes for some architecture where the definition of
__HAVE_ARCH_PTE_SPECIAL is not obvious:

arm
 __HAVE_ARCH_PTE_SPECIAL which is currently defined in
arch/arm/include/asm/pgtable-3level.h which is included by
arch/arm/include/asm/pgtable.h when CONFIG_ARM_LPAE is set.
So select ARCH_HAS_PTE_SPECIAL if ARM_LPAE.

powerpc
__HAVE_ARCH_PTE_SPECIAL is defined in 2 files:
 - arch/powerpc/include/asm/book3s/64/pgtable.h
 - arch/powerpc/include/asm/pte-common.h
The first one is included if (PPC_BOOK3S & PPC64) while the second is
included in all the other cases.
So select ARCH_HAS_PTE_SPECIAL all the time.

sparc:
__HAVE_ARCH_PTE_SPECIAL is defined if defined(__sparc__) &&
defined(__arch64__) which are defined through the compiler in
sparc/Makefile if !SPARC32 which I assume to be if SPARC64.
So select ARCH_HAS_PTE_SPECIAL if SPARC64

There is no functional change introduced by this patch.

Suggested-by: Jerome Glisse 
Reviewed-by: Jerome Glisse 
Acked-by: David Rientjes 
Signed-off-by: Laurent Dufour 
---
 Documentation/features/vm/pte_special/arch-support.txt | 2 +-
 arch/arc/Kconfig   | 1 +
 arch/arc/include/asm/pgtable.h | 2 --
 arch/arm/Kconfig   | 1 +
 arch/arm/include/asm/pgtable-3level.h  | 1 -
 arch/arm64/Kconfig | 1 +
 arch/arm64/include/asm/pgtable.h   | 2 --
 arch/powerpc/Kconfig   | 1 +
 arch/powerpc/include/asm/book3s/64/pgtable.h   | 3 ---
 arch/powerpc/include/asm/pte-common.h  | 3 ---
 arch/riscv/Kconfig | 1 +
 arch/riscv/include/asm/pgtable-bits.h  | 3 ---
 arch/s390/Kconfig  | 1 +
 arch/s390/include/asm/pgtable.h| 1 -
 arch/sh/Kconfig| 1 +
 arch/sh/include/asm/pgtable.h  | 2 --
 arch/sparc/Kconfig | 1 +
 arch/sparc/include/asm/pgtable_64.h| 3 ---
 arch/x86/Kconfig   | 1 +
 arch/x86/include/asm/pgtable_types.h   | 1 -
 include/linux/pfn_t.h  | 4 ++--
 mm/Kconfig | 3 +++
 mm/gup.c   | 4 ++--
 mm/memory.c| 2 +-
 24 files changed, 18 insertions(+), 27 deletions(-)

diff --git a/Documentation/features/vm/pte_special/arch-support.txt 
b/Documentation/features/vm/pte_special/arch-support.txt
index 055004f467d2..cd05924ea875 100644
--- a/Documentation/features/vm/pte_special/arch-support.txt
+++ b/Documentation/features/vm/pte_special/arch-support.txt
@@ -1,6 +1,6 @@
 #
 # Feature name:  pte_special
-# Kconfig:   __HAVE_ARCH_PTE_SPECIAL
+# Kconfig:   ARCH_HAS_PTE_SPECIAL
 # description:   arch supports the pte_special()/pte_mkspecial() VM 
APIs
 #
 ---
diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index d76bf4a83740..8516e2b0239a 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -44,6 +44,7 @@ config ARC
select HAVE_GENERIC_DMA_COHERENT
select HAVE_KERNEL_GZIP
select HAVE_KERNEL_LZMA
+   select ARCH_HAS_PTE_SPECIAL
 
 config MIGHT_HAVE_PCI
bool
diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h
index 08fe33830d4b..8ec5599a0957 100644
--- a/arch/arc/include/asm/pgtable.h
+++ b/arch/arc/include/asm/pgtable.h
@@ -320,8 +320,6 @@ PTE_BIT_FUNC(mkexec,|= (_PAGE_EXECUTE));
 PTE_BIT_FUNC(mkspecial,|= (_PAGE_SPECIAL));
 PTE_BIT_FUNC(mkhuge,   |= (_PAGE_HW_SZ));
 
-#define __HAVE_ARCH_PTE_SPECIAL
-
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a7f8e7f4b88f..c088c851b235 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -8,6 +8,7 @@ config ARM
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FORTIFY_SOURCE
+   select ARCH_HAS_PTE_SPECIAL if ARM_LPAE
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
diff --git a/arch/arm/include/asm/pgtable-3level.h 
b/arch/arm/include/asm/pgtable-3level.h
index 2a4836087358..6d50a11d7793 100644
--- a/arch/arm/include/asm

[PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Laurent Dufour
Remove the additional define HAVE_PTE_SPECIAL and rely directly on
CONFIG_ARCH_HAS_PTE_SPECIAL.

There is no functional change introduced by this patch

Signed-off-by: Laurent Dufour 
---
 mm/memory.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 96910c625daa..7f7dc7b2a341 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -817,17 +817,12 @@ static void print_bad_pte(struct vm_area_struct *vma, 
unsigned long addr,
  * PFNMAP mappings in order to support COWable mappings.
  *
  */
-#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-# define HAVE_PTE_SPECIAL 1
-#else
-# define HAVE_PTE_SPECIAL 0
-#endif
 struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 pte_t pte, bool with_public_device)
 {
unsigned long pfn = pte_pfn(pte);
 
-   if (HAVE_PTE_SPECIAL) {
+   if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
if (likely(!pte_special(pte)))
goto check_pfn;
if (vma->vm_ops && vma->vm_ops->find_special_page)
@@ -862,7 +857,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
unsigned long addr,
return NULL;
}
 
-   /* !HAVE_PTE_SPECIAL case follows: */
+   /* !CONFIG_ARCH_HAS_PTE_SPECIAL case follows: */
 
if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
if (vma->vm_flags & VM_MIXEDMAP) {
@@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
unsigned long addr,
 
if (is_zero_pfn(pfn))
return NULL;
-check_pfn:
+
+check_pfn: __maybe_unused
if (unlikely(pfn > highest_memmap_pfn)) {
print_bad_pte(vma, addr, pte, NULL);
return NULL;
@@ -891,7 +887,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, 
unsigned long addr,
 * NOTE! We still have PageReserved() pages in the page tables.
 * eg. VDSO mappings can cause them to exist.
 */
-out:
+out: __maybe_unused
return pfn_to_page(pfn);
 }
 
@@ -904,7 +900,7 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, 
unsigned long addr,
/*
 * There is no pmd_special() but there may be special pmds, e.g.
 * in a direct-access (dax) mapping, so let's just replicate the
-* !HAVE_PTE_SPECIAL case from vm_normal_page() here.
+* !CONFIG_ARCH_HAS_PTE_SPECIAL case from vm_normal_page() here.
 */
if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
if (vma->vm_flags & VM_MIXEDMAP) {
@@ -1933,7 +1929,8 @@ static int __vm_insert_mixed(struct vm_area_struct *vma, 
unsigned long addr,
 * than insert_pfn).  If a zero_pfn were inserted into a VM_MIXEDMAP
 * without pte special, it would there be refcounted as a normal page.
 */
-   if (!HAVE_PTE_SPECIAL && !pfn_t_devmap(pfn) && pfn_t_valid(pfn)) {
+   if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) &&
+   !pfn_t_devmap(pfn) && pfn_t_valid(pfn)) {
struct page *page;
 
/*
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html