Re: [PATCH RFC] hugetlbfs 'noautofill' mount option

2017-05-07 Thread Prakash Sangappa



On 5/3/17 12:02 PM, Prakash Sangappa wrote:

On 5/2/17 4:43 PM, Dave Hansen wrote:


Ideally, it would be something that is *not* specifically for hugetlbfs.
  MADV_NOAUTOFILL, for instance, could be defined to SIGSEGV whenever
memory is touched that was not populated with MADV_WILLNEED, mlock(), 
etc...


If this is a generic advice type, necessary support will have to be 
implemented

in various filesystems which can support this.

The proposed behavior for 'noautofill' was to not fill holes in 
files(like sparse files).

In the page fault path, mm would not know if the mmapped address on which
the fault occurred, is over a hole in the file or just that the page 
is not available
in the page cache. The underlying filesystem would be called and it 
determines
if it is a hole and that is where it would fail and not fill the hole, 
if this support is added.
Normally, filesystem which support sparse files(holes in file) 
automatically fill the hole
when accessed. Then there is the issue of file system block size and 
page size. If the
block sizes are smaller then page size, it could mean the noautofill 
would only work

if the hole size is equal to  or a multiple of, page size?

In case of hugetlbfs it is much straight forward. Since this 
filesystem is not like a normal
filesystems and and the file sizes are multiple of huge pages. The 
hole will be a multiple
of the huge page size. For this reason then should the advise be 
specific to hugetlbfs?






Any further comments? I think introducing a general madvise option or a 
mmap flag applicable to all filesystems, may not be required. The 
'noautofill' behavior would be specifically useful in hugetlbfs filesystem.


So, if it is specific to hugetlbfs, will the mount option be ok? 
Otherwise adding a madvise / mmap option specific to hugetlbfs, be 
preferred?




Re: [PATCH RFC] hugetlbfs 'noautofill' mount option

2017-05-07 Thread Prakash Sangappa



On 5/3/17 12:02 PM, Prakash Sangappa wrote:

On 5/2/17 4:43 PM, Dave Hansen wrote:


Ideally, it would be something that is *not* specifically for hugetlbfs.
  MADV_NOAUTOFILL, for instance, could be defined to SIGSEGV whenever
memory is touched that was not populated with MADV_WILLNEED, mlock(), 
etc...


If this is a generic advice type, necessary support will have to be 
implemented

in various filesystems which can support this.

The proposed behavior for 'noautofill' was to not fill holes in 
files(like sparse files).

In the page fault path, mm would not know if the mmapped address on which
the fault occurred, is over a hole in the file or just that the page 
is not available
in the page cache. The underlying filesystem would be called and it 
determines
if it is a hole and that is where it would fail and not fill the hole, 
if this support is added.
Normally, filesystem which support sparse files(holes in file) 
automatically fill the hole
when accessed. Then there is the issue of file system block size and 
page size. If the
block sizes are smaller then page size, it could mean the noautofill 
would only work

if the hole size is equal to  or a multiple of, page size?

In case of hugetlbfs it is much straight forward. Since this 
filesystem is not like a normal
filesystems and and the file sizes are multiple of huge pages. The 
hole will be a multiple
of the huge page size. For this reason then should the advise be 
specific to hugetlbfs?






Any further comments? I think introducing a general madvise option or a 
mmap flag applicable to all filesystems, may not be required. The 
'noautofill' behavior would be specifically useful in hugetlbfs filesystem.


So, if it is specific to hugetlbfs, will the mount option be ok? 
Otherwise adding a madvise / mmap option specific to hugetlbfs, be 
preferred?




Re: [PATCH v4 3/5] soc: qcom: Introduce APCS IPC driver

2017-05-07 Thread Bjorn Andersson
On Fri 05 May 21:48 PDT 2017, Jassi Brar wrote:

> On Sat, May 6, 2017 at 6:49 AM, Bjorn Andersson
>  wrote:
> > On Fri 05 May 13:22 PDT 2017, Jassi Brar wrote:
> 
> >> How is it supposed to work if a client queues more than one request?
> >
> > One such example is found in patch 5 in this series. There are two FIFOs
> > in shared memory, one in each direction. Each fifo has a index-pair
> > associated; a write-index is used by the writer to inform the reader
> > where the valid data in the ring buffer ends and a read-index indicates
> > to the writer how far behind the read is.
> >
> > The writer will just push data into the FIFO, each time firing off an
> > APCS IPC interrupt and when the remote interrupt handler runs it will
> > consume all the messages from the read-index to the write-index. All
> > without the need for the reader to signal the writer that it has
> > received the interrupts.
> >
> > In the event that the write-index catches up with the read-index a
> > dedicated flag is set which will cause the reader to signal that the
> > read-index is updated - allowing the writer to sleep waiting for room in
> > the FIFO.
> >
> Interesting.Just for my enlightenment...
> 
>   Where does the writer sleep in the driver? I see it simply sets the
> bit and leave.
> Such a flag (or head and tail pointers matching) should be checked in
> last_tx_done()
> 

In the case of the FIFO based communication the flow control is
implemented one layer up. You can see this in glink_rpm_tx() (in patch
5/5), where we check to see if there's enough room between the writer
and the reader to store the new message. 

The remote side will process messages and increment the read index,
which will break us out of the loop.


Note that its possible to write any number of messages before invoking
the APCS IPC and there's no harm (beyond wasting a little bit of power)
in invoking it when there's no data written.

> If you think RPM will _always_ be ready to accept new messages (though
> we have seen that doesn't hold in some situations), then you don't
> need last_tx_done.

Whether the remote processor is ready to accept a new message or not is
irrelevant in the sense of APCS IPC. When a client sends the signal over
the APCS IPC it has already determined that there was space, filled in
the shared memory buffers and is now simply invoking the remote handler.


The APCS IPC register serves the basis for all inter-processor
communication in a Qualcomm platform, so it's not only the RPM driver
discussed earlier that uses this. It's also used for other non-FIFO
based communication channels, where the signalled information either
isn't acked at all or acked on a system-level.

But regardless of the protocol implemented ontop, the purpose of the
APCS IPC bit is _only_ to invoke some remote handler to consume some
data, somewhere - the event in itself does not carry any information.

> The client should call mbox_client_txdone() after
> mbox_send_message().

So every time we call mbox_send_message() from any of the client drivers
we also needs to call mbox_client_txdone()?

This seems like an awkward side effect of using the mailbox framework -
which has to be spread out in at least 6 different client drivers :(

Regards,
Bjorn


Re: [PATCH v4 3/5] soc: qcom: Introduce APCS IPC driver

2017-05-07 Thread Bjorn Andersson
On Fri 05 May 21:48 PDT 2017, Jassi Brar wrote:

> On Sat, May 6, 2017 at 6:49 AM, Bjorn Andersson
>  wrote:
> > On Fri 05 May 13:22 PDT 2017, Jassi Brar wrote:
> 
> >> How is it supposed to work if a client queues more than one request?
> >
> > One such example is found in patch 5 in this series. There are two FIFOs
> > in shared memory, one in each direction. Each fifo has a index-pair
> > associated; a write-index is used by the writer to inform the reader
> > where the valid data in the ring buffer ends and a read-index indicates
> > to the writer how far behind the read is.
> >
> > The writer will just push data into the FIFO, each time firing off an
> > APCS IPC interrupt and when the remote interrupt handler runs it will
> > consume all the messages from the read-index to the write-index. All
> > without the need for the reader to signal the writer that it has
> > received the interrupts.
> >
> > In the event that the write-index catches up with the read-index a
> > dedicated flag is set which will cause the reader to signal that the
> > read-index is updated - allowing the writer to sleep waiting for room in
> > the FIFO.
> >
> Interesting.Just for my enlightenment...
> 
>   Where does the writer sleep in the driver? I see it simply sets the
> bit and leave.
> Such a flag (or head and tail pointers matching) should be checked in
> last_tx_done()
> 

In the case of the FIFO based communication the flow control is
implemented one layer up. You can see this in glink_rpm_tx() (in patch
5/5), where we check to see if there's enough room between the writer
and the reader to store the new message. 

The remote side will process messages and increment the read index,
which will break us out of the loop.


Note that its possible to write any number of messages before invoking
the APCS IPC and there's no harm (beyond wasting a little bit of power)
in invoking it when there's no data written.

> If you think RPM will _always_ be ready to accept new messages (though
> we have seen that doesn't hold in some situations), then you don't
> need last_tx_done.

Whether the remote processor is ready to accept a new message or not is
irrelevant in the sense of APCS IPC. When a client sends the signal over
the APCS IPC it has already determined that there was space, filled in
the shared memory buffers and is now simply invoking the remote handler.


The APCS IPC register serves the basis for all inter-processor
communication in a Qualcomm platform, so it's not only the RPM driver
discussed earlier that uses this. It's also used for other non-FIFO
based communication channels, where the signalled information either
isn't acked at all or acked on a system-level.

But regardless of the protocol implemented ontop, the purpose of the
APCS IPC bit is _only_ to invoke some remote handler to consume some
data, somewhere - the event in itself does not carry any information.

> The client should call mbox_client_txdone() after
> mbox_send_message().

So every time we call mbox_send_message() from any of the client drivers
we also needs to call mbox_client_txdone()?

This seems like an awkward side effect of using the mailbox framework -
which has to be spread out in at least 6 different client drivers :(

Regards,
Bjorn


[V4] staging : rtl8188eu : remove void function return

2017-05-07 Thread Surender Polsani
kernel coding style doesn't allow the return statement
in void function.

Signed-off-by: Surender Polsani 
---
Changes for v2:
corrected subject line as suggested
Changes for v3:
modified from line as suggested by Greg KH
placed a semicolon in label for fixing build error
Changes for v4:
changes made as suggested by Dan Carpenter
---
 drivers/staging/rtl8188eu/hal/rtl8188e_dm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/rtl8188eu/hal/rtl8188e_dm.c 
b/drivers/staging/rtl8188eu/hal/rtl8188e_dm.c
index d04b7fb..bf9f10f 100644
--- a/drivers/staging/rtl8188eu/hal/rtl8188e_dm.c
+++ b/drivers/staging/rtl8188eu/hal/rtl8188e_dm.c
@@ -146,7 +146,6 @@ void rtw_hal_dm_watchdog(struct adapter *Adapter)
 
if (!hw_init_completed)
goto skip_dm;
-
/* ODM */
pmlmepriv = >mlmepriv;
 
@@ -165,7 +164,7 @@ void rtw_hal_dm_watchdog(struct adapter *Adapter)
 skip_dm:
/*  Check GPIO to determine current RF on/off and Pbc status. */
/*  Check Hardware Radio ON/OFF or not */
-   return;
+   ;
 }
 
 void rtw_hal_dm_init(struct adapter *Adapter)
-- 
1.9.1



[V4] staging : rtl8188eu : remove void function return

2017-05-07 Thread Surender Polsani
kernel coding style doesn't allow the return statement
in void function.

Signed-off-by: Surender Polsani 
---
Changes for v2:
corrected subject line as suggested
Changes for v3:
modified from line as suggested by Greg KH
placed a semicolon in label for fixing build error
Changes for v4:
changes made as suggested by Dan Carpenter
---
 drivers/staging/rtl8188eu/hal/rtl8188e_dm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/rtl8188eu/hal/rtl8188e_dm.c 
b/drivers/staging/rtl8188eu/hal/rtl8188e_dm.c
index d04b7fb..bf9f10f 100644
--- a/drivers/staging/rtl8188eu/hal/rtl8188e_dm.c
+++ b/drivers/staging/rtl8188eu/hal/rtl8188e_dm.c
@@ -146,7 +146,6 @@ void rtw_hal_dm_watchdog(struct adapter *Adapter)
 
if (!hw_init_completed)
goto skip_dm;
-
/* ODM */
pmlmepriv = >mlmepriv;
 
@@ -165,7 +164,7 @@ void rtw_hal_dm_watchdog(struct adapter *Adapter)
 skip_dm:
/*  Check GPIO to determine current RF on/off and Pbc status. */
/*  Check Hardware Radio ON/OFF or not */
-   return;
+   ;
 }
 
 void rtw_hal_dm_init(struct adapter *Adapter)
-- 
1.9.1



Re: [PATCH v2 1/5] pinctrl: qcom: Add ipq8074 pinctrl driver

2017-05-07 Thread Varadarajan Narayanan

+ Bjorn Andersson

On 5/4/2017 5:23 PM, Varadarajan Narayanan wrote:

Add initial pinctrl driver to support pin configuration with
pinctrl framework for ipq8074.

Signed-off-by: Manoharan Vijaya Raghavan 
Signed-off-by: Varadarajan Narayanan 
---
  .../bindings/pinctrl/qcom,ipq8074-pinctrl.txt  |  187 +++
  drivers/pinctrl/qcom/Kconfig   |   10 +
  drivers/pinctrl/qcom/Makefile  |1 +
  drivers/pinctrl/qcom/pinctrl-ipq8074.c | 1217 
  4 files changed, 1415 insertions(+)
  create mode 100644 
Documentation/devicetree/bindings/pinctrl/qcom,ipq8074-pinctrl.txt
  create mode 100644 drivers/pinctrl/qcom/pinctrl-ipq8074.c

diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,ipq8074-pinctrl.txt 
b/Documentation/devicetree/bindings/pinctrl/qcom,ipq8074-pinctrl.txt
new file mode 100644
index 000..7765bca
--- /dev/null
+++ b/Documentation/devicetree/bindings/pinctrl/qcom,ipq8074-pinctrl.txt
@@ -0,0 +1,187 @@
+Qualcomm Technologies, Inc. IPQ8074 TLMM block
+
+This binding describes the Top Level Mode Multiplexer block found in the
+IPQ8074 platform.
+
+- compatible:
+   Usage: required
+   Value type: 
+   Definition: must be "qcom,ipq8074-pinctrl"
+
+- reg:
+   Usage: required
+   Value type: 
+   Definition: the base address and size of the TLMM register space.
+
+- interrupts:
+   Usage: required
+   Value type: 
+   Definition: should specify the TLMM summary IRQ.
+
+- interrupt-controller:
+   Usage: required
+   Value type: 
+   Definition: identifies this node as an interrupt controller
+
+- #interrupt-cells:
+   Usage: required
+   Value type: 
+   Definition: must be 2. Specifying the pin number and flags, as defined
+   in 
+
+- gpio-controller:
+   Usage: required
+   Value type: 
+   Definition: identifies this node as a gpio controller
+
+- #gpio-cells:
+   Usage: required
+   Value type: 
+   Definition: must be 2. Specifying the pin number and flags, as defined
+   in 
+
+Please refer to ../gpio/gpio.txt and ../interrupt-controller/interrupts.txt for
+a general description of GPIO and interrupt bindings.
+
+Please refer to pinctrl-bindings.txt in this directory for details of the
+common pinctrl bindings used by client devices, including the meaning of the
+phrase "pin configuration node".
+
+The pin configuration nodes act as a container for an arbitrary number of
+subnodes. Each of these subnodes represents some desired configuration for a
+pin, a group, or a list of pins or groups. This configuration can include the
+mux function to select on those pin(s)/group(s), and various pin configuration
+parameters, such as pull-up, drive strength, etc.
+
+
+PIN CONFIGURATION NODES:
+
+The name of each subnode is not important; all subnodes should be enumerated
+and processed purely based on their content.
+
+Each subnode only affects those parameters that are explicitly listed. In
+other words, a subnode that lists a mux function but no pin configuration
+parameters implies no information about any pin configuration parameters.
+Similarly, a pin subnode that describes a pullup parameter implies no
+information about e.g. the mux function.
+
+
+The following generic properties as defined in pinctrl-bindings.txt are valid
+to specify in a pin configuration subnode:
+
+- pins:
+   Usage: required
+   Value type: 
+   Definition: List of gpio pins affected by the properties specified in
+   this subnode.  Valid pins are:
+   gpio0-gpio121,
+   sdc1_clk,
+   sdc1_cmd,
+   sdc1_data
+   sdc2_clk,
+   sdc2_cmd,
+   sdc2_data,
+   qdsd_cmd,
+   qdsd_data0,
+   qdsd_data1,
+   qdsd_data2,
+   qdsd_data3
+
+- function:
+   Usage: required
+   Value type: 
+   Definition: Specify the alternative function to be configured for the
+   specified pins. Functions are only valid for gpio pins.
+   Valid values are:
+   gpio, qpic_pad, blsp5_i2c, blsp5_spi, wci20, blsp3_spi3,
+   burn0, pcm_zsi0, blsp5_uart, mac12, blsp3_spi0, burn1, mac01,
+   qdss_cti_trig_out_b0, qdss_cti_trig_in_b0, qpic_pad4,
+   blsp4_uart0, blsp4_i2c0, blsp4_spi0, mac21,
+   qdss_cti_trig_out_b1, qpic_pad5, qdss_cti_trig_in_b1,
+   qpic_pad6, qpic_pad7, cxc0, mac13, qdss_cti_trig_in_a1,
+   qdss_cti_trig_out_a1, wci22, qdss_cti_trig_in_a0, qpic_pad1,
+   qdss_cti_trig_out_a0, qpic_pad2, qpic_pad3, qdss_traceclk_b,
+   qpic_pad0, qdss_tracectl_b, qpic_pad8, pcm_zsi1,
+   qdss_tracedata_b, led0, pwm04, led1, pwm14, led2, pwm24,
+   pwm00, blsp4_uart1, blsp4_i2c1, blsp4_spi1, wci23, mac11,
+   

Re: [PATCH v2 1/5] pinctrl: qcom: Add ipq8074 pinctrl driver

2017-05-07 Thread Varadarajan Narayanan

+ Bjorn Andersson

On 5/4/2017 5:23 PM, Varadarajan Narayanan wrote:

Add initial pinctrl driver to support pin configuration with
pinctrl framework for ipq8074.

Signed-off-by: Manoharan Vijaya Raghavan 
Signed-off-by: Varadarajan Narayanan 
---
  .../bindings/pinctrl/qcom,ipq8074-pinctrl.txt  |  187 +++
  drivers/pinctrl/qcom/Kconfig   |   10 +
  drivers/pinctrl/qcom/Makefile  |1 +
  drivers/pinctrl/qcom/pinctrl-ipq8074.c | 1217 
  4 files changed, 1415 insertions(+)
  create mode 100644 
Documentation/devicetree/bindings/pinctrl/qcom,ipq8074-pinctrl.txt
  create mode 100644 drivers/pinctrl/qcom/pinctrl-ipq8074.c

diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,ipq8074-pinctrl.txt 
b/Documentation/devicetree/bindings/pinctrl/qcom,ipq8074-pinctrl.txt
new file mode 100644
index 000..7765bca
--- /dev/null
+++ b/Documentation/devicetree/bindings/pinctrl/qcom,ipq8074-pinctrl.txt
@@ -0,0 +1,187 @@
+Qualcomm Technologies, Inc. IPQ8074 TLMM block
+
+This binding describes the Top Level Mode Multiplexer block found in the
+IPQ8074 platform.
+
+- compatible:
+   Usage: required
+   Value type: 
+   Definition: must be "qcom,ipq8074-pinctrl"
+
+- reg:
+   Usage: required
+   Value type: 
+   Definition: the base address and size of the TLMM register space.
+
+- interrupts:
+   Usage: required
+   Value type: 
+   Definition: should specify the TLMM summary IRQ.
+
+- interrupt-controller:
+   Usage: required
+   Value type: 
+   Definition: identifies this node as an interrupt controller
+
+- #interrupt-cells:
+   Usage: required
+   Value type: 
+   Definition: must be 2. Specifying the pin number and flags, as defined
+   in 
+
+- gpio-controller:
+   Usage: required
+   Value type: 
+   Definition: identifies this node as a gpio controller
+
+- #gpio-cells:
+   Usage: required
+   Value type: 
+   Definition: must be 2. Specifying the pin number and flags, as defined
+   in 
+
+Please refer to ../gpio/gpio.txt and ../interrupt-controller/interrupts.txt for
+a general description of GPIO and interrupt bindings.
+
+Please refer to pinctrl-bindings.txt in this directory for details of the
+common pinctrl bindings used by client devices, including the meaning of the
+phrase "pin configuration node".
+
+The pin configuration nodes act as a container for an arbitrary number of
+subnodes. Each of these subnodes represents some desired configuration for a
+pin, a group, or a list of pins or groups. This configuration can include the
+mux function to select on those pin(s)/group(s), and various pin configuration
+parameters, such as pull-up, drive strength, etc.
+
+
+PIN CONFIGURATION NODES:
+
+The name of each subnode is not important; all subnodes should be enumerated
+and processed purely based on their content.
+
+Each subnode only affects those parameters that are explicitly listed. In
+other words, a subnode that lists a mux function but no pin configuration
+parameters implies no information about any pin configuration parameters.
+Similarly, a pin subnode that describes a pullup parameter implies no
+information about e.g. the mux function.
+
+
+The following generic properties as defined in pinctrl-bindings.txt are valid
+to specify in a pin configuration subnode:
+
+- pins:
+   Usage: required
+   Value type: 
+   Definition: List of gpio pins affected by the properties specified in
+   this subnode.  Valid pins are:
+   gpio0-gpio121,
+   sdc1_clk,
+   sdc1_cmd,
+   sdc1_data
+   sdc2_clk,
+   sdc2_cmd,
+   sdc2_data,
+   qdsd_cmd,
+   qdsd_data0,
+   qdsd_data1,
+   qdsd_data2,
+   qdsd_data3
+
+- function:
+   Usage: required
+   Value type: 
+   Definition: Specify the alternative function to be configured for the
+   specified pins. Functions are only valid for gpio pins.
+   Valid values are:
+   gpio, qpic_pad, blsp5_i2c, blsp5_spi, wci20, blsp3_spi3,
+   burn0, pcm_zsi0, blsp5_uart, mac12, blsp3_spi0, burn1, mac01,
+   qdss_cti_trig_out_b0, qdss_cti_trig_in_b0, qpic_pad4,
+   blsp4_uart0, blsp4_i2c0, blsp4_spi0, mac21,
+   qdss_cti_trig_out_b1, qpic_pad5, qdss_cti_trig_in_b1,
+   qpic_pad6, qpic_pad7, cxc0, mac13, qdss_cti_trig_in_a1,
+   qdss_cti_trig_out_a1, wci22, qdss_cti_trig_in_a0, qpic_pad1,
+   qdss_cti_trig_out_a0, qpic_pad2, qpic_pad3, qdss_traceclk_b,
+   qpic_pad0, qdss_tracectl_b, qpic_pad8, pcm_zsi1,
+   qdss_tracedata_b, led0, pwm04, led1, pwm14, led2, pwm24,
+   pwm00, blsp4_uart1, blsp4_i2c1, blsp4_spi1, wci23, mac11,
+   blsp3_spi2, pwm10, pwm20, pwm30, audio_txmclk, 

[PATCH] regulator: max8997/8966: fix charger cv voltage set bug

2017-05-07 Thread MyungJoo Ham

When min charger-CV is <= 4.0V and max charger-CV is >= 4.0V,
we can use 4.00V as CV (register value = 0x1).`

The original code had a typo that wrote ">=" (max_uV >= 400),
which should've been "<", which is not necessary anyway
as mentioned by Dan Carpenter.

Reported-By: Dan Carpenter 
Signed-off-by: MyungJoo Ham 
---
 drivers/regulator/max8997-regulator.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/regulator/max8997-regulator.c 
b/drivers/regulator/max8997-regulator.c
index efabc0e..559b9ac 100644
--- a/drivers/regulator/max8997-regulator.c
+++ b/drivers/regulator/max8997-regulator.c
@@ -428,12 +428,9 @@ static int max8997_set_voltage_charger_cv(struct 
regulator_dev *rdev,
if (max_uV < 400 || min_uV > 435)
return -EINVAL;
 
-   if (min_uV <= 400) {
-   if (max_uV >= 400)
-   return -EINVAL;
-   else
-   val = 0x1;
-   } else if (min_uV <= 420 && max_uV >= 420)
+   if (min_uV <= 400)
+   val = 0x1;
+   else if (min_uV <= 420 && max_uV >= 420)
val = 0x0;
else {
lb = (min_uV - 401) / 2 + 2;
-- 
1.9.1



[PATCH] regulator: max8997/8966: fix charger cv voltage set bug

2017-05-07 Thread MyungJoo Ham

When min charger-CV is <= 4.0V and max charger-CV is >= 4.0V,
we can use 4.00V as CV (register value = 0x1).`

The original code had a typo that wrote ">=" (max_uV >= 400),
which should've been "<", which is not necessary anyway
as mentioned by Dan Carpenter.

Reported-By: Dan Carpenter 
Signed-off-by: MyungJoo Ham 
---
 drivers/regulator/max8997-regulator.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/regulator/max8997-regulator.c 
b/drivers/regulator/max8997-regulator.c
index efabc0e..559b9ac 100644
--- a/drivers/regulator/max8997-regulator.c
+++ b/drivers/regulator/max8997-regulator.c
@@ -428,12 +428,9 @@ static int max8997_set_voltage_charger_cv(struct 
regulator_dev *rdev,
if (max_uV < 400 || min_uV > 435)
return -EINVAL;
 
-   if (min_uV <= 400) {
-   if (max_uV >= 400)
-   return -EINVAL;
-   else
-   val = 0x1;
-   } else if (min_uV <= 420 && max_uV >= 420)
+   if (min_uV <= 400)
+   val = 0x1;
+   else if (min_uV <= 420 && max_uV >= 420)
val = 0x0;
else {
lb = (min_uV - 401) / 2 + 2;
-- 
1.9.1



Re: [PATCH 4.10 0/5] 4.10.15-stable review

2017-05-07 Thread Greg Kroah-Hartman
On Sun, May 07, 2017 at 12:38:53PM -0700, Guenter Roeck wrote:
> On 05/06/2017 01:39 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.10.15 release.
> > There are 5 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Mon May  8 20:38:36 UTC 2017.
> > Anything received after that time might be too late.
> > 
> 
> 
> Build results:
>   total: 145 pass: 145 fail: 0
> Qemu test results:
>   total: 122 pass: 122 fail: 0
> 
> Details are available at http://kerneltests.org/builders.

Thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH 4.10 0/5] 4.10.15-stable review

2017-05-07 Thread Greg Kroah-Hartman
On Sun, May 07, 2017 at 12:38:53PM -0700, Guenter Roeck wrote:
> On 05/06/2017 01:39 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.10.15 release.
> > There are 5 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Mon May  8 20:38:36 UTC 2017.
> > Anything received after that time might be too late.
> > 
> 
> 
> Build results:
>   total: 145 pass: 145 fail: 0
> Qemu test results:
>   total: 122 pass: 122 fail: 0
> 
> Details are available at http://kerneltests.org/builders.

Thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH] staging: iio: meter: This patch fixes warnings found in ade7753.c

2017-05-07 Thread Greg KH
On Mon, May 08, 2017 at 01:38:47AM -0400, harinath Nampally wrote:
> Hi Greg,
> 
> Thank you so much for your time to review. 
> Sure I will break up into multiple patches and create patch series.
> But I wonder 'one thing' per patch means one kind of warnings per patch?

Yes it does.

> For example if I have multiple fixes for similar warnings like 'Symbolic
> permissions'. Can they all go in a single patch.

Yes.

>  So patch 0/10 for all 'Symbolic permissions' and path 1/10 for all 'alignment
> warnings' and so on and so forth.

Yes, that is correct.

And note, the mailing lists reject html email, so you should fix up your
email client to not send that :)

thanks,

greg k-h


Re: [PATCH] staging: iio: meter: This patch fixes warnings found in ade7753.c

2017-05-07 Thread Greg KH
On Mon, May 08, 2017 at 01:38:47AM -0400, harinath Nampally wrote:
> Hi Greg,
> 
> Thank you so much for your time to review. 
> Sure I will break up into multiple patches and create patch series.
> But I wonder 'one thing' per patch means one kind of warnings per patch?

Yes it does.

> For example if I have multiple fixes for similar warnings like 'Symbolic
> permissions'. Can they all go in a single patch.

Yes.

>  So patch 0/10 for all 'Symbolic permissions' and path 1/10 for all 'alignment
> warnings' and so on and so forth.

Yes, that is correct.

And note, the mailing lists reject html email, so you should fix up your
email client to not send that :)

thanks,

greg k-h


Re: [PATCH V6 1/9] PM / OPP: Introduce "power-domain-opp" property

2017-05-07 Thread Rajendra Nayak


On 05/08/2017 09:45 AM, Viresh Kumar wrote:
> On 06-05-17, 11:58, Kevin Hilman wrote:
>> Rob Herring  writes:
>>
>>> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
 Power-domains need to express their active states in DT and the devices
 within the power-domain need to express their dependency on those active
 states. The power-domains can use the OPP tables without any
 modifications to the bindings.

 Add a new property "power-domain-opp", which will contain phandle to the
 OPP node of the parent power domain. This is required for devices which
 have dependency on the configured active state of the power domain for
 their working.

 For some platforms the actual frequency and voltages of the power
 domains are managed by the firmware and are so hidden from the high
 level operating system. The "opp-hz" property is relaxed a bit to
 contain indexes instead of actual frequency values to support such
 platforms.

 Signed-off-by: Viresh Kumar 
 ---
  Documentation/devicetree/bindings/opp/opp.txt | 74 
 ++-
  1 file changed, 73 insertions(+), 1 deletion(-)

 diff --git a/Documentation/devicetree/bindings/opp/opp.txt 
 b/Documentation/devicetree/bindings/opp/opp.txt
 index 63725498bd20..6e30cae2a936 100644
 --- a/Documentation/devicetree/bindings/opp/opp.txt
 +++ b/Documentation/devicetree/bindings/opp/opp.txt
 @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations 
 along with other related
  properties.
  
  Required properties:
 -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
 +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In 
 some
 +  cases the exact frequency in Hz may be hidden from the OS by the 
 firmware and
 +  this field may contain values that represent the frequency in a firmware
 +  dependent way, for example an index of an array in the firmware.
>>>
>>> Not really sure OPP binding makes sense here.
>>
>> I think OPP makes perfect sense here, because microcontroller firmware
>> is managaging OPPs in hardware.  We just may not know the exact voltage
>> and/or frequency (and the firmware/hardware may even be doing AVS for
>> micro-adjustments.)
> 
> Yes, AVS is being done for the Qcom SoC as well.
> 
>>> What about all the other properties. We expose voltage, but not freq?
>>
>> I had the same question.  Seems the same comment about an abstract
>> "index" is needed for voltage also.
> 
> Why should we do that? Here are the cases that I had in mind while writing 
> this:
> 
> - DT only contains the performance-index and nothing else (i.e. voltages 
> aren't
>   exposed).
> 
>   We wouldn't be required to fill the microvolt property as it is optional.

So the performance-index is specified in opp-hz property?
What if the microcontroller firmware maps the performance-index to voltage but
expects linux to scale the frequency? There is no way to specify a 
performance-index
*and* a frequency for a OPP now I guess?

> 
> - DT contains both performance-index and voltages.
> 
>   The microvolts property will contain the actual voltages and opp-hz will
>   contain the index.

So this is for cases where the performance-index maps to a freq managed by the
microcontroller and voltages managed by linux? I have a case of exact opposite
and I don't see now how to handle it now with these bindings.

> 
> I don't see why would we like to put some index value in the microvolts
> property. We are setting the index value in the opp-hz property to avoid 
> adding
> extra fields and making sure opp-hz is still the unique property for the 
> nodes.

Maybe to handle the case like what I described above?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH V6 1/9] PM / OPP: Introduce "power-domain-opp" property

2017-05-07 Thread Rajendra Nayak


On 05/08/2017 09:45 AM, Viresh Kumar wrote:
> On 06-05-17, 11:58, Kevin Hilman wrote:
>> Rob Herring  writes:
>>
>>> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
 Power-domains need to express their active states in DT and the devices
 within the power-domain need to express their dependency on those active
 states. The power-domains can use the OPP tables without any
 modifications to the bindings.

 Add a new property "power-domain-opp", which will contain phandle to the
 OPP node of the parent power domain. This is required for devices which
 have dependency on the configured active state of the power domain for
 their working.

 For some platforms the actual frequency and voltages of the power
 domains are managed by the firmware and are so hidden from the high
 level operating system. The "opp-hz" property is relaxed a bit to
 contain indexes instead of actual frequency values to support such
 platforms.

 Signed-off-by: Viresh Kumar 
 ---
  Documentation/devicetree/bindings/opp/opp.txt | 74 
 ++-
  1 file changed, 73 insertions(+), 1 deletion(-)

 diff --git a/Documentation/devicetree/bindings/opp/opp.txt 
 b/Documentation/devicetree/bindings/opp/opp.txt
 index 63725498bd20..6e30cae2a936 100644
 --- a/Documentation/devicetree/bindings/opp/opp.txt
 +++ b/Documentation/devicetree/bindings/opp/opp.txt
 @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations 
 along with other related
  properties.
  
  Required properties:
 -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
 +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In 
 some
 +  cases the exact frequency in Hz may be hidden from the OS by the 
 firmware and
 +  this field may contain values that represent the frequency in a firmware
 +  dependent way, for example an index of an array in the firmware.
>>>
>>> Not really sure OPP binding makes sense here.
>>
>> I think OPP makes perfect sense here, because microcontroller firmware
>> is managaging OPPs in hardware.  We just may not know the exact voltage
>> and/or frequency (and the firmware/hardware may even be doing AVS for
>> micro-adjustments.)
> 
> Yes, AVS is being done for the Qcom SoC as well.
> 
>>> What about all the other properties. We expose voltage, but not freq?
>>
>> I had the same question.  Seems the same comment about an abstract
>> "index" is needed for voltage also.
> 
> Why should we do that? Here are the cases that I had in mind while writing 
> this:
> 
> - DT only contains the performance-index and nothing else (i.e. voltages 
> aren't
>   exposed).
> 
>   We wouldn't be required to fill the microvolt property as it is optional.

So the performance-index is specified in opp-hz property?
What if the microcontroller firmware maps the performance-index to voltage but
expects linux to scale the frequency? There is no way to specify a 
performance-index
*and* a frequency for a OPP now I guess?

> 
> - DT contains both performance-index and voltages.
> 
>   The microvolts property will contain the actual voltages and opp-hz will
>   contain the index.

So this is for cases where the performance-index maps to a freq managed by the
microcontroller and voltages managed by linux? I have a case of exact opposite
and I don't see now how to handle it now with these bindings.

> 
> I don't see why would we like to put some index value in the microvolts
> property. We are setting the index value in the opp-hz property to avoid 
> adding
> extra fields and making sure opp-hz is still the unique property for the 
> nodes.

Maybe to handle the case like what I described above?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH 7/7] DWARF: add the config option

2017-05-07 Thread Andy Lutomirski
On Sun, May 7, 2017 at 9:55 AM, Josh Poimboeuf  wrote:
> On Fri, May 05, 2017 at 12:57:11PM -0700, Linus Torvalds wrote:
>> On Fri, May 5, 2017 at 5:22 AM, Jiri Slaby  wrote:
>> > The DWARF unwinder is in place and ready. So introduce the config option
>> > to allow users to enable it. It is by default off due to missing
>> > assembly annotations.
>>
>> Who actually ends up using this?
>>
>> Because from the last time we had fancy unwindoers, and all the
>> problems it caused for oops handling with absolutely _zero_ upsides
>> ever, I do not ever again want to see fancy unwinders with complex
>> state machine handling used by the oopsing code.
>>
>> The fact that it gets disabled for KASAN also makes me suspicious. It
>> basically means that now all the accesses it does are not even
>> validated.
>>
>> The fact that the most of the code seems to be disabled for the first
>> six patches, and then just enabled in the last patch, also seems to
>> mean that the series also gets no bisection coverage or testing that
>> the individual patches make any sense. (ie there's a lot of code
>> inside "CONFIG_DWARF_UNWIND" in the early patches but that config
>> option cannot even be enabled until the last patch).
>>
>> We used to have nasty issues with not just missing dwarf info, but
>> also actively *wrong* dwarf info. Compiler versions that generated
>> subtly wrong info, because nobody actually really depended on it, and
>> the people who had tested it seldom did the kinds of things we do in
>> the kernel (eg inline asms etc).
>>
>> So I'm personally still very suspicious of these things.
>>
>> Last time I had huge issues with people also always blaming *anything*
>> else than that unwinder. It was always "oh, somebody wrote asm without
>> getting it right". Or "oh, the compiler generated bad tables, it's not
>> *my* fault that now the kernel oopsing code no longer works".
>>
>> When I asked for more stricter debug table validation to avoid issues,
>> it was always "oh, we fixed it, no worries", and then two months later
>> somebody hit another issue.
>>
>> Put another way; the last time we did crazy stuff like this, it got
>> reverted. For a damn good reason, despite some people being in denial
>> about those reasons.
>
> Here's another possible idea that's been rattling around in my head.
> It's purely theoretical at this point, so I don't know for sure that it
> would work.  But I haven't been able to find any major issues with it
> yet.
>
> DWARF is great for debuggers.  It helps you find all the registers on
> the stack, so you can see function arguments and local variables.  All
> expressed in a nice compact format.
>
> But that's overkill for unwinders.  We don't need all those registers,
> and the state machine is too complicated.  Unwinders basically only need
> to know one thing: given an instruction address and a stack pointer,
> where is the caller's stack frame?

I think that, if the code were sufficiently robust, it would be handy
if the unwinder displayed function arguments.  DWARF can do that to a
limited extent.

That being said, having a simpler table format would probably cover
most of the use cases.


Re: [PATCH 7/7] DWARF: add the config option

2017-05-07 Thread Andy Lutomirski
On Sun, May 7, 2017 at 9:55 AM, Josh Poimboeuf  wrote:
> On Fri, May 05, 2017 at 12:57:11PM -0700, Linus Torvalds wrote:
>> On Fri, May 5, 2017 at 5:22 AM, Jiri Slaby  wrote:
>> > The DWARF unwinder is in place and ready. So introduce the config option
>> > to allow users to enable it. It is by default off due to missing
>> > assembly annotations.
>>
>> Who actually ends up using this?
>>
>> Because from the last time we had fancy unwindoers, and all the
>> problems it caused for oops handling with absolutely _zero_ upsides
>> ever, I do not ever again want to see fancy unwinders with complex
>> state machine handling used by the oopsing code.
>>
>> The fact that it gets disabled for KASAN also makes me suspicious. It
>> basically means that now all the accesses it does are not even
>> validated.
>>
>> The fact that the most of the code seems to be disabled for the first
>> six patches, and then just enabled in the last patch, also seems to
>> mean that the series also gets no bisection coverage or testing that
>> the individual patches make any sense. (ie there's a lot of code
>> inside "CONFIG_DWARF_UNWIND" in the early patches but that config
>> option cannot even be enabled until the last patch).
>>
>> We used to have nasty issues with not just missing dwarf info, but
>> also actively *wrong* dwarf info. Compiler versions that generated
>> subtly wrong info, because nobody actually really depended on it, and
>> the people who had tested it seldom did the kinds of things we do in
>> the kernel (eg inline asms etc).
>>
>> So I'm personally still very suspicious of these things.
>>
>> Last time I had huge issues with people also always blaming *anything*
>> else than that unwinder. It was always "oh, somebody wrote asm without
>> getting it right". Or "oh, the compiler generated bad tables, it's not
>> *my* fault that now the kernel oopsing code no longer works".
>>
>> When I asked for more stricter debug table validation to avoid issues,
>> it was always "oh, we fixed it, no worries", and then two months later
>> somebody hit another issue.
>>
>> Put another way; the last time we did crazy stuff like this, it got
>> reverted. For a damn good reason, despite some people being in denial
>> about those reasons.
>
> Here's another possible idea that's been rattling around in my head.
> It's purely theoretical at this point, so I don't know for sure that it
> would work.  But I haven't been able to find any major issues with it
> yet.
>
> DWARF is great for debuggers.  It helps you find all the registers on
> the stack, so you can see function arguments and local variables.  All
> expressed in a nice compact format.
>
> But that's overkill for unwinders.  We don't need all those registers,
> and the state machine is too complicated.  Unwinders basically only need
> to know one thing: given an instruction address and a stack pointer,
> where is the caller's stack frame?

I think that, if the code were sufficiently robust, it would be handy
if the unwinder displayed function arguments.  DWARF can do that to a
limited extent.

That being said, having a simpler table format would probably cover
most of the use cases.


Re: [lkp-robot] [generic_file_read_iter()] 5ecda13711: BUG:KASAN:stack-out-of-bounds

2017-05-07 Thread Al Viro
On Mon, May 08, 2017 at 09:22:38AM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> commit: 5ecda13711b3bd4a750b5740897bf13d1720de7c ("generic_file_read_iter(): 
> make use of iov_iter_revert()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: ocfs2test
> with following parameters:
> 
>   disk: 1HDD
>   test: test-backup_super
> 
> 
> 
> on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4G
> 
> caused below changes (please refer to attached dmesg/kmsg for entire 
> log/backtrace):

Very interesting...  It looks like that has nothing to do with ocfs2 -
it seems to be O_DIRECT read on block device and I wonder how come that
nothing in LTP/xfstests has stepped into that...



Bloody hell...  OK, this is absolutely insane; there's an obvious braino
in that sucker - it should be
iov_iter_revert(iter, count - iov_iter_count(iter));
not
iov_iter_revert(iter, iov_iter_count(iter) - count);
We want "how much has ->direct_IO() overconsumed", i.e. "how much should've
been left judging by the retval - how much is actually left".  How the
hell did avoid being caught by the very first O_DIRECT read that had lead
to overconsumption?

I'm half-asleep right now; the first thing tomorrow morning will be to
sort the thing out and find how the hell has it avoided being caught.

Looking at other callers, this seems to be the only victim of such
idiocy.  Ugh...

Among other things, I'm going to add WARN_ON(unroll > MAX_RW_COUNT); in
iov_iter_revert() - should've done that from the very beginning.


Re: [lkp-robot] [generic_file_read_iter()] 5ecda13711: BUG:KASAN:stack-out-of-bounds

2017-05-07 Thread Al Viro
On Mon, May 08, 2017 at 09:22:38AM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> commit: 5ecda13711b3bd4a750b5740897bf13d1720de7c ("generic_file_read_iter(): 
> make use of iov_iter_revert()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: ocfs2test
> with following parameters:
> 
>   disk: 1HDD
>   test: test-backup_super
> 
> 
> 
> on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4G
> 
> caused below changes (please refer to attached dmesg/kmsg for entire 
> log/backtrace):

Very interesting...  It looks like that has nothing to do with ocfs2 -
it seems to be O_DIRECT read on block device and I wonder how come that
nothing in LTP/xfstests has stepped into that...



Bloody hell...  OK, this is absolutely insane; there's an obvious braino
in that sucker - it should be
iov_iter_revert(iter, count - iov_iter_count(iter));
not
iov_iter_revert(iter, iov_iter_count(iter) - count);
We want "how much has ->direct_IO() overconsumed", i.e. "how much should've
been left judging by the retval - how much is actually left".  How the
hell did avoid being caught by the very first O_DIRECT read that had lead
to overconsumption?

I'm half-asleep right now; the first thing tomorrow morning will be to
sort the thing out and find how the hell has it avoided being caught.

Looking at other callers, this seems to be the only victim of such
idiocy.  Ugh...

Among other things, I'm going to add WARN_ON(unroll > MAX_RW_COUNT); in
iov_iter_revert() - should've done that from the very beginning.


Issues with make-kpkg while compiling Linux Kernel From Source.

2017-05-07 Thread Anil Nair
Hi,

A file named "REPORTING-BUGS" is moved (and renamed and edited) to
"Documentation/admin-guide/reporting-bugs.rst" as on linux kernel version v4.10.
(As mentioned in conversation with Randy Dunlap in lkml)

The command make-kpkg fails to build, the command that i used was,

"fakeroot make-kpkg --initrd --revision=1.0.AN kernel_image kernel_headers -j 4"

The error i received was,

"install -p-o root -g root  -m  644 REPORTING-BUGS
 
/home/anilnair/linux-kernel/linux-stable/debian/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/usr/share/doc/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/
install: cannot stat 'REPORTING-BUGS': No such file or directory
debian/ruleset/targets/headers.mk:40: recipe for target
'debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty'
failed
make[1]: *** 
[debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty]
Error 1
make[1]: Leaving directory '/home/anilnair/linux-kernel/linux-stable'
debian/ruleset/local.mk:102: recipe for target 'kernel_headers' failed
make: *** [kernel_headers] Error 2"

I am using linux kernel version 4.4.0-72-generic, and make-kpkg version 13.018.

I suppose i am addressing this issue to proper maintainer.


-- 
--
Regards,
Anil Nair


Issues with make-kpkg while compiling Linux Kernel From Source.

2017-05-07 Thread Anil Nair
Hi,

A file named "REPORTING-BUGS" is moved (and renamed and edited) to
"Documentation/admin-guide/reporting-bugs.rst" as on linux kernel version v4.10.
(As mentioned in conversation with Randy Dunlap in lkml)

The command make-kpkg fails to build, the command that i used was,

"fakeroot make-kpkg --initrd --revision=1.0.AN kernel_image kernel_headers -j 4"

The error i received was,

"install -p-o root -g root  -m  644 REPORTING-BUGS
 
/home/anilnair/linux-kernel/linux-stable/debian/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/usr/share/doc/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/
install: cannot stat 'REPORTING-BUGS': No such file or directory
debian/ruleset/targets/headers.mk:40: recipe for target
'debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty'
failed
make[1]: *** 
[debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty]
Error 1
make[1]: Leaving directory '/home/anilnair/linux-kernel/linux-stable'
debian/ruleset/local.mk:102: recipe for target 'kernel_headers' failed
make: *** [kernel_headers] Error 2"

I am using linux kernel version 4.4.0-72-generic, and make-kpkg version 13.018.

I suppose i am addressing this issue to proper maintainer.


-- 
--
Regards,
Anil Nair


Re: [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely

2017-05-07 Thread Wanpeng Li
2017-05-08 12:01 GMT+08:00 Viresh Kumar :
> On 08-05-17, 11:49, Wanpeng Li wrote:
>> Hi Rafael,
>> 2017-03-22 7:08 GMT+08:00 Rafael J. Wysocki :
>> > From: Rafael J. Wysocki 
>> >
>> > The way the schedutil governor uses the PELT metric causes it to
>> > underestimate the CPU utilization in some cases.
>> >
>> > That can be easily demonstrated by running kernel compilation on
>> > a Sandy Bridge Intel processor, running turbostat in parallel with
>> > it and looking at the values written to the MSR_IA32_PERF_CTL
>> > register.  Namely, the expected result would be that when all CPUs
>> > were 100% busy, all of them would be requested to run in the maximum
>> > P-state, but observation shows that this clearly isn't the case.
>> > The CPUs run in the maximum P-state for a while and then are
>> > requested to run slower and go back to the maximum P-state after
>> > a while again.  That causes the actual frequency of the processor to
>> > visibly oscillate below the sustainable maximum in a jittery fashion
>> > which clearly is not desirable.
>> >
>> > That has been attributed to CPU utilization metric updates on task
>> > migration that cause the total utilization value for the CPU to be
>> > reduced by the utilization of the migrated task.  If that happens,
>> > the schedutil governor may see a CPU utilization reduction and will
>> > attempt to reduce the CPU frequency accordingly right away.  That
>> > may be premature, though, for example if the system is generally
>> > busy and there are other runnable tasks waiting to be run on that
>> > CPU already.
>> >
>> > This is unlikely to be an issue on systems where cpufreq policies are
>> > shared between multiple CPUs, because in those cases the policy
>> > utilization is computed as the maximum of the CPU utilization values
>>
>> Sorry for one question maybe not associated with this patch. If the
>> cpufreq policy is shared between multiple CPUs, the function
>> intel_cpufreq_target()  just updates IA32_PERF_CTL MSR of the cpu
>> which is managing this policy, I wonder whether other cpus which are
>> affected should also update their per-logical cpu's IA32_PERF_CTL MSR?
>
> The CPUs share the policy when they share their freq/voltage rails and so
> changing perf state of one CPU should result in that changing for all the CPUs
> in that policy. Otherwise, they can't be considered to be part of the same
> policy.
>
> That's why this code is changing it only for policy->cpu alone.

I see, thanks for the explanation.

Regards,
Wanpeng Li


Re: [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely

2017-05-07 Thread Wanpeng Li
2017-05-08 12:01 GMT+08:00 Viresh Kumar :
> On 08-05-17, 11:49, Wanpeng Li wrote:
>> Hi Rafael,
>> 2017-03-22 7:08 GMT+08:00 Rafael J. Wysocki :
>> > From: Rafael J. Wysocki 
>> >
>> > The way the schedutil governor uses the PELT metric causes it to
>> > underestimate the CPU utilization in some cases.
>> >
>> > That can be easily demonstrated by running kernel compilation on
>> > a Sandy Bridge Intel processor, running turbostat in parallel with
>> > it and looking at the values written to the MSR_IA32_PERF_CTL
>> > register.  Namely, the expected result would be that when all CPUs
>> > were 100% busy, all of them would be requested to run in the maximum
>> > P-state, but observation shows that this clearly isn't the case.
>> > The CPUs run in the maximum P-state for a while and then are
>> > requested to run slower and go back to the maximum P-state after
>> > a while again.  That causes the actual frequency of the processor to
>> > visibly oscillate below the sustainable maximum in a jittery fashion
>> > which clearly is not desirable.
>> >
>> > That has been attributed to CPU utilization metric updates on task
>> > migration that cause the total utilization value for the CPU to be
>> > reduced by the utilization of the migrated task.  If that happens,
>> > the schedutil governor may see a CPU utilization reduction and will
>> > attempt to reduce the CPU frequency accordingly right away.  That
>> > may be premature, though, for example if the system is generally
>> > busy and there are other runnable tasks waiting to be run on that
>> > CPU already.
>> >
>> > This is unlikely to be an issue on systems where cpufreq policies are
>> > shared between multiple CPUs, because in those cases the policy
>> > utilization is computed as the maximum of the CPU utilization values
>>
>> Sorry for one question maybe not associated with this patch. If the
>> cpufreq policy is shared between multiple CPUs, the function
>> intel_cpufreq_target()  just updates IA32_PERF_CTL MSR of the cpu
>> which is managing this policy, I wonder whether other cpus which are
>> affected should also update their per-logical cpu's IA32_PERF_CTL MSR?
>
> The CPUs share the policy when they share their freq/voltage rails and so
> changing perf state of one CPU should result in that changing for all the CPUs
> in that policy. Otherwise, they can't be considered to be part of the same
> policy.
>
> That's why this code is changing it only for policy->cpu alone.

I see, thanks for the explanation.

Regards,
Wanpeng Li


[stable-4.10: PATCH] xen: revert commits 72a9b186292 and da72ff5bfcb0

2017-05-07 Thread Juergen Gross
Revert commit 72a9b186292 ("xen: Remove event channel notification
through Xen PCI platform device") as the original analysis was wrong
that all the removed code isn't in use any more. As commit da72ff5bfcb0
("partially revert xen: Remove event channel notification through Xen
PCI platform device") reverted already some parts of it revert this
commit, too.

It is still necessary for old Xen versions (< 4.0) and for being able
to run the Linux kernel as dom0 in a nested Xen environment.

This is a backport of upstream commit 84d582d236dc1f9085e741affc72e9ba061a67c2

Cc: Boris Ostrovsky 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Konrad Rzeszutek Wilk 
Cc: Bjorn Helgaas 
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Vitaly Kuznetsov 
Cc: Paul Gortmaker 
Cc: Ross Lagerwall 
Cc: xen-de...@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: Anthony Liguori 
Cc: KarimAllah Ahmed 
Signed-off-by: Juergen Gross 
---
Please apply to stable-4.10.y
---
 arch/x86/include/asm/xen/events.h | 11 +++
 arch/x86/pci/xen.c|  2 +-
 arch/x86/xen/enlighten.c  | 21 +++--
 arch/x86/xen/smp.c|  2 ++
 arch/x86/xen/time.c   |  5 +
 drivers/xen/events/events_base.c  | 26 +-
 drivers/xen/platform-pci.c| 13 +++--
 include/xen/xen.h |  3 ++-
 8 files changed, 56 insertions(+), 27 deletions(-)

diff --git a/arch/x86/include/asm/xen/events.h 
b/arch/x86/include/asm/xen/events.h
index 608a79d5a466..e6911caf5bbf 100644
--- a/arch/x86/include/asm/xen/events.h
+++ b/arch/x86/include/asm/xen/events.h
@@ -20,4 +20,15 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
 /* No need for a barrier -- XCHG is a barrier on x86. */
 #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
 
+extern int xen_have_vector_callback;
+
+/*
+ * Events delivered via platform PCI interrupts are always
+ * routed to vcpu 0 and hence cannot be rebound.
+ */
+static inline bool xen_support_evtchn_rebind(void)
+{
+   return (!xen_hvm_domain() || xen_have_vector_callback);
+}
+
 #endif /* _ASM_X86_XEN_EVENTS_H */
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 292ab0364a89..c4b3646bd04c 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -447,7 +447,7 @@ void __init xen_msi_init(void)
 
 int __init pci_xen_hvm_init(void)
 {
-   if (!xen_feature(XENFEAT_hvm_pirqs))
+   if (!xen_have_vector_callback || !xen_feature(XENFEAT_hvm_pirqs))
return 0;
 
 #ifdef CONFIG_ACPI
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 51ef95232725..6623867cc0d4 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -137,6 +137,8 @@ struct shared_info xen_dummy_shared_info;
 void *xen_initial_gdt;
 
 RESERVE_BRK(shared_info_page_brk, PAGE_SIZE);
+__read_mostly int xen_have_vector_callback;
+EXPORT_SYMBOL_GPL(xen_have_vector_callback);
 
 static int xen_cpu_up_prepare(unsigned int cpu);
 static int xen_cpu_up_online(unsigned int cpu);
@@ -1508,7 +1510,10 @@ static void __init xen_pvh_early_guest_init(void)
if (!xen_feature(XENFEAT_auto_translated_physmap))
return;
 
-   BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
+   if (!xen_feature(XENFEAT_hvm_callback_vector))
+   return;
+
+   xen_have_vector_callback = 1;
 
xen_pvh_early_cpu_init(0, false);
xen_pvh_set_cr_flags(0);
@@ -1847,7 +1852,9 @@ static int xen_cpu_up_prepare(unsigned int cpu)
xen_vcpu_setup(cpu);
}
 
-   if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
+   if (xen_pv_domain() ||
+   (xen_have_vector_callback &&
+xen_feature(XENFEAT_hvm_safe_pvclock)))
xen_setup_timer(cpu);
 
rc = xen_smp_intr_init(cpu);
@@ -1863,7 +1870,9 @@ static int xen_cpu_dead(unsigned int cpu)
 {
xen_smp_intr_free(cpu);
 
-   if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
+   if (xen_pv_domain() ||
+   (xen_have_vector_callback &&
+xen_feature(XENFEAT_hvm_safe_pvclock)))
xen_teardown_timer(cpu);
 
return 0;
@@ -1902,8 +1911,8 @@ static void __init xen_hvm_guest_init(void)
 
xen_panic_handler_init();
 
-   BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
-
+   if (xen_feature(XENFEAT_hvm_callback_vector))
+   xen_have_vector_callback = 1;
xen_hvm_smp_init();
WARN_ON(xen_cpuhp_setup());
xen_unplug_emulated_devices();
@@ -1941,7 +1950,7 @@ bool xen_hvm_need_lapic(void)
 

[stable-4.10: PATCH] xen: revert commits 72a9b186292 and da72ff5bfcb0

2017-05-07 Thread Juergen Gross
Revert commit 72a9b186292 ("xen: Remove event channel notification
through Xen PCI platform device") as the original analysis was wrong
that all the removed code isn't in use any more. As commit da72ff5bfcb0
("partially revert xen: Remove event channel notification through Xen
PCI platform device") reverted already some parts of it revert this
commit, too.

It is still necessary for old Xen versions (< 4.0) and for being able
to run the Linux kernel as dom0 in a nested Xen environment.

This is a backport of upstream commit 84d582d236dc1f9085e741affc72e9ba061a67c2

Cc: Boris Ostrovsky 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Konrad Rzeszutek Wilk 
Cc: Bjorn Helgaas 
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Vitaly Kuznetsov 
Cc: Paul Gortmaker 
Cc: Ross Lagerwall 
Cc: xen-de...@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: Anthony Liguori 
Cc: KarimAllah Ahmed 
Signed-off-by: Juergen Gross 
---
Please apply to stable-4.10.y
---
 arch/x86/include/asm/xen/events.h | 11 +++
 arch/x86/pci/xen.c|  2 +-
 arch/x86/xen/enlighten.c  | 21 +++--
 arch/x86/xen/smp.c|  2 ++
 arch/x86/xen/time.c   |  5 +
 drivers/xen/events/events_base.c  | 26 +-
 drivers/xen/platform-pci.c| 13 +++--
 include/xen/xen.h |  3 ++-
 8 files changed, 56 insertions(+), 27 deletions(-)

diff --git a/arch/x86/include/asm/xen/events.h 
b/arch/x86/include/asm/xen/events.h
index 608a79d5a466..e6911caf5bbf 100644
--- a/arch/x86/include/asm/xen/events.h
+++ b/arch/x86/include/asm/xen/events.h
@@ -20,4 +20,15 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
 /* No need for a barrier -- XCHG is a barrier on x86. */
 #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
 
+extern int xen_have_vector_callback;
+
+/*
+ * Events delivered via platform PCI interrupts are always
+ * routed to vcpu 0 and hence cannot be rebound.
+ */
+static inline bool xen_support_evtchn_rebind(void)
+{
+   return (!xen_hvm_domain() || xen_have_vector_callback);
+}
+
 #endif /* _ASM_X86_XEN_EVENTS_H */
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 292ab0364a89..c4b3646bd04c 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -447,7 +447,7 @@ void __init xen_msi_init(void)
 
 int __init pci_xen_hvm_init(void)
 {
-   if (!xen_feature(XENFEAT_hvm_pirqs))
+   if (!xen_have_vector_callback || !xen_feature(XENFEAT_hvm_pirqs))
return 0;
 
 #ifdef CONFIG_ACPI
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 51ef95232725..6623867cc0d4 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -137,6 +137,8 @@ struct shared_info xen_dummy_shared_info;
 void *xen_initial_gdt;
 
 RESERVE_BRK(shared_info_page_brk, PAGE_SIZE);
+__read_mostly int xen_have_vector_callback;
+EXPORT_SYMBOL_GPL(xen_have_vector_callback);
 
 static int xen_cpu_up_prepare(unsigned int cpu);
 static int xen_cpu_up_online(unsigned int cpu);
@@ -1508,7 +1510,10 @@ static void __init xen_pvh_early_guest_init(void)
if (!xen_feature(XENFEAT_auto_translated_physmap))
return;
 
-   BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
+   if (!xen_feature(XENFEAT_hvm_callback_vector))
+   return;
+
+   xen_have_vector_callback = 1;
 
xen_pvh_early_cpu_init(0, false);
xen_pvh_set_cr_flags(0);
@@ -1847,7 +1852,9 @@ static int xen_cpu_up_prepare(unsigned int cpu)
xen_vcpu_setup(cpu);
}
 
-   if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
+   if (xen_pv_domain() ||
+   (xen_have_vector_callback &&
+xen_feature(XENFEAT_hvm_safe_pvclock)))
xen_setup_timer(cpu);
 
rc = xen_smp_intr_init(cpu);
@@ -1863,7 +1870,9 @@ static int xen_cpu_dead(unsigned int cpu)
 {
xen_smp_intr_free(cpu);
 
-   if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
+   if (xen_pv_domain() ||
+   (xen_have_vector_callback &&
+xen_feature(XENFEAT_hvm_safe_pvclock)))
xen_teardown_timer(cpu);
 
return 0;
@@ -1902,8 +1911,8 @@ static void __init xen_hvm_guest_init(void)
 
xen_panic_handler_init();
 
-   BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
-
+   if (xen_feature(XENFEAT_hvm_callback_vector))
+   xen_have_vector_callback = 1;
xen_hvm_smp_init();
WARN_ON(xen_cpuhp_setup());
xen_unplug_emulated_devices();
@@ -1941,7 +1950,7 @@ bool xen_hvm_need_lapic(void)
return false;
if (!xen_hvm_domain())
return false;
-   if (xen_feature(XENFEAT_hvm_pirqs))
+   if (xen_feature(XENFEAT_hvm_pirqs) && xen_have_vector_callback)
return false;
return true;
 }
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c

[stable-4.9: PATCH] xen: revert commit 72a9b186292

2017-05-07 Thread Juergen Gross
Revert commit 72a9b186292 ("xen: Remove event channel notification
through Xen PCI platform device") as the original analysis was wrong
that all the removed code isn't in use any more.

It is still necessary for old Xen versions (< 4.0) and for being able
to run the Linux kernel as dom0 in a nested Xen environment.

This is a backport of upstream commit 84d582d236dc1f9085e741affc72e9ba061a67c2

Cc: Boris Ostrovsky 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Konrad Rzeszutek Wilk 
Cc: Bjorn Helgaas 
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Vitaly Kuznetsov 
Cc: Paul Gortmaker 
Cc: Ross Lagerwall 
Cc: xen-de...@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: Anthony Liguori 
Cc: KarimAllah Ahmed 
Signed-off-by: Juergen Gross 
---
Please apply to stable-4.9.y
---
 arch/x86/include/asm/xen/events.h | 11 +++
 arch/x86/pci/xen.c|  2 +-
 arch/x86/xen/enlighten.c  | 21 +
 arch/x86/xen/smp.c|  2 ++
 arch/x86/xen/time.c   |  5 +++
 drivers/xen/events/events_base.c  | 26 ++--
 drivers/xen/platform-pci.c| 64 +++
 include/xen/xen.h |  3 +-
 8 files changed, 117 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/xen/events.h 
b/arch/x86/include/asm/xen/events.h
index 608a79d5a466..e6911caf5bbf 100644
--- a/arch/x86/include/asm/xen/events.h
+++ b/arch/x86/include/asm/xen/events.h
@@ -20,4 +20,15 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
 /* No need for a barrier -- XCHG is a barrier on x86. */
 #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
 
+extern int xen_have_vector_callback;
+
+/*
+ * Events delivered via platform PCI interrupts are always
+ * routed to vcpu 0 and hence cannot be rebound.
+ */
+static inline bool xen_support_evtchn_rebind(void)
+{
+   return (!xen_hvm_domain() || xen_have_vector_callback);
+}
+
 #endif /* _ASM_X86_XEN_EVENTS_H */
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index a00a6c07bb6f..4ea9f290c19f 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -447,7 +447,7 @@ void __init xen_msi_init(void)
 
 int __init pci_xen_hvm_init(void)
 {
-   if (!xen_feature(XENFEAT_hvm_pirqs))
+   if (!xen_have_vector_callback || !xen_feature(XENFEAT_hvm_pirqs))
return 0;
 
 #ifdef CONFIG_ACPI
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index bdd855685403..8f1f7efa848c 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -137,6 +137,8 @@ struct shared_info xen_dummy_shared_info;
 void *xen_initial_gdt;
 
 RESERVE_BRK(shared_info_page_brk, PAGE_SIZE);
+__read_mostly int xen_have_vector_callback;
+EXPORT_SYMBOL_GPL(xen_have_vector_callback);
 
 static int xen_cpu_up_prepare(unsigned int cpu);
 static int xen_cpu_up_online(unsigned int cpu);
@@ -1521,7 +1523,10 @@ static void __init xen_pvh_early_guest_init(void)
if (!xen_feature(XENFEAT_auto_translated_physmap))
return;
 
-   BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
+   if (!xen_feature(XENFEAT_hvm_callback_vector))
+   return;
+
+   xen_have_vector_callback = 1;
 
xen_pvh_early_cpu_init(0, false);
xen_pvh_set_cr_flags(0);
@@ -1860,7 +1865,9 @@ static int xen_cpu_up_prepare(unsigned int cpu)
xen_vcpu_setup(cpu);
}
 
-   if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
+   if (xen_pv_domain() ||
+   (xen_have_vector_callback &&
+xen_feature(XENFEAT_hvm_safe_pvclock)))
xen_setup_timer(cpu);
 
rc = xen_smp_intr_init(cpu);
@@ -1876,7 +1883,9 @@ static int xen_cpu_dead(unsigned int cpu)
 {
xen_smp_intr_free(cpu);
 
-   if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
+   if (xen_pv_domain() ||
+   (xen_have_vector_callback &&
+xen_feature(XENFEAT_hvm_safe_pvclock)))
xen_teardown_timer(cpu);
 
return 0;
@@ -1915,8 +1924,8 @@ static void __init xen_hvm_guest_init(void)
 
xen_panic_handler_init();
 
-   BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
-
+   if (xen_feature(XENFEAT_hvm_callback_vector))
+   xen_have_vector_callback = 1;
xen_hvm_smp_init();
WARN_ON(xen_cpuhp_setup());
xen_unplug_emulated_devices();
@@ -1954,7 +1963,7 @@ bool xen_hvm_need_lapic(void)
return false;
if (!xen_hvm_domain())
return false;
-   if (xen_feature(XENFEAT_hvm_pirqs))
+   if (xen_feature(XENFEAT_hvm_pirqs) 

[stable-4.9: PATCH] xen: revert commit 72a9b186292

2017-05-07 Thread Juergen Gross
Revert commit 72a9b186292 ("xen: Remove event channel notification
through Xen PCI platform device") as the original analysis was wrong
that all the removed code isn't in use any more.

It is still necessary for old Xen versions (< 4.0) and for being able
to run the Linux kernel as dom0 in a nested Xen environment.

This is a backport of upstream commit 84d582d236dc1f9085e741affc72e9ba061a67c2

Cc: Boris Ostrovsky 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Konrad Rzeszutek Wilk 
Cc: Bjorn Helgaas 
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Vitaly Kuznetsov 
Cc: Paul Gortmaker 
Cc: Ross Lagerwall 
Cc: xen-de...@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: Anthony Liguori 
Cc: KarimAllah Ahmed 
Signed-off-by: Juergen Gross 
---
Please apply to stable-4.9.y
---
 arch/x86/include/asm/xen/events.h | 11 +++
 arch/x86/pci/xen.c|  2 +-
 arch/x86/xen/enlighten.c  | 21 +
 arch/x86/xen/smp.c|  2 ++
 arch/x86/xen/time.c   |  5 +++
 drivers/xen/events/events_base.c  | 26 ++--
 drivers/xen/platform-pci.c| 64 +++
 include/xen/xen.h |  3 +-
 8 files changed, 117 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/xen/events.h 
b/arch/x86/include/asm/xen/events.h
index 608a79d5a466..e6911caf5bbf 100644
--- a/arch/x86/include/asm/xen/events.h
+++ b/arch/x86/include/asm/xen/events.h
@@ -20,4 +20,15 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
 /* No need for a barrier -- XCHG is a barrier on x86. */
 #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
 
+extern int xen_have_vector_callback;
+
+/*
+ * Events delivered via platform PCI interrupts are always
+ * routed to vcpu 0 and hence cannot be rebound.
+ */
+static inline bool xen_support_evtchn_rebind(void)
+{
+   return (!xen_hvm_domain() || xen_have_vector_callback);
+}
+
 #endif /* _ASM_X86_XEN_EVENTS_H */
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index a00a6c07bb6f..4ea9f290c19f 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -447,7 +447,7 @@ void __init xen_msi_init(void)
 
 int __init pci_xen_hvm_init(void)
 {
-   if (!xen_feature(XENFEAT_hvm_pirqs))
+   if (!xen_have_vector_callback || !xen_feature(XENFEAT_hvm_pirqs))
return 0;
 
 #ifdef CONFIG_ACPI
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index bdd855685403..8f1f7efa848c 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -137,6 +137,8 @@ struct shared_info xen_dummy_shared_info;
 void *xen_initial_gdt;
 
 RESERVE_BRK(shared_info_page_brk, PAGE_SIZE);
+__read_mostly int xen_have_vector_callback;
+EXPORT_SYMBOL_GPL(xen_have_vector_callback);
 
 static int xen_cpu_up_prepare(unsigned int cpu);
 static int xen_cpu_up_online(unsigned int cpu);
@@ -1521,7 +1523,10 @@ static void __init xen_pvh_early_guest_init(void)
if (!xen_feature(XENFEAT_auto_translated_physmap))
return;
 
-   BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
+   if (!xen_feature(XENFEAT_hvm_callback_vector))
+   return;
+
+   xen_have_vector_callback = 1;
 
xen_pvh_early_cpu_init(0, false);
xen_pvh_set_cr_flags(0);
@@ -1860,7 +1865,9 @@ static int xen_cpu_up_prepare(unsigned int cpu)
xen_vcpu_setup(cpu);
}
 
-   if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
+   if (xen_pv_domain() ||
+   (xen_have_vector_callback &&
+xen_feature(XENFEAT_hvm_safe_pvclock)))
xen_setup_timer(cpu);
 
rc = xen_smp_intr_init(cpu);
@@ -1876,7 +1883,9 @@ static int xen_cpu_dead(unsigned int cpu)
 {
xen_smp_intr_free(cpu);
 
-   if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
+   if (xen_pv_domain() ||
+   (xen_have_vector_callback &&
+xen_feature(XENFEAT_hvm_safe_pvclock)))
xen_teardown_timer(cpu);
 
return 0;
@@ -1915,8 +1924,8 @@ static void __init xen_hvm_guest_init(void)
 
xen_panic_handler_init();
 
-   BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
-
+   if (xen_feature(XENFEAT_hvm_callback_vector))
+   xen_have_vector_callback = 1;
xen_hvm_smp_init();
WARN_ON(xen_cpuhp_setup());
xen_unplug_emulated_devices();
@@ -1954,7 +1963,7 @@ bool xen_hvm_need_lapic(void)
return false;
if (!xen_hvm_domain())
return false;
-   if (xen_feature(XENFEAT_hvm_pirqs))
+   if (xen_feature(XENFEAT_hvm_pirqs) && xen_have_vector_callback)
return false;
return true;
 }
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 311acad7dad2..137afbbd0590 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -765,6 +765,8 @@ static void __init xen_hvm_smp_prepare_cpus(unsigned int 

Re: [RFC PATCH] x86/boot: Add the secdata section to the setup header

2017-05-07 Thread Gary Lin
On Sat, May 06, 2017 at 01:12:10AM -0700, h...@zytor.com wrote:
> On May 5, 2017 2:26:39 AM PDT, Gary Lin  wrote:
> >This is a different approach to replace my previous implementation of
> >Security Version(*). Instead of using the fields in the PE/COFF header,
> >this commit adds secdata_offset in the setup header for the file offset
> >of secdata. Currently, the secdata section contains the signer's name,
> >the distro version, and the security version as defined in the wiki
> >page.
> >Since we don't rely on the PE/COFF header anymore, the size of signer
> >is
> >increased to 8 bytes to store more characters.
> >
> >Since this is just a tentative patch, I haven't started the other parts
> >(shim and mokutil) yet, and it's flexible to change. Any comment and
> >suggestion are welcome.
> >
> >(*) https://github.com/lcp/shim/wiki/Security-Version
> >
> >Cc: Ard Biesheuvel 
> >Cc: "H. Peter Anvin" 
> >Cc: Thomas Gleixner 
> >Cc: Ingo Molnar 
> >Cc: Joey Lee 
> >Signed-off-by: Gary Lin 
> >---
> > arch/x86/Kconfig  | 14 ++
> > arch/x86/boot/header.S| 15 ++-
> > arch/x86/boot/setup.ld|  1 +
> > arch/x86/boot/tools/build.c   | 11 +++
> > arch/x86/include/uapi/asm/bootparam.h |  1 +
> > 5 files changed, 41 insertions(+), 1 deletion(-)
> >
> >diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> >index 5bbdef151805..09f99cd1e699 100644
> >--- a/arch/x86/Kconfig
> >+++ b/arch/x86/Kconfig
> >@@ -1817,6 +1817,20 @@ config EFI_MIXED
> > 
> >If unsure, say N.
> > 
> >+config SEC_SIGNER
> >+string "The signer name"
> >+default "none"
> >+
> >+config SEC_DISTRO
> >+int "The distro version"
> >+default 0
> >+range 0 65535
> >+
> >+config SEC_VERSION
> >+int "The security version"
> >+default 0
> >+range 0 65535
> >+
> > config SECCOMP
> > def_bool y
> > prompt "Enable seccomp to safely compute untrusted bytecode"
> >diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
> >index 3dd5be33aaa7..f751790f1f44 100644
> >--- a/arch/x86/boot/header.S
> >+++ b/arch/x86/boot/header.S
> >@@ -301,7 +301,7 @@ _start:
> > # Part 2 of the header, from the old setup.S
> > 
> > .ascii  "HdrS"  # header signature
> >-.word   0x020d  # header version number (>= 0x0105)
> >+.word   0x020e  # header version number (>= 0x0105)
> > # or else old loadlin-1.5 will fail)
> > .globl realmode_swtch
> > realmode_swtch: .word   0, 0# default_switch, SETUPSEG
> >@@ -552,6 +552,7 @@ pref_address:.quad LOAD_PHYSICAL_ADDR
> ># preferred
> >load addr
> > 
> > init_size:  .long INIT_SIZE # kernel initialization size
> > handover_offset:.long 0 # Filled in by build.c
> >+secdata_offset: .long secdata_start
> > 
> ># End of setup header
> >#
> > 
> >@@ -629,3 +630,15 @@ die:
> > setup_corrupt:
> > .byte   7
> > .string "No setup signature found...\n"
> >+
> >+.section ".secdata", "a"
> >+secdata_start:
> >+sec_length:
> >+.long   secdata_end - secdata_start
> >+sec_signer:
> >+.quad   0   # Filled by build.c
> >+sec_distro:
> >+.word   CONFIG_SEC_DISTRO
> >+sec_version:
> >+.word   CONFIG_SEC_VERSION
> >+secdata_end:
> >diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld
> >index 96a6c7563538..43ddbaabaf7a 100644
> >--- a/arch/x86/boot/setup.ld
> >+++ b/arch/x86/boot/setup.ld
> >@@ -18,6 +18,7 @@ SECTIONS
> > .entrytext  : { *(.entrytext) }
> > .inittext   : { *(.inittext) }
> > .initdata   : { *(.initdata) }
> >+.secdata: { *(.secdata) }
> > __end_init = .;
> > 
> > .text   : { *(.text) }
> >diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c
> >index 0702d2531bc7..ec4b311d7b2d 100644
> >--- a/arch/x86/boot/tools/build.c
> >+++ b/arch/x86/boot/tools/build.c
> >@@ -287,6 +287,15 @@ static inline int reserve_pecoff_reloc_section(int
> >c)
> > }
> > #endif /* CONFIG_EFI_STUB */
> > 
> >+static void security_fields_update(void)
> >+{
> >+unsigned int security_offset;
> >+char *dest;
> >+
> >+security_offset = get_unaligned_le32([0x268]);
> >+dest = (char *)[security_offset + 4];
> >+strncpy(dest, CONFIG_SEC_SIGNER, 8);
> >+}
> > 
> > /*
> >* Parse zoffset.h and find the entry points. We could just #include
> >zoffset.h
> >@@ -401,6 +410,8 @@ int main(int argc, char ** argv)
> > 
> > efi_stub_entry_update();
> > 
> >+security_fields_update();
> >+
> > crc = partial_crc32(buf, i, crc);
> > if (fwrite(buf, 1, i, dest) != i)
> > die("Writing setup failed");
> >diff --git 

Re: [RFC PATCH] x86/boot: Add the secdata section to the setup header

2017-05-07 Thread Gary Lin
On Sat, May 06, 2017 at 01:12:10AM -0700, h...@zytor.com wrote:
> On May 5, 2017 2:26:39 AM PDT, Gary Lin  wrote:
> >This is a different approach to replace my previous implementation of
> >Security Version(*). Instead of using the fields in the PE/COFF header,
> >this commit adds secdata_offset in the setup header for the file offset
> >of secdata. Currently, the secdata section contains the signer's name,
> >the distro version, and the security version as defined in the wiki
> >page.
> >Since we don't rely on the PE/COFF header anymore, the size of signer
> >is
> >increased to 8 bytes to store more characters.
> >
> >Since this is just a tentative patch, I haven't started the other parts
> >(shim and mokutil) yet, and it's flexible to change. Any comment and
> >suggestion are welcome.
> >
> >(*) https://github.com/lcp/shim/wiki/Security-Version
> >
> >Cc: Ard Biesheuvel 
> >Cc: "H. Peter Anvin" 
> >Cc: Thomas Gleixner 
> >Cc: Ingo Molnar 
> >Cc: Joey Lee 
> >Signed-off-by: Gary Lin 
> >---
> > arch/x86/Kconfig  | 14 ++
> > arch/x86/boot/header.S| 15 ++-
> > arch/x86/boot/setup.ld|  1 +
> > arch/x86/boot/tools/build.c   | 11 +++
> > arch/x86/include/uapi/asm/bootparam.h |  1 +
> > 5 files changed, 41 insertions(+), 1 deletion(-)
> >
> >diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> >index 5bbdef151805..09f99cd1e699 100644
> >--- a/arch/x86/Kconfig
> >+++ b/arch/x86/Kconfig
> >@@ -1817,6 +1817,20 @@ config EFI_MIXED
> > 
> >If unsure, say N.
> > 
> >+config SEC_SIGNER
> >+string "The signer name"
> >+default "none"
> >+
> >+config SEC_DISTRO
> >+int "The distro version"
> >+default 0
> >+range 0 65535
> >+
> >+config SEC_VERSION
> >+int "The security version"
> >+default 0
> >+range 0 65535
> >+
> > config SECCOMP
> > def_bool y
> > prompt "Enable seccomp to safely compute untrusted bytecode"
> >diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
> >index 3dd5be33aaa7..f751790f1f44 100644
> >--- a/arch/x86/boot/header.S
> >+++ b/arch/x86/boot/header.S
> >@@ -301,7 +301,7 @@ _start:
> > # Part 2 of the header, from the old setup.S
> > 
> > .ascii  "HdrS"  # header signature
> >-.word   0x020d  # header version number (>= 0x0105)
> >+.word   0x020e  # header version number (>= 0x0105)
> > # or else old loadlin-1.5 will fail)
> > .globl realmode_swtch
> > realmode_swtch: .word   0, 0# default_switch, SETUPSEG
> >@@ -552,6 +552,7 @@ pref_address:.quad LOAD_PHYSICAL_ADDR
> ># preferred
> >load addr
> > 
> > init_size:  .long INIT_SIZE # kernel initialization size
> > handover_offset:.long 0 # Filled in by build.c
> >+secdata_offset: .long secdata_start
> > 
> ># End of setup header
> >#
> > 
> >@@ -629,3 +630,15 @@ die:
> > setup_corrupt:
> > .byte   7
> > .string "No setup signature found...\n"
> >+
> >+.section ".secdata", "a"
> >+secdata_start:
> >+sec_length:
> >+.long   secdata_end - secdata_start
> >+sec_signer:
> >+.quad   0   # Filled by build.c
> >+sec_distro:
> >+.word   CONFIG_SEC_DISTRO
> >+sec_version:
> >+.word   CONFIG_SEC_VERSION
> >+secdata_end:
> >diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld
> >index 96a6c7563538..43ddbaabaf7a 100644
> >--- a/arch/x86/boot/setup.ld
> >+++ b/arch/x86/boot/setup.ld
> >@@ -18,6 +18,7 @@ SECTIONS
> > .entrytext  : { *(.entrytext) }
> > .inittext   : { *(.inittext) }
> > .initdata   : { *(.initdata) }
> >+.secdata: { *(.secdata) }
> > __end_init = .;
> > 
> > .text   : { *(.text) }
> >diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c
> >index 0702d2531bc7..ec4b311d7b2d 100644
> >--- a/arch/x86/boot/tools/build.c
> >+++ b/arch/x86/boot/tools/build.c
> >@@ -287,6 +287,15 @@ static inline int reserve_pecoff_reloc_section(int
> >c)
> > }
> > #endif /* CONFIG_EFI_STUB */
> > 
> >+static void security_fields_update(void)
> >+{
> >+unsigned int security_offset;
> >+char *dest;
> >+
> >+security_offset = get_unaligned_le32([0x268]);
> >+dest = (char *)[security_offset + 4];
> >+strncpy(dest, CONFIG_SEC_SIGNER, 8);
> >+}
> > 
> > /*
> >* Parse zoffset.h and find the entry points. We could just #include
> >zoffset.h
> >@@ -401,6 +410,8 @@ int main(int argc, char ** argv)
> > 
> > efi_stub_entry_update();
> > 
> >+security_fields_update();
> >+
> > crc = partial_crc32(buf, i, crc);
> > if (fwrite(buf, 1, i, dest) != i)
> > die("Writing setup failed");
> >diff --git a/arch/x86/include/uapi/asm/bootparam.h
> >b/arch/x86/include/uapi/asm/bootparam.h
> >index 07244ea16765..713ae5d933a4 100644
> 

linux-next: Tree for May 8

2017-05-07 Thread Stephen Rothwell
Hi all,

Please do not add any v4.13 destined material in your linux-next
included branches until after v4.12-rc1 has been released.

Changes since 20170505:

The akpm-current tree gained a conflict against Linus' tree.

Non-merge commits (relative to Linus' tree): 2723
 2661 files changed, 96043 insertions(+), 47627 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 258 trees (counting Linus' and 37 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (13e098814037 docs: complete bumping minimal GNU Make 
version to 3.81)
Merging fixes/master (97da3854c526 Linux 4.11-rc3)
Merging kbuild-current/fixes (9be3213b14d4 gconfig: remove misleading 
parentheses around a condition)
Merging arc-current/for-curr (cf4100d1cddc Revert "ARCv2: Allow enabling PAE40 
w/o HIGHMEM")
Merging arm-current/fixes (6d8059493691 ARM: 8670/1: V7M: Do not corrupt vector 
table around v7m_invalidate_l1 call)
Merging m68k-current/for-linus (f6ab4d59a5fe nubus: Add MVC and VSC video card 
definitions)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (be5c5e843c4a powerpc/64: Fix HMI exception on LE 
with CONFIG_RELOCATABLE=y)
Merging sparc/master (8c64415cc1f5 sparc: Remove redundant tests in 
boot_flags_init().)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (2c041afc5af9 net: alx: handle pci_alloc_irq_vectors return 
correctly)
Merging ipsec/master (9b3eb54106cf xfrm: fix stack access out of bounds with 
CONFIG_XFRM_SUB_POLICY)
Merging netfilter/master (f411af682218 Merge branch 
'ibmvnic-Updated-reset-handler-andcode-fixes')
Merging ipvs/master (c8d6c6b496dc ipvs: SNAT packet replies only for NATed 
connections)
Merging wireless-drivers/master (d77facb88448 brcmfmac: use local iftype 
avoiding use-after-free of virtual interface)
Merging mac80211/master (842be75c77cb cfg80211: make RATE_INFO_BW_20 the 
default)
Merging sound-current/for-linus (a5c3b32a1146 Merge tag 'asoc-v4.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (b9c1153f7a9c PCI: hisi: Fix DT binding 
(hisi-pcie-almost-ecam))
Merging driver-core.current/driver-core-linus (af82455f7dbd Merge tag 
'char-misc-4.12-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc)
Merging tty.current/tty-linus (4f7d029b9bf0 Linux 4.11-rc7)
Merging usb.current/usb-linus (a71c9a1c779f Linux 4.11-rc5)
Merging usb-gadget-fixes/fixes (25cd9721c2b1 usb: gadget: f_hid: fix: Don't 
access hidg->req without spinlock held)
Merging usb-serial-fixes/usb-linus (c02ed2e75ef4 Linux 4.11-rc4)
Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move 
the lock initialization to core file)
Merging phy/fixes (1a09b6a7c10e phy: qcom-usb-hs: Add depends on EXTCON)
Merging staging.current/staging-linus (13e098814037 docs: complete bumping 
minimal GNU Make version to 3.81)
Merging char-misc.current/char-misc-linus (af82455f7dbd Merge tag 
'char-misc-4.12-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc)
Merging input-current/for-linus (4706aa075662 Input: xpad - add USB IDs for Mad 
Catz Brawlstick and Razer Sabertooth)
Merging crypto-current/master (929562b14478 crypto: stm32 - Fix OF module alias 
information)
Merging ide/master (96297aee8bce ide: 

linux-next: Tree for May 8

2017-05-07 Thread Stephen Rothwell
Hi all,

Please do not add any v4.13 destined material in your linux-next
included branches until after v4.12-rc1 has been released.

Changes since 20170505:

The akpm-current tree gained a conflict against Linus' tree.

Non-merge commits (relative to Linus' tree): 2723
 2661 files changed, 96043 insertions(+), 47627 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 258 trees (counting Linus' and 37 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (13e098814037 docs: complete bumping minimal GNU Make 
version to 3.81)
Merging fixes/master (97da3854c526 Linux 4.11-rc3)
Merging kbuild-current/fixes (9be3213b14d4 gconfig: remove misleading 
parentheses around a condition)
Merging arc-current/for-curr (cf4100d1cddc Revert "ARCv2: Allow enabling PAE40 
w/o HIGHMEM")
Merging arm-current/fixes (6d8059493691 ARM: 8670/1: V7M: Do not corrupt vector 
table around v7m_invalidate_l1 call)
Merging m68k-current/for-linus (f6ab4d59a5fe nubus: Add MVC and VSC video card 
definitions)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (be5c5e843c4a powerpc/64: Fix HMI exception on LE 
with CONFIG_RELOCATABLE=y)
Merging sparc/master (8c64415cc1f5 sparc: Remove redundant tests in 
boot_flags_init().)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (2c041afc5af9 net: alx: handle pci_alloc_irq_vectors return 
correctly)
Merging ipsec/master (9b3eb54106cf xfrm: fix stack access out of bounds with 
CONFIG_XFRM_SUB_POLICY)
Merging netfilter/master (f411af682218 Merge branch 
'ibmvnic-Updated-reset-handler-andcode-fixes')
Merging ipvs/master (c8d6c6b496dc ipvs: SNAT packet replies only for NATed 
connections)
Merging wireless-drivers/master (d77facb88448 brcmfmac: use local iftype 
avoiding use-after-free of virtual interface)
Merging mac80211/master (842be75c77cb cfg80211: make RATE_INFO_BW_20 the 
default)
Merging sound-current/for-linus (a5c3b32a1146 Merge tag 'asoc-v4.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (b9c1153f7a9c PCI: hisi: Fix DT binding 
(hisi-pcie-almost-ecam))
Merging driver-core.current/driver-core-linus (af82455f7dbd Merge tag 
'char-misc-4.12-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc)
Merging tty.current/tty-linus (4f7d029b9bf0 Linux 4.11-rc7)
Merging usb.current/usb-linus (a71c9a1c779f Linux 4.11-rc5)
Merging usb-gadget-fixes/fixes (25cd9721c2b1 usb: gadget: f_hid: fix: Don't 
access hidg->req without spinlock held)
Merging usb-serial-fixes/usb-linus (c02ed2e75ef4 Linux 4.11-rc4)
Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move 
the lock initialization to core file)
Merging phy/fixes (1a09b6a7c10e phy: qcom-usb-hs: Add depends on EXTCON)
Merging staging.current/staging-linus (13e098814037 docs: complete bumping 
minimal GNU Make version to 3.81)
Merging char-misc.current/char-misc-linus (af82455f7dbd Merge tag 
'char-misc-4.12-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc)
Merging input-current/for-linus (4706aa075662 Input: xpad - add USB IDs for Mad 
Catz Brawlstick and Razer Sabertooth)
Merging crypto-current/master (929562b14478 crypto: stm32 - Fix OF module alias 
information)
Merging ide/master (96297aee8bce ide: 

Re: 64fa03de33: BUG:Dentry_still_in_use

2017-05-07 Thread Serge E. Hallyn
>From 6a3fb632f67f8425c6e76c65dad8115f1550d2a0 Mon Sep 17 00:00:00 2001
From: Serge Hallyn 
Date: Sun, 7 May 2017 23:40:42 -0500
Subject: [PATCH 1/1] cap_inode_getsecurity: don't pin dentry (fold up)

This should fix the "Dentry_still_in_use" reported by the kernel
test robot.

Signed-off-by: Serge Hallyn 
---
 security/commoncap.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/security/commoncap.c b/security/commoncap.c
index a1a2935..c970b71 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -406,21 +406,21 @@ int cap_inode_getsecurity(struct inode *inode, const char 
*name, void **buffer,
 , size, GFP_NOFS);
 
if (ret < 0)
-   return ret;
+   goto out;
 
fs_ns = inode->i_sb->s_user_ns;
cap = (struct vfs_cap_data *) tmpbuf;
if (is_v2header(ret, cap->magic_etc)) {
/* If this is sizeof(vfs_cap_data) then we're ok with the
 * on-disk value, so return that.  */
-   if (alloc)
+   if (alloc) {
*buffer = tmpbuf;
-   else
-   kfree(tmpbuf);
-   return ret;
+   tmpbuf = NULL;
+   }
+   goto out;
} else if (!is_v3header(ret, cap->magic_etc)) {
-   kfree(tmpbuf);
-   return -EINVAL;
+   ret = -EINVAL;
+   goto out;
}
 
nscap = (struct vfs_ns_cap_data *) tmpbuf;
@@ -434,14 +434,14 @@ int cap_inode_getsecurity(struct inode *inode, const char 
*name, void **buffer,
if (alloc) {
*buffer = tmpbuf;
nscap->rootid = cpu_to_le32(mappedroot);
-   } else
-   kfree(tmpbuf);
-   return size;
+   tmpbuf = NULL;
+   }
+   goto out;
}
 
if (!rootid_owns_currentns(kroot)) {
-   kfree(tmpbuf);
-   return -EOPNOTSUPP;
+   ret = -EOPNOTSUPP;
+   goto out;
}
 
/* This comes from a parent namespace.  Return as a v2 capability */
@@ -459,6 +459,9 @@ int cap_inode_getsecurity(struct inode *inode, const char 
*name, void **buffer,
cap->magic_etc = cpu_to_le32(magic);
}
}
+
+out:
+   dput(dentry);
kfree(tmpbuf);
return size;
 }
-- 
2.7.4



Re: 64fa03de33: BUG:Dentry_still_in_use

2017-05-07 Thread Serge E. Hallyn
>From 6a3fb632f67f8425c6e76c65dad8115f1550d2a0 Mon Sep 17 00:00:00 2001
From: Serge Hallyn 
Date: Sun, 7 May 2017 23:40:42 -0500
Subject: [PATCH 1/1] cap_inode_getsecurity: don't pin dentry (fold up)

This should fix the "Dentry_still_in_use" reported by the kernel
test robot.

Signed-off-by: Serge Hallyn 
---
 security/commoncap.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/security/commoncap.c b/security/commoncap.c
index a1a2935..c970b71 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -406,21 +406,21 @@ int cap_inode_getsecurity(struct inode *inode, const char 
*name, void **buffer,
 , size, GFP_NOFS);
 
if (ret < 0)
-   return ret;
+   goto out;
 
fs_ns = inode->i_sb->s_user_ns;
cap = (struct vfs_cap_data *) tmpbuf;
if (is_v2header(ret, cap->magic_etc)) {
/* If this is sizeof(vfs_cap_data) then we're ok with the
 * on-disk value, so return that.  */
-   if (alloc)
+   if (alloc) {
*buffer = tmpbuf;
-   else
-   kfree(tmpbuf);
-   return ret;
+   tmpbuf = NULL;
+   }
+   goto out;
} else if (!is_v3header(ret, cap->magic_etc)) {
-   kfree(tmpbuf);
-   return -EINVAL;
+   ret = -EINVAL;
+   goto out;
}
 
nscap = (struct vfs_ns_cap_data *) tmpbuf;
@@ -434,14 +434,14 @@ int cap_inode_getsecurity(struct inode *inode, const char 
*name, void **buffer,
if (alloc) {
*buffer = tmpbuf;
nscap->rootid = cpu_to_le32(mappedroot);
-   } else
-   kfree(tmpbuf);
-   return size;
+   tmpbuf = NULL;
+   }
+   goto out;
}
 
if (!rootid_owns_currentns(kroot)) {
-   kfree(tmpbuf);
-   return -EOPNOTSUPP;
+   ret = -EOPNOTSUPP;
+   goto out;
}
 
/* This comes from a parent namespace.  Return as a v2 capability */
@@ -459,6 +459,9 @@ int cap_inode_getsecurity(struct inode *inode, const char 
*name, void **buffer,
cap->magic_etc = cpu_to_le32(magic);
}
}
+
+out:
+   dput(dentry);
kfree(tmpbuf);
return size;
 }
-- 
2.7.4



Re: [PATCH] staging: iio: meter: This patch fixes warnings found in ade7753.c

2017-05-07 Thread Greg KH
On Mon, May 08, 2017 at 12:16:17AM -0400, Harinath Nampally wrote:
> All below warnings are found and fixed by checkpatch.pl:
> CHECK: struct mutex definition without comment
> CHECK: Alignment should match open parenthesis
> WARNING: Symbolic permissions 'S_IWUSR | S_IRUGO' are not preferred.

You should only don "one thing" per patch, and no, "fix coding style
issues" is not "one thing" :)

Please break this up into multiple patches and send it as a patch
series.

thanks,

greg k-h


Re: [PATCH] staging: iio: meter: This patch fixes warnings found in ade7753.c

2017-05-07 Thread Greg KH
On Mon, May 08, 2017 at 12:16:17AM -0400, Harinath Nampally wrote:
> All below warnings are found and fixed by checkpatch.pl:
> CHECK: struct mutex definition without comment
> CHECK: Alignment should match open parenthesis
> WARNING: Symbolic permissions 'S_IWUSR | S_IRUGO' are not preferred.

You should only don "one thing" per patch, and no, "fix coding style
issues" is not "one thing" :)

Please break this up into multiple patches and send it as a patch
series.

thanks,

greg k-h


[PATCH V5 resend] PM / OPP: Use - instead of @ for DT entries

2017-05-07 Thread Viresh Kumar
Compiling the DT file with W=1, DTC warns like follows:

Warning (unit_address_vs_reg): Node /opp_table0/opp@10 has a
unit name, but no reg property

Fix this by replacing '@' with '-' as the OPP nodes will never have a
"reg" property.

Reported-by: Krzysztof Kozlowski 
Reported-by: Masahiro Yamada 
Suggested-by: Mark Rutland 
Signed-off-by: Viresh Kumar 
Acked-by: Rob Herring 
---
@Rafael: This one needs to go via the PM tree and so sending it again.

 Documentation/devicetree/bindings/opp/opp.txt | 38 +--
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/Documentation/devicetree/bindings/opp/opp.txt 
b/Documentation/devicetree/bindings/opp/opp.txt
index 63725498bd20..e36d261b9ba6 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -186,20 +186,20 @@ Example 1: Single cluster Dual-core ARM cortex A9, switch 
DVFS states together.
compatible = "operating-points-v2";
opp-shared;
 
-   opp@10 {
+   opp-10 {
opp-hz = /bits/ 64 <10>;
opp-microvolt = <975000 97 985000>;
opp-microamp = <7>;
clock-latency-ns = <30>;
opp-suspend;
};
-   opp@11 {
+   opp-11 {
opp-hz = /bits/ 64 <11>;
opp-microvolt = <100 98 101>;
opp-microamp = <8>;
clock-latency-ns = <31>;
};
-   opp@12 {
+   opp-12 {
opp-hz = /bits/ 64 <12>;
opp-microvolt = <1025000>;
clock-latency-ns = <29>;
@@ -265,20 +265,20 @@ independently.
 * independently.
 */
 
-   opp@10 {
+   opp-10 {
opp-hz = /bits/ 64 <10>;
opp-microvolt = <975000 97 985000>;
opp-microamp = <7>;
clock-latency-ns = <30>;
opp-suspend;
};
-   opp@11 {
+   opp-11 {
opp-hz = /bits/ 64 <11>;
opp-microvolt = <100 98 101>;
opp-microamp = <8>;
clock-latency-ns = <31>;
};
-   opp@12 {
+   opp-12 {
opp-hz = /bits/ 64 <12>;
opp-microvolt = <1025000>;
opp-microamp = <9;
@@ -341,20 +341,20 @@ DVFS state together.
compatible = "operating-points-v2";
opp-shared;
 
-   opp@10 {
+   opp-10 {
opp-hz = /bits/ 64 <10>;
opp-microvolt = <975000 97 985000>;
opp-microamp = <7>;
clock-latency-ns = <30>;
opp-suspend;
};
-   opp@11 {
+   opp-11 {
opp-hz = /bits/ 64 <11>;
opp-microvolt = <100 98 101>;
opp-microamp = <8>;
clock-latency-ns = <31>;
};
-   opp@12 {
+   opp-12 {
opp-hz = /bits/ 64 <12>;
opp-microvolt = <1025000>;
opp-microamp = <9>;
@@ -367,20 +367,20 @@ DVFS state together.
compatible = "operating-points-v2";
opp-shared;
 
-   opp@13 {
+   opp-13 {
opp-hz = /bits/ 64 <13>;
opp-microvolt = <105 1045000 1055000>;
opp-microamp = <95000>;
clock-latency-ns = <40>;
opp-suspend;
};
-   opp@14 {
+   opp-14 {
opp-hz = /bits/ 64 <14>;
opp-microvolt = <1075000>;
opp-microamp = <10>;
clock-latency-ns = <40>;
};
-   opp@15 {
+   opp-15 {
opp-hz = /bits/ 64 <15>;
opp-microvolt = <110 101 111>;
opp-microamp = <95000>;

[PATCH V5 resend] PM / OPP: Use - instead of @ for DT entries

2017-05-07 Thread Viresh Kumar
Compiling the DT file with W=1, DTC warns like follows:

Warning (unit_address_vs_reg): Node /opp_table0/opp@10 has a
unit name, but no reg property

Fix this by replacing '@' with '-' as the OPP nodes will never have a
"reg" property.

Reported-by: Krzysztof Kozlowski 
Reported-by: Masahiro Yamada 
Suggested-by: Mark Rutland 
Signed-off-by: Viresh Kumar 
Acked-by: Rob Herring 
---
@Rafael: This one needs to go via the PM tree and so sending it again.

 Documentation/devicetree/bindings/opp/opp.txt | 38 +--
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/Documentation/devicetree/bindings/opp/opp.txt 
b/Documentation/devicetree/bindings/opp/opp.txt
index 63725498bd20..e36d261b9ba6 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -186,20 +186,20 @@ Example 1: Single cluster Dual-core ARM cortex A9, switch 
DVFS states together.
compatible = "operating-points-v2";
opp-shared;
 
-   opp@10 {
+   opp-10 {
opp-hz = /bits/ 64 <10>;
opp-microvolt = <975000 97 985000>;
opp-microamp = <7>;
clock-latency-ns = <30>;
opp-suspend;
};
-   opp@11 {
+   opp-11 {
opp-hz = /bits/ 64 <11>;
opp-microvolt = <100 98 101>;
opp-microamp = <8>;
clock-latency-ns = <31>;
};
-   opp@12 {
+   opp-12 {
opp-hz = /bits/ 64 <12>;
opp-microvolt = <1025000>;
clock-latency-ns = <29>;
@@ -265,20 +265,20 @@ independently.
 * independently.
 */
 
-   opp@10 {
+   opp-10 {
opp-hz = /bits/ 64 <10>;
opp-microvolt = <975000 97 985000>;
opp-microamp = <7>;
clock-latency-ns = <30>;
opp-suspend;
};
-   opp@11 {
+   opp-11 {
opp-hz = /bits/ 64 <11>;
opp-microvolt = <100 98 101>;
opp-microamp = <8>;
clock-latency-ns = <31>;
};
-   opp@12 {
+   opp-12 {
opp-hz = /bits/ 64 <12>;
opp-microvolt = <1025000>;
opp-microamp = <9;
@@ -341,20 +341,20 @@ DVFS state together.
compatible = "operating-points-v2";
opp-shared;
 
-   opp@10 {
+   opp-10 {
opp-hz = /bits/ 64 <10>;
opp-microvolt = <975000 97 985000>;
opp-microamp = <7>;
clock-latency-ns = <30>;
opp-suspend;
};
-   opp@11 {
+   opp-11 {
opp-hz = /bits/ 64 <11>;
opp-microvolt = <100 98 101>;
opp-microamp = <8>;
clock-latency-ns = <31>;
};
-   opp@12 {
+   opp-12 {
opp-hz = /bits/ 64 <12>;
opp-microvolt = <1025000>;
opp-microamp = <9>;
@@ -367,20 +367,20 @@ DVFS state together.
compatible = "operating-points-v2";
opp-shared;
 
-   opp@13 {
+   opp-13 {
opp-hz = /bits/ 64 <13>;
opp-microvolt = <105 1045000 1055000>;
opp-microamp = <95000>;
clock-latency-ns = <40>;
opp-suspend;
};
-   opp@14 {
+   opp-14 {
opp-hz = /bits/ 64 <14>;
opp-microvolt = <1075000>;
opp-microamp = <10>;
clock-latency-ns = <40>;
};
-   opp@15 {
+   opp-15 {
opp-hz = /bits/ 64 <15>;
opp-microvolt = <110 101 111>;
opp-microamp = <95000>;
@@ -409,7 +409,7 @@ Example 4: Handling multiple regulators
compatible = "operating-points-v2";
 

Re: [PATCH v3 1/2] thermal: core: Allow orderly_poweroff to be called only once

2017-05-07 Thread Keerthy


On Monday 08 May 2017 02:32 AM, Pavel Machek wrote:
>> thermal_zone_device_check --> thermal_zone_device_update -->
>> handle_thermal_trip --> handle_critical_trips --> orderly_poweroff
>>
>> The above sequence happens every 250/500 mS based on the configuration.
>> The orderly_poweroff function is getting called every 250/500 mS.
>> With a full fledged file system it takes at least 5-10 Seconds to
>> power off gracefully.
>>
>> In that period due to the thermal_zone_device_check triggering
>> periodically the thermal work queues bombard with
>> orderly_poweroff calls multiple times eventually leading to
>> failures in gracefully powering off the system.
>>
>> Make sure that orderly_poweroff is called only once.
> 
> Pretty please, can we do it in the core code, not in thermal/? There
> are other reasons kernel may want to shut the system down, like for
> example critical battery, and if both thermal _and_ bad battery
> happen, we want just one shutdown...

Pavel,

Thermal fix is still valid. As it is having multiple calls.
I can work on fixing the core code as well.

BTW the latest of this series is v6:

https://patchwork.kernel.org/patch/9684929/


- Keerthy

> 
>> Reported-by: Keerthy 
>> Signed-off-by: Keerthy 
> 
> And here probably we need your full name.
> 
> Thanks,
>   Pavel
> 


Re: [PATCH v3 1/2] thermal: core: Allow orderly_poweroff to be called only once

2017-05-07 Thread Keerthy


On Monday 08 May 2017 02:32 AM, Pavel Machek wrote:
>> thermal_zone_device_check --> thermal_zone_device_update -->
>> handle_thermal_trip --> handle_critical_trips --> orderly_poweroff
>>
>> The above sequence happens every 250/500 mS based on the configuration.
>> The orderly_poweroff function is getting called every 250/500 mS.
>> With a full fledged file system it takes at least 5-10 Seconds to
>> power off gracefully.
>>
>> In that period due to the thermal_zone_device_check triggering
>> periodically the thermal work queues bombard with
>> orderly_poweroff calls multiple times eventually leading to
>> failures in gracefully powering off the system.
>>
>> Make sure that orderly_poweroff is called only once.
> 
> Pretty please, can we do it in the core code, not in thermal/? There
> are other reasons kernel may want to shut the system down, like for
> example critical battery, and if both thermal _and_ bad battery
> happen, we want just one shutdown...

Pavel,

Thermal fix is still valid. As it is having multiple calls.
I can work on fixing the core code as well.

BTW the latest of this series is v6:

https://patchwork.kernel.org/patch/9684929/


- Keerthy

> 
>> Reported-by: Keerthy 
>> Signed-off-by: Keerthy 
> 
> And here probably we need your full name.
> 
> Thanks,
>   Pavel
> 


Re: [PATCH 6/6] dt-bindings: cpufreq: enhance Mediatek cpufreq dt-binding document

2017-05-07 Thread Viresh Kumar
On 05-05-17, 23:26, sean.w...@mediatek.com wrote:
> From: Sean Wang 
> 
> Update binding document to reflect the lastest driver logic and

The bindings don't follow the drivers but its the other way around. Don't
mention anything about the drivers in the bindings patches.

> add more examples guiding people how to use Mediatek cpufreq driver.
> 
> Signed-off-by: Sean Wang 
> ---
>  .../bindings/cpufreq/cpufreq-mediatek.txt  | 170 
> -
>  1 file changed, 167 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt 
> b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> index 52b457c..0703927 100644
> --- a/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> @@ -1,4 +1,5 @@
> -Device Tree Clock bindins for CPU DVFS of Mediatek MT8173 SoC
> +Binding for Mediatek's CPUFreq driver
> +=
>  
>  Required properties:
>  - clocks: A list of phandle + clock-specifier pairs for the clocks listed in 
> clock names.
> @@ -9,6 +10,8 @@ Required properties:
> transition and not stable yet.
>   Please refer to 
> Documentation/devicetree/bindings/clk/clock-bindings.txt for
>   generic clock consumer properties.
> +- operating-points-v2: Pleaser refer to 
> Documentation/devicetree/bindings/opp/opp.txt
> + for detail.
>  - proc-supply: Regulator for Vproc of CPU cluster.
>  
>  Optional properties:
> @@ -17,9 +20,166 @@ Optional properties:
>  Vsram to fit SoC specific needs. When absent, the voltage scaling
>  flow is handled by hardware, hence no software "voltage 
> tracking" is
>  needed.
> +- #cooling-cells:
> +- cooling-min-level:
> +- cooling-max-level:
> + Please refer to Documentation/devicetree/bindings/thermal/thermal.txt
> + for detail.
> +
> +Example 1 (MT7623 SoC):
> +
> + cpu_opp_table: opp_table {
> + compatible = "operating-points-v2";
> + opp-shared;
> +
> + opp@59800 {

s/opp@/opp-/

-- 
viresh


Re: [PATCH 6/6] dt-bindings: cpufreq: enhance Mediatek cpufreq dt-binding document

2017-05-07 Thread Viresh Kumar
On 05-05-17, 23:26, sean.w...@mediatek.com wrote:
> From: Sean Wang 
> 
> Update binding document to reflect the lastest driver logic and

The bindings don't follow the drivers but its the other way around. Don't
mention anything about the drivers in the bindings patches.

> add more examples guiding people how to use Mediatek cpufreq driver.
> 
> Signed-off-by: Sean Wang 
> ---
>  .../bindings/cpufreq/cpufreq-mediatek.txt  | 170 
> -
>  1 file changed, 167 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt 
> b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> index 52b457c..0703927 100644
> --- a/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> @@ -1,4 +1,5 @@
> -Device Tree Clock bindins for CPU DVFS of Mediatek MT8173 SoC
> +Binding for Mediatek's CPUFreq driver
> +=
>  
>  Required properties:
>  - clocks: A list of phandle + clock-specifier pairs for the clocks listed in 
> clock names.
> @@ -9,6 +10,8 @@ Required properties:
> transition and not stable yet.
>   Please refer to 
> Documentation/devicetree/bindings/clk/clock-bindings.txt for
>   generic clock consumer properties.
> +- operating-points-v2: Pleaser refer to 
> Documentation/devicetree/bindings/opp/opp.txt
> + for detail.
>  - proc-supply: Regulator for Vproc of CPU cluster.
>  
>  Optional properties:
> @@ -17,9 +20,166 @@ Optional properties:
>  Vsram to fit SoC specific needs. When absent, the voltage scaling
>  flow is handled by hardware, hence no software "voltage 
> tracking" is
>  needed.
> +- #cooling-cells:
> +- cooling-min-level:
> +- cooling-max-level:
> + Please refer to Documentation/devicetree/bindings/thermal/thermal.txt
> + for detail.
> +
> +Example 1 (MT7623 SoC):
> +
> + cpu_opp_table: opp_table {
> + compatible = "operating-points-v2";
> + opp-shared;
> +
> + opp@59800 {

s/opp@/opp-/

-- 
viresh


Re: [PATCH 5/6] dt-bindings: cpufreq: move Mediatek cpufreq dt-bindings document to proper place

2017-05-07 Thread Viresh Kumar
On 05-05-17, 23:26, sean.w...@mediatek.com wrote:
> From: Sean Wang 
> 
> The old place is Documentation/devicetree/bindings/clock/ that would
> let people hard to find how to use Mediatek cpufreq driver, so moving
> it to the new place as other cpufreq dirvers are done would be better.
> 
> Signed-off-by: Sean Wang 
> ---
>  .../devicetree/bindings/clock/mt8173-cpu-dvfs.txt  | 83 
> --
>  .../bindings/cpufreq/cpufreq-mediatek.txt  | 83 
> ++
>  2 files changed, 83 insertions(+), 83 deletions(-)
>  delete mode 100644 
> Documentation/devicetree/bindings/clock/mt8173-cpu-dvfs.txt
>  create mode 100644 
> Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt

Please generate patches with:

git format-patch -C -M --thread=shallow

You would be required to resend this patch at least to let us see what has
changed.

-- 
viresh


Re: [PATCH 5/6] dt-bindings: cpufreq: move Mediatek cpufreq dt-bindings document to proper place

2017-05-07 Thread Viresh Kumar
On 05-05-17, 23:26, sean.w...@mediatek.com wrote:
> From: Sean Wang 
> 
> The old place is Documentation/devicetree/bindings/clock/ that would
> let people hard to find how to use Mediatek cpufreq driver, so moving
> it to the new place as other cpufreq dirvers are done would be better.
> 
> Signed-off-by: Sean Wang 
> ---
>  .../devicetree/bindings/clock/mt8173-cpu-dvfs.txt  | 83 
> --
>  .../bindings/cpufreq/cpufreq-mediatek.txt  | 83 
> ++
>  2 files changed, 83 insertions(+), 83 deletions(-)
>  delete mode 100644 
> Documentation/devicetree/bindings/clock/mt8173-cpu-dvfs.txt
>  create mode 100644 
> Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt

Please generate patches with:

git format-patch -C -M --thread=shallow

You would be required to resend this patch at least to let us see what has
changed.

-- 
viresh


Re: [PATCH V6 1/9] PM / OPP: Introduce "power-domain-opp" property

2017-05-07 Thread Viresh Kumar
On 06-05-17, 11:58, Kevin Hilman wrote:
> Rob Herring  writes:
> 
> > On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
> >> Power-domains need to express their active states in DT and the devices
> >> within the power-domain need to express their dependency on those active
> >> states. The power-domains can use the OPP tables without any
> >> modifications to the bindings.
> >> 
> >> Add a new property "power-domain-opp", which will contain phandle to the
> >> OPP node of the parent power domain. This is required for devices which
> >> have dependency on the configured active state of the power domain for
> >> their working.
> >> 
> >> For some platforms the actual frequency and voltages of the power
> >> domains are managed by the firmware and are so hidden from the high
> >> level operating system. The "opp-hz" property is relaxed a bit to
> >> contain indexes instead of actual frequency values to support such
> >> platforms.
> >> 
> >> Signed-off-by: Viresh Kumar 
> >> ---
> >>  Documentation/devicetree/bindings/opp/opp.txt | 74 
> >> ++-
> >>  1 file changed, 73 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/Documentation/devicetree/bindings/opp/opp.txt 
> >> b/Documentation/devicetree/bindings/opp/opp.txt
> >> index 63725498bd20..6e30cae2a936 100644
> >> --- a/Documentation/devicetree/bindings/opp/opp.txt
> >> +++ b/Documentation/devicetree/bindings/opp/opp.txt
> >> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations 
> >> along with other related
> >>  properties.
> >>  
> >>  Required properties:
> >> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
> >> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In 
> >> some
> >> +  cases the exact frequency in Hz may be hidden from the OS by the 
> >> firmware and
> >> +  this field may contain values that represent the frequency in a firmware
> >> +  dependent way, for example an index of an array in the firmware.
> >
> > Not really sure OPP binding makes sense here.
> 
> I think OPP makes perfect sense here, because microcontroller firmware
> is managaging OPPs in hardware.  We just may not know the exact voltage
> and/or frequency (and the firmware/hardware may even be doing AVS for
> micro-adjustments.)

Yes, AVS is being done for the Qcom SoC as well.

> > What about all the other properties. We expose voltage, but not freq?
> 
> I had the same question.  Seems the same comment about an abstract
> "index" is needed for voltage also.

Why should we do that? Here are the cases that I had in mind while writing this:

- DT only contains the performance-index and nothing else (i.e. voltages aren't
  exposed).

  We wouldn't be required to fill the microvolt property as it is optional.

- DT contains both performance-index and voltages.

  The microvolts property will contain the actual voltages and opp-hz will
  contain the index.

I don't see why would we like to put some index value in the microvolts
property. We are setting the index value in the opp-hz property to avoid adding
extra fields and making sure opp-hz is still the unique property for the nodes.

> >>  
> >>  Optional properties:
> >>  - opp-microvolt: voltage in micro Volts.
> >> @@ -154,6 +157,13 @@ properties.
> >>  
> >>  - status: Marks the node enabled/disabled.
> >>  
> >> +- power-domain-opp: Phandle to the OPP node of the parent power-domain. 
> >> The
> >> +  parent power-domain should be configured to the OPP whose node is 
> >> pointed by
> >> +  the phandle, in order to configure the device for the OPP node that 
> >> contains
> >> +  this property. The order in which the device and power domain should be
> >> +  configured is implementation defined. The OPP table of a device can set 
> >> this
> >> +  property only if the device node contains "power-domains" property.
> >> +
> 
> I do understand the need to map a device OPP to a parent power-domain
> OPP, but I really don't like another phandle.
> 
> First, just because a device OPP changes does not mean that a
> power-domain OPP has to change.  What really needs to be specified is a
> minimum requirement, not an exact OPP.  IOW, if a device changes OPP,
> the power-domain OPP has to be *at least* an OPP that can guarantee that
> level of performance, but could also be a more performant OPP, right?

Right and that's how the code is interpreting it right now. Yes, the description
above should have been more clear on that though.

> Also, the parent power-domain driver will have a list of all its
> devices, and be able to get OPPs from those devices.
> 
> IMO, we should do the first (few) implementations of this feature from
> the power-domain driver itself, and not try to figure out how to define
> this for everyone in DT until we have a better handle on it (pun
> intended) ;)

Hmm, I am not sure how things are going to work in that case. The opp-hz value
read from the phandle is passed 

[PATCH] staging: iio: meter: This patch fixes warnings found in ade7753.c

2017-05-07 Thread Harinath Nampally
All below warnings are found and fixed by checkpatch.pl:
CHECK: struct mutex definition without comment
CHECK: Alignment should match open parenthesis
WARNING: Symbolic permissions 'S_IWUSR | S_IRUGO' are not preferred.

Below errors are false positives:
ade7753.c:382: ERROR: Use 4 digit octal (0777) not decimal permissions
ade7753.c:386: ERROR: Use 4 digit octal (0777) not decimal permissions

Signed-off-by: Harinath Nampally 
---
 drivers/staging/iio/meter/ade7753.c | 104 ++--
 1 file changed, 52 insertions(+), 52 deletions(-)

diff --git a/drivers/staging/iio/meter/ade7753.c 
b/drivers/staging/iio/meter/ade7753.c
index b71fbd3..2534bd0 100644
--- a/drivers/staging/iio/meter/ade7753.c
+++ b/drivers/staging/iio/meter/ade7753.c
@@ -78,12 +78,13 @@
 /**
  * struct ade7753_state - device instance specific data
  * @us: actual spi_device
+ * @buf_lock:   mutex to protect tx and rx
  * @tx: transmit buffer
  * @rx: receive buffer
- * @buf_lock:   mutex to protect tx and rx
  **/
 struct ade7753_state {
struct spi_device   *us;
+ /* mutex to protect tx and rx */
struct mutexbuf_lock;
u8  tx[ADE7753_MAX_TX] cacheline_aligned;
u8  rx[ADE7753_MAX_RX];
@@ -107,9 +108,8 @@ static int ade7753_spi_write_reg_8(struct device *dev,
return ret;
 }
 
-static int ade7753_spi_write_reg_16(struct device *dev,
-   u8 reg_address,
-   u16 value)
+static int ade7753_spi_write_reg_16(struct device *dev, u8 reg_address,
+   u16 value)
 {
int ret;
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
@@ -126,8 +126,8 @@ static int ade7753_spi_write_reg_16(struct device *dev,
 }
 
 static int ade7753_spi_read_reg_8(struct device *dev,
-   u8 reg_address,
-   u8 *val)
+ u8 reg_address,
+ u8 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7753_state *st = iio_priv(indio_dev);
@@ -136,7 +136,7 @@ static int ade7753_spi_read_reg_8(struct device *dev,
ret = spi_w8r8(st->us, ADE7753_READ_REG(reg_address));
if (ret < 0) {
dev_err(>us->dev, "problem when reading 8 bit register 
0x%02X",
-   reg_address);
+   reg_address);
return ret;
}
*val = ret;
@@ -145,8 +145,8 @@ static int ade7753_spi_read_reg_8(struct device *dev,
 }
 
 static int ade7753_spi_read_reg_16(struct device *dev,
-   u8 reg_address,
-   u16 *val)
+  u8 reg_address,
+  u16 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7753_state *st = iio_priv(indio_dev);
@@ -165,8 +165,8 @@ static int ade7753_spi_read_reg_16(struct device *dev,
 }
 
 static int ade7753_spi_read_reg_24(struct device *dev,
-   u8 reg_address,
-   u32 *val)
+  u8 reg_address,
+  u32 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7753_state *st = iio_priv(indio_dev);
@@ -189,7 +189,7 @@ static int ade7753_spi_read_reg_24(struct device *dev,
ret = spi_sync_transfer(st->us, xfers, ARRAY_SIZE(xfers));
if (ret) {
dev_err(>us->dev, "problem when reading 24 bit register 
0x%02X",
-   reg_address);
+   reg_address);
goto error_ret;
}
*val = (st->rx[0] << 16) | (st->rx[1] << 8) | st->rx[2];
@@ -200,8 +200,8 @@ static int ade7753_spi_read_reg_24(struct device *dev,
 }
 
 static ssize_t ade7753_read_8bit(struct device *dev,
-   struct device_attribute *attr,
-   char *buf)
+struct device_attribute *attr,
+char *buf)
 {
int ret;
u8 val;
@@ -215,8 +215,8 @@ static ssize_t ade7753_read_8bit(struct device *dev,
 }
 
 static ssize_t ade7753_read_16bit(struct device *dev,
-   struct device_attribute *attr,
-   char *buf)
+ struct device_attribute *attr,
+ char *buf)
 {
int ret;
u16 val;
@@ -230,8 +230,8 @@ static ssize_t ade7753_read_16bit(struct device *dev,
 }
 
 static ssize_t ade7753_read_24bit(struct device *dev,
-   struct device_attribute *attr,
-   char *buf)
+ struct device_attribute *attr,
+ char *buf)
 {
int ret;
u32 val;
@@ -245,9 +245,9 @@ static ssize_t ade7753_read_24bit(struct device *dev,
 }
 
 static ssize_t ade7753_write_8bit(struct device *dev,
-   struct device_attribute *attr,
-

Re: [PATCH V6 1/9] PM / OPP: Introduce "power-domain-opp" property

2017-05-07 Thread Viresh Kumar
On 06-05-17, 11:58, Kevin Hilman wrote:
> Rob Herring  writes:
> 
> > On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
> >> Power-domains need to express their active states in DT and the devices
> >> within the power-domain need to express their dependency on those active
> >> states. The power-domains can use the OPP tables without any
> >> modifications to the bindings.
> >> 
> >> Add a new property "power-domain-opp", which will contain phandle to the
> >> OPP node of the parent power domain. This is required for devices which
> >> have dependency on the configured active state of the power domain for
> >> their working.
> >> 
> >> For some platforms the actual frequency and voltages of the power
> >> domains are managed by the firmware and are so hidden from the high
> >> level operating system. The "opp-hz" property is relaxed a bit to
> >> contain indexes instead of actual frequency values to support such
> >> platforms.
> >> 
> >> Signed-off-by: Viresh Kumar 
> >> ---
> >>  Documentation/devicetree/bindings/opp/opp.txt | 74 
> >> ++-
> >>  1 file changed, 73 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/Documentation/devicetree/bindings/opp/opp.txt 
> >> b/Documentation/devicetree/bindings/opp/opp.txt
> >> index 63725498bd20..6e30cae2a936 100644
> >> --- a/Documentation/devicetree/bindings/opp/opp.txt
> >> +++ b/Documentation/devicetree/bindings/opp/opp.txt
> >> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations 
> >> along with other related
> >>  properties.
> >>  
> >>  Required properties:
> >> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
> >> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In 
> >> some
> >> +  cases the exact frequency in Hz may be hidden from the OS by the 
> >> firmware and
> >> +  this field may contain values that represent the frequency in a firmware
> >> +  dependent way, for example an index of an array in the firmware.
> >
> > Not really sure OPP binding makes sense here.
> 
> I think OPP makes perfect sense here, because microcontroller firmware
> is managaging OPPs in hardware.  We just may not know the exact voltage
> and/or frequency (and the firmware/hardware may even be doing AVS for
> micro-adjustments.)

Yes, AVS is being done for the Qcom SoC as well.

> > What about all the other properties. We expose voltage, but not freq?
> 
> I had the same question.  Seems the same comment about an abstract
> "index" is needed for voltage also.

Why should we do that? Here are the cases that I had in mind while writing this:

- DT only contains the performance-index and nothing else (i.e. voltages aren't
  exposed).

  We wouldn't be required to fill the microvolt property as it is optional.

- DT contains both performance-index and voltages.

  The microvolts property will contain the actual voltages and opp-hz will
  contain the index.

I don't see why would we like to put some index value in the microvolts
property. We are setting the index value in the opp-hz property to avoid adding
extra fields and making sure opp-hz is still the unique property for the nodes.

> >>  
> >>  Optional properties:
> >>  - opp-microvolt: voltage in micro Volts.
> >> @@ -154,6 +157,13 @@ properties.
> >>  
> >>  - status: Marks the node enabled/disabled.
> >>  
> >> +- power-domain-opp: Phandle to the OPP node of the parent power-domain. 
> >> The
> >> +  parent power-domain should be configured to the OPP whose node is 
> >> pointed by
> >> +  the phandle, in order to configure the device for the OPP node that 
> >> contains
> >> +  this property. The order in which the device and power domain should be
> >> +  configured is implementation defined. The OPP table of a device can set 
> >> this
> >> +  property only if the device node contains "power-domains" property.
> >> +
> 
> I do understand the need to map a device OPP to a parent power-domain
> OPP, but I really don't like another phandle.
> 
> First, just because a device OPP changes does not mean that a
> power-domain OPP has to change.  What really needs to be specified is a
> minimum requirement, not an exact OPP.  IOW, if a device changes OPP,
> the power-domain OPP has to be *at least* an OPP that can guarantee that
> level of performance, but could also be a more performant OPP, right?

Right and that's how the code is interpreting it right now. Yes, the description
above should have been more clear on that though.

> Also, the parent power-domain driver will have a list of all its
> devices, and be able to get OPPs from those devices.
> 
> IMO, we should do the first (few) implementations of this feature from
> the power-domain driver itself, and not try to figure out how to define
> this for everyone in DT until we have a better handle on it (pun
> intended) ;)

Hmm, I am not sure how things are going to work in that case. The opp-hz value
read from the phandle is passed to the QoS framework in this series, 

[PATCH] staging: iio: meter: This patch fixes warnings found in ade7753.c

2017-05-07 Thread Harinath Nampally
All below warnings are found and fixed by checkpatch.pl:
CHECK: struct mutex definition without comment
CHECK: Alignment should match open parenthesis
WARNING: Symbolic permissions 'S_IWUSR | S_IRUGO' are not preferred.

Below errors are false positives:
ade7753.c:382: ERROR: Use 4 digit octal (0777) not decimal permissions
ade7753.c:386: ERROR: Use 4 digit octal (0777) not decimal permissions

Signed-off-by: Harinath Nampally 
---
 drivers/staging/iio/meter/ade7753.c | 104 ++--
 1 file changed, 52 insertions(+), 52 deletions(-)

diff --git a/drivers/staging/iio/meter/ade7753.c 
b/drivers/staging/iio/meter/ade7753.c
index b71fbd3..2534bd0 100644
--- a/drivers/staging/iio/meter/ade7753.c
+++ b/drivers/staging/iio/meter/ade7753.c
@@ -78,12 +78,13 @@
 /**
  * struct ade7753_state - device instance specific data
  * @us: actual spi_device
+ * @buf_lock:   mutex to protect tx and rx
  * @tx: transmit buffer
  * @rx: receive buffer
- * @buf_lock:   mutex to protect tx and rx
  **/
 struct ade7753_state {
struct spi_device   *us;
+ /* mutex to protect tx and rx */
struct mutexbuf_lock;
u8  tx[ADE7753_MAX_TX] cacheline_aligned;
u8  rx[ADE7753_MAX_RX];
@@ -107,9 +108,8 @@ static int ade7753_spi_write_reg_8(struct device *dev,
return ret;
 }
 
-static int ade7753_spi_write_reg_16(struct device *dev,
-   u8 reg_address,
-   u16 value)
+static int ade7753_spi_write_reg_16(struct device *dev, u8 reg_address,
+   u16 value)
 {
int ret;
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
@@ -126,8 +126,8 @@ static int ade7753_spi_write_reg_16(struct device *dev,
 }
 
 static int ade7753_spi_read_reg_8(struct device *dev,
-   u8 reg_address,
-   u8 *val)
+ u8 reg_address,
+ u8 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7753_state *st = iio_priv(indio_dev);
@@ -136,7 +136,7 @@ static int ade7753_spi_read_reg_8(struct device *dev,
ret = spi_w8r8(st->us, ADE7753_READ_REG(reg_address));
if (ret < 0) {
dev_err(>us->dev, "problem when reading 8 bit register 
0x%02X",
-   reg_address);
+   reg_address);
return ret;
}
*val = ret;
@@ -145,8 +145,8 @@ static int ade7753_spi_read_reg_8(struct device *dev,
 }
 
 static int ade7753_spi_read_reg_16(struct device *dev,
-   u8 reg_address,
-   u16 *val)
+  u8 reg_address,
+  u16 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7753_state *st = iio_priv(indio_dev);
@@ -165,8 +165,8 @@ static int ade7753_spi_read_reg_16(struct device *dev,
 }
 
 static int ade7753_spi_read_reg_24(struct device *dev,
-   u8 reg_address,
-   u32 *val)
+  u8 reg_address,
+  u32 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7753_state *st = iio_priv(indio_dev);
@@ -189,7 +189,7 @@ static int ade7753_spi_read_reg_24(struct device *dev,
ret = spi_sync_transfer(st->us, xfers, ARRAY_SIZE(xfers));
if (ret) {
dev_err(>us->dev, "problem when reading 24 bit register 
0x%02X",
-   reg_address);
+   reg_address);
goto error_ret;
}
*val = (st->rx[0] << 16) | (st->rx[1] << 8) | st->rx[2];
@@ -200,8 +200,8 @@ static int ade7753_spi_read_reg_24(struct device *dev,
 }
 
 static ssize_t ade7753_read_8bit(struct device *dev,
-   struct device_attribute *attr,
-   char *buf)
+struct device_attribute *attr,
+char *buf)
 {
int ret;
u8 val;
@@ -215,8 +215,8 @@ static ssize_t ade7753_read_8bit(struct device *dev,
 }
 
 static ssize_t ade7753_read_16bit(struct device *dev,
-   struct device_attribute *attr,
-   char *buf)
+ struct device_attribute *attr,
+ char *buf)
 {
int ret;
u16 val;
@@ -230,8 +230,8 @@ static ssize_t ade7753_read_16bit(struct device *dev,
 }
 
 static ssize_t ade7753_read_24bit(struct device *dev,
-   struct device_attribute *attr,
-   char *buf)
+ struct device_attribute *attr,
+ char *buf)
 {
int ret;
u32 val;
@@ -245,9 +245,9 @@ static ssize_t ade7753_read_24bit(struct device *dev,
 }
 
 static ssize_t ade7753_write_8bit(struct device *dev,
-   struct device_attribute *attr,
-   const char 

Re: [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely

2017-05-07 Thread Viresh Kumar
On 08-05-17, 11:49, Wanpeng Li wrote:
> Hi Rafael,
> 2017-03-22 7:08 GMT+08:00 Rafael J. Wysocki :
> > From: Rafael J. Wysocki 
> >
> > The way the schedutil governor uses the PELT metric causes it to
> > underestimate the CPU utilization in some cases.
> >
> > That can be easily demonstrated by running kernel compilation on
> > a Sandy Bridge Intel processor, running turbostat in parallel with
> > it and looking at the values written to the MSR_IA32_PERF_CTL
> > register.  Namely, the expected result would be that when all CPUs
> > were 100% busy, all of them would be requested to run in the maximum
> > P-state, but observation shows that this clearly isn't the case.
> > The CPUs run in the maximum P-state for a while and then are
> > requested to run slower and go back to the maximum P-state after
> > a while again.  That causes the actual frequency of the processor to
> > visibly oscillate below the sustainable maximum in a jittery fashion
> > which clearly is not desirable.
> >
> > That has been attributed to CPU utilization metric updates on task
> > migration that cause the total utilization value for the CPU to be
> > reduced by the utilization of the migrated task.  If that happens,
> > the schedutil governor may see a CPU utilization reduction and will
> > attempt to reduce the CPU frequency accordingly right away.  That
> > may be premature, though, for example if the system is generally
> > busy and there are other runnable tasks waiting to be run on that
> > CPU already.
> >
> > This is unlikely to be an issue on systems where cpufreq policies are
> > shared between multiple CPUs, because in those cases the policy
> > utilization is computed as the maximum of the CPU utilization values
> 
> Sorry for one question maybe not associated with this patch. If the
> cpufreq policy is shared between multiple CPUs, the function
> intel_cpufreq_target()  just updates IA32_PERF_CTL MSR of the cpu
> which is managing this policy, I wonder whether other cpus which are
> affected should also update their per-logical cpu's IA32_PERF_CTL MSR?

The CPUs share the policy when they share their freq/voltage rails and so
changing perf state of one CPU should result in that changing for all the CPUs
in that policy. Otherwise, they can't be considered to be part of the same
policy.

That's why this code is changing it only for policy->cpu alone.

-- 
viresh


Re: [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely

2017-05-07 Thread Viresh Kumar
On 08-05-17, 11:49, Wanpeng Li wrote:
> Hi Rafael,
> 2017-03-22 7:08 GMT+08:00 Rafael J. Wysocki :
> > From: Rafael J. Wysocki 
> >
> > The way the schedutil governor uses the PELT metric causes it to
> > underestimate the CPU utilization in some cases.
> >
> > That can be easily demonstrated by running kernel compilation on
> > a Sandy Bridge Intel processor, running turbostat in parallel with
> > it and looking at the values written to the MSR_IA32_PERF_CTL
> > register.  Namely, the expected result would be that when all CPUs
> > were 100% busy, all of them would be requested to run in the maximum
> > P-state, but observation shows that this clearly isn't the case.
> > The CPUs run in the maximum P-state for a while and then are
> > requested to run slower and go back to the maximum P-state after
> > a while again.  That causes the actual frequency of the processor to
> > visibly oscillate below the sustainable maximum in a jittery fashion
> > which clearly is not desirable.
> >
> > That has been attributed to CPU utilization metric updates on task
> > migration that cause the total utilization value for the CPU to be
> > reduced by the utilization of the migrated task.  If that happens,
> > the schedutil governor may see a CPU utilization reduction and will
> > attempt to reduce the CPU frequency accordingly right away.  That
> > may be premature, though, for example if the system is generally
> > busy and there are other runnable tasks waiting to be run on that
> > CPU already.
> >
> > This is unlikely to be an issue on systems where cpufreq policies are
> > shared between multiple CPUs, because in those cases the policy
> > utilization is computed as the maximum of the CPU utilization values
> 
> Sorry for one question maybe not associated with this patch. If the
> cpufreq policy is shared between multiple CPUs, the function
> intel_cpufreq_target()  just updates IA32_PERF_CTL MSR of the cpu
> which is managing this policy, I wonder whether other cpus which are
> affected should also update their per-logical cpu's IA32_PERF_CTL MSR?

The CPUs share the policy when they share their freq/voltage rails and so
changing perf state of one CPU should result in that changing for all the CPUs
in that policy. Otherwise, they can't be considered to be part of the same
policy.

That's why this code is changing it only for policy->cpu alone.

-- 
viresh


Re: [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely

2017-05-07 Thread Wanpeng Li
Hi Rafael,
2017-03-22 7:08 GMT+08:00 Rafael J. Wysocki :
> From: Rafael J. Wysocki 
>
> The way the schedutil governor uses the PELT metric causes it to
> underestimate the CPU utilization in some cases.
>
> That can be easily demonstrated by running kernel compilation on
> a Sandy Bridge Intel processor, running turbostat in parallel with
> it and looking at the values written to the MSR_IA32_PERF_CTL
> register.  Namely, the expected result would be that when all CPUs
> were 100% busy, all of them would be requested to run in the maximum
> P-state, but observation shows that this clearly isn't the case.
> The CPUs run in the maximum P-state for a while and then are
> requested to run slower and go back to the maximum P-state after
> a while again.  That causes the actual frequency of the processor to
> visibly oscillate below the sustainable maximum in a jittery fashion
> which clearly is not desirable.
>
> That has been attributed to CPU utilization metric updates on task
> migration that cause the total utilization value for the CPU to be
> reduced by the utilization of the migrated task.  If that happens,
> the schedutil governor may see a CPU utilization reduction and will
> attempt to reduce the CPU frequency accordingly right away.  That
> may be premature, though, for example if the system is generally
> busy and there are other runnable tasks waiting to be run on that
> CPU already.
>
> This is unlikely to be an issue on systems where cpufreq policies are
> shared between multiple CPUs, because in those cases the policy
> utilization is computed as the maximum of the CPU utilization values

Sorry for one question maybe not associated with this patch. If the
cpufreq policy is shared between multiple CPUs, the function
intel_cpufreq_target()  just updates IA32_PERF_CTL MSR of the cpu
which is managing this policy, I wonder whether other cpus which are
affected should also update their per-logical cpu's IA32_PERF_CTL MSR?

Regards,
Wanpeng Li

> over the whole policy and if that turns out to be low, reducing the
> frequency for the policy most likely is a good idea anyway.  On
> systems with one CPU per policy, however, it may affect performance
> adversely and even lead to increased energy consumption in some cases.
>
> On those systems it may be addressed by taking another utilization
> metric into consideration, like whether or not the CPU whose
> frequency is about to be reduced has been idle recently, because if
> that's not the case, the CPU is likely to be busy in the near future
> and its frequency should not be reduced.
>
> To that end, use the counter of idle calls in the timekeeping code.
> Namely, make the schedutil governor look at that counter for the
> current CPU every time before its frequency is about to be reduced.
> If the counter has not changed since the previous iteration of the
> governor computations for that CPU, the CPU has been busy for all
> that time and its frequency should not be decreased, so if the new
> frequency would be lower than the one set previously, the governor
> will skip the frequency update.
>
> Signed-off-by: Rafael J. Wysocki 
> ---
>  include/linux/tick.h |1 +
>  kernel/sched/cpufreq_schedutil.c |   27 +++
>  kernel/time/tick-sched.c |   12 
>  3 files changed, 40 insertions(+)
>
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -61,6 +61,11 @@ struct sugov_cpu {
> unsigned long util;
> unsigned long max;
> unsigned int flags;
> +
> +   /* The field below is for single-CPU policies only. */
> +#ifdef CONFIG_NO_HZ_COMMON
> +   unsigned long saved_idle_calls;
> +#endif
>  };
>
>  static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
> @@ -192,6 +197,19 @@ static void sugov_iowait_boost(struct su
> sg_cpu->iowait_boost >>= 1;
>  }
>
> +#ifdef CONFIG_NO_HZ_COMMON
> +static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
> +{
> +   unsigned long idle_calls = tick_nohz_get_idle_calls();
> +   bool ret = idle_calls == sg_cpu->saved_idle_calls;
> +
> +   sg_cpu->saved_idle_calls = idle_calls;
> +   return ret;
> +}
> +#else
> +static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return 
> false; }
> +#endif /* CONFIG_NO_HZ_COMMON */
> +
>  static void sugov_update_single(struct update_util_data *hook, u64 time,
> unsigned int flags)
>  {
> @@ -200,6 +218,7 @@ static void sugov_update_single(struct u
> struct cpufreq_policy *policy = sg_policy->policy;
> unsigned long util, max;
> unsigned int next_f;
> +   bool busy;
>
> sugov_set_iowait_boost(sg_cpu, time, flags);
> sg_cpu->last_update = time;
> @@ -207,12 

Re: [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely

2017-05-07 Thread Wanpeng Li
Hi Rafael,
2017-03-22 7:08 GMT+08:00 Rafael J. Wysocki :
> From: Rafael J. Wysocki 
>
> The way the schedutil governor uses the PELT metric causes it to
> underestimate the CPU utilization in some cases.
>
> That can be easily demonstrated by running kernel compilation on
> a Sandy Bridge Intel processor, running turbostat in parallel with
> it and looking at the values written to the MSR_IA32_PERF_CTL
> register.  Namely, the expected result would be that when all CPUs
> were 100% busy, all of them would be requested to run in the maximum
> P-state, but observation shows that this clearly isn't the case.
> The CPUs run in the maximum P-state for a while and then are
> requested to run slower and go back to the maximum P-state after
> a while again.  That causes the actual frequency of the processor to
> visibly oscillate below the sustainable maximum in a jittery fashion
> which clearly is not desirable.
>
> That has been attributed to CPU utilization metric updates on task
> migration that cause the total utilization value for the CPU to be
> reduced by the utilization of the migrated task.  If that happens,
> the schedutil governor may see a CPU utilization reduction and will
> attempt to reduce the CPU frequency accordingly right away.  That
> may be premature, though, for example if the system is generally
> busy and there are other runnable tasks waiting to be run on that
> CPU already.
>
> This is unlikely to be an issue on systems where cpufreq policies are
> shared between multiple CPUs, because in those cases the policy
> utilization is computed as the maximum of the CPU utilization values

Sorry for one question maybe not associated with this patch. If the
cpufreq policy is shared between multiple CPUs, the function
intel_cpufreq_target()  just updates IA32_PERF_CTL MSR of the cpu
which is managing this policy, I wonder whether other cpus which are
affected should also update their per-logical cpu's IA32_PERF_CTL MSR?

Regards,
Wanpeng Li

> over the whole policy and if that turns out to be low, reducing the
> frequency for the policy most likely is a good idea anyway.  On
> systems with one CPU per policy, however, it may affect performance
> adversely and even lead to increased energy consumption in some cases.
>
> On those systems it may be addressed by taking another utilization
> metric into consideration, like whether or not the CPU whose
> frequency is about to be reduced has been idle recently, because if
> that's not the case, the CPU is likely to be busy in the near future
> and its frequency should not be reduced.
>
> To that end, use the counter of idle calls in the timekeeping code.
> Namely, make the schedutil governor look at that counter for the
> current CPU every time before its frequency is about to be reduced.
> If the counter has not changed since the previous iteration of the
> governor computations for that CPU, the CPU has been busy for all
> that time and its frequency should not be decreased, so if the new
> frequency would be lower than the one set previously, the governor
> will skip the frequency update.
>
> Signed-off-by: Rafael J. Wysocki 
> ---
>  include/linux/tick.h |1 +
>  kernel/sched/cpufreq_schedutil.c |   27 +++
>  kernel/time/tick-sched.c |   12 
>  3 files changed, 40 insertions(+)
>
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -61,6 +61,11 @@ struct sugov_cpu {
> unsigned long util;
> unsigned long max;
> unsigned int flags;
> +
> +   /* The field below is for single-CPU policies only. */
> +#ifdef CONFIG_NO_HZ_COMMON
> +   unsigned long saved_idle_calls;
> +#endif
>  };
>
>  static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
> @@ -192,6 +197,19 @@ static void sugov_iowait_boost(struct su
> sg_cpu->iowait_boost >>= 1;
>  }
>
> +#ifdef CONFIG_NO_HZ_COMMON
> +static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
> +{
> +   unsigned long idle_calls = tick_nohz_get_idle_calls();
> +   bool ret = idle_calls == sg_cpu->saved_idle_calls;
> +
> +   sg_cpu->saved_idle_calls = idle_calls;
> +   return ret;
> +}
> +#else
> +static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return 
> false; }
> +#endif /* CONFIG_NO_HZ_COMMON */
> +
>  static void sugov_update_single(struct update_util_data *hook, u64 time,
> unsigned int flags)
>  {
> @@ -200,6 +218,7 @@ static void sugov_update_single(struct u
> struct cpufreq_policy *policy = sg_policy->policy;
> unsigned long util, max;
> unsigned int next_f;
> +   bool busy;
>
> sugov_set_iowait_boost(sg_cpu, time, flags);
> sg_cpu->last_update = time;
> @@ -207,12 +226,20 @@ static void sugov_update_single(struct u
> if 

linux-next: manual merge of the akpm-current tree with Linus' tree

2017-05-07 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm-current tree got a conflict in:

  init/initramfs.c

between commit:

  17a9be317475 ("initramfs: Always do fput() and load modules after rootfs 
populate")

from Linus' tree and commit:

  c25cfb52513b ("initramfs: provide a way to ignore image provided by 
bootloader")

from the akpm-current tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc init/initramfs.c
index 8daf7ac6c7e2,a5b686696535..
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@@ -612,8 -611,7 +612,8 @@@ static int __init populate_rootfs(void
char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size);
if (err)
panic("%s", err); /* Failed to decompress INTERNAL initramfs */
 +  /* If available load the bootloader supplied initrd */
-   if (initrd_start) {
+   if (initrd_start && !IS_ENABLED(CONFIG_INITRAMFS_FORCE)) {
  #ifdef CONFIG_BLK_DEV_RAM
int fd;
printk(KERN_INFO "Trying to unpack rootfs image as 
initramfs...\n");


linux-next: manual merge of the akpm-current tree with Linus' tree

2017-05-07 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm-current tree got a conflict in:

  init/initramfs.c

between commit:

  17a9be317475 ("initramfs: Always do fput() and load modules after rootfs 
populate")

from Linus' tree and commit:

  c25cfb52513b ("initramfs: provide a way to ignore image provided by 
bootloader")

from the akpm-current tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc init/initramfs.c
index 8daf7ac6c7e2,a5b686696535..
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@@ -612,8 -611,7 +612,8 @@@ static int __init populate_rootfs(void
char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size);
if (err)
panic("%s", err); /* Failed to decompress INTERNAL initramfs */
 +  /* If available load the bootloader supplied initrd */
-   if (initrd_start) {
+   if (initrd_start && !IS_ENABLED(CONFIG_INITRAMFS_FORCE)) {
  #ifdef CONFIG_BLK_DEV_RAM
int fd;
printk(KERN_INFO "Trying to unpack rootfs image as 
initramfs...\n");


Re: [PATCH 1/2] mtd: nand: add generic helpers to check, match, maximize ECC settings

2017-05-07 Thread Masahiro Yamada
Hi Boris,


2017-04-29 1:32 GMT+09:00 Boris Brezillon :

>> + for (setting = caps->ecc_settings; setting->step; setting++) {
>> + /* If chip->ecc.size is already set, respect it. */
>> + if (chip->ecc.size && setting->step != chip->ecc.size)
>> + continue;
>> +
>> + /* If chip->ecc.strength is already set, respect it. */
>> + if (chip->ecc.strength &&
>> + setting->strength != chip->ecc.strength)
>> + continue;
>
> Hm, I don't get it. If chip->ecc.strength and chip->ecc.size are
> explicitly set, you should just call nand_check_ecc_caps() and skip
> nand_try_to_match_ecc_req(). Why would you call
> nand_try_to_match_ecc_req() in this case?


I want to call this function if
ecc.size is specified but ecc.strength is not
(or vice versa).


If both ecc.size and ecc.strength are already specified,
you are right, no need to call this function.
This function can check the sanity of the specified
combination of (step, strength), but this is the same
as what nand_check_ecc_caps() does.




>> +
>> + /*
>> +  * If the controller's step size is smaller than the chip's
>> +  * requirement, comparison of the strength is not simple.
>> +  */
>
> There's one thing we can easily do in this case: try to apply the
> same strength but on the smaller step size. If it fits the OOB area, we
> have a valid match, if it doesn't, then we can fallback to ECC
> maximization in case no valid settings were found after iterating over
> ECC settings.
>
> How about:
>
> if (setting->step < req_step &&
> setting->strength < req_strength)
> continue;

Make sense.  I will do this.



>> + if (setting->step < req_step)
>> + continue;
>
> You should probably check that setting->step < mtd->writesize.
>
>> +
>> + steps = mtd->writesize / setting->step;
>
> Not sure it will ever happen to have a step which is not a multiple of
> ->writesize, but it's probably safer to do DIV_ROUND_UP(), or simply
> skip the ECC setting entry if mtd->writesize % setting->step != 0.

OK.

mtd->writesize % setting->step != 0
will guarantee
setting->step <= mtd->writesize





>> +
>> + ecc_bytes = caps->calc_ecc_bytes(setting);
>> + if (WARN_ON_ONCE(ecc_bytes < 0))
>> + continue;
>> + ecc_bytes_total = ecc_bytes * steps;
>> +
>> + if (ecc_bytes_total > caps->avail_oobsize ||
>> + setting->strength * steps < req_corr)
>> + continue;
>> +
>> + /*
>> +  * We assume the best is to meet the chip's requrement
>> +  * with the least number of ECC bytes.
>> +  */
>
> If ecc_settings entries were in ascending order (lowest step-size and
> strength first), you could bail out as soon as you find a suitable
> config, because following settings would necessarily take more bits.

If we want to achieve this,
step-size must be descending order,
while strength must be ascending order.

This is a bit confusing (and I have no idea
how to statically check if every driver follows this order).



>> + /*
>> +  * If the number of correctable bits is the same,
>> +  * bigger ecc_step has more reliability.
>> +  */
>> + if (corr > best_corr ||
>> + (corr == best_corr && setting->step > best_setting->step)) 
>> {
>> + best_corr = corr;
>> + best_setting = setting;
>> + best_ecc_bytes = ecc_bytes;
>> + }
>
> Same comment as earlier: you could probably skip a few entries by
> enforcing ordering in the ecc_settings array.


Assuming that step-size is descending order and strength is ascending order,
it may be possible by iterating backward.

But, I think the array should be terminated by zero or NULL.
How to find the last entry we want to start iterating from?

Assuming the array size is small, is this worthwhile?



>> + }
>> +
>> + if (!best_setting)
>> + return -ENOTSUPP;
>> +
>> + chip->ecc.size = best_setting->step;
>> + chip->ecc.strength = best_setting->strength;
>> + chip->ecc.bytes = best_ecc_bytes;
>> +
>> + return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(nand_try_to_maximize_ecc);
>> +
>>  /*
>>   * Check if the chip configuration meet the datasheet requirements.
>>
>> diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
>> index 2ae781e..394128f 100644
>> --- a/include/linux/mtd/nand.h
>> +++ b/include/linux/mtd/nand.h
>> @@ -486,6 +486,28 @@ static inline void nand_hw_control_init(struct 
>> nand_hw_control *nfc)
>>  }
>>
>>  /**
>> + * struct nand_ecc_setting - information of ECC step supported by ECC engine
>> + * @step: data bytes per ECC step
>> + * 

Re: [PATCH 1/2] mtd: nand: add generic helpers to check, match, maximize ECC settings

2017-05-07 Thread Masahiro Yamada
Hi Boris,


2017-04-29 1:32 GMT+09:00 Boris Brezillon :

>> + for (setting = caps->ecc_settings; setting->step; setting++) {
>> + /* If chip->ecc.size is already set, respect it. */
>> + if (chip->ecc.size && setting->step != chip->ecc.size)
>> + continue;
>> +
>> + /* If chip->ecc.strength is already set, respect it. */
>> + if (chip->ecc.strength &&
>> + setting->strength != chip->ecc.strength)
>> + continue;
>
> Hm, I don't get it. If chip->ecc.strength and chip->ecc.size are
> explicitly set, you should just call nand_check_ecc_caps() and skip
> nand_try_to_match_ecc_req(). Why would you call
> nand_try_to_match_ecc_req() in this case?


I want to call this function if
ecc.size is specified but ecc.strength is not
(or vice versa).


If both ecc.size and ecc.strength are already specified,
you are right, no need to call this function.
This function can check the sanity of the specified
combination of (step, strength), but this is the same
as what nand_check_ecc_caps() does.




>> +
>> + /*
>> +  * If the controller's step size is smaller than the chip's
>> +  * requirement, comparison of the strength is not simple.
>> +  */
>
> There's one thing we can easily do in this case: try to apply the
> same strength but on the smaller step size. If it fits the OOB area, we
> have a valid match, if it doesn't, then we can fallback to ECC
> maximization in case no valid settings were found after iterating over
> ECC settings.
>
> How about:
>
> if (setting->step < req_step &&
> setting->strength < req_strength)
> continue;

Make sense.  I will do this.



>> + if (setting->step < req_step)
>> + continue;
>
> You should probably check that setting->step < mtd->writesize.
>
>> +
>> + steps = mtd->writesize / setting->step;
>
> Not sure it will ever happen to have a step which is not a multiple of
> ->writesize, but it's probably safer to do DIV_ROUND_UP(), or simply
> skip the ECC setting entry if mtd->writesize % setting->step != 0.

OK.

mtd->writesize % setting->step != 0
will guarantee
setting->step <= mtd->writesize





>> +
>> + ecc_bytes = caps->calc_ecc_bytes(setting);
>> + if (WARN_ON_ONCE(ecc_bytes < 0))
>> + continue;
>> + ecc_bytes_total = ecc_bytes * steps;
>> +
>> + if (ecc_bytes_total > caps->avail_oobsize ||
>> + setting->strength * steps < req_corr)
>> + continue;
>> +
>> + /*
>> +  * We assume the best is to meet the chip's requrement
>> +  * with the least number of ECC bytes.
>> +  */
>
> If ecc_settings entries were in ascending order (lowest step-size and
> strength first), you could bail out as soon as you find a suitable
> config, because following settings would necessarily take more bits.

If we want to achieve this,
step-size must be descending order,
while strength must be ascending order.

This is a bit confusing (and I have no idea
how to statically check if every driver follows this order).



>> + /*
>> +  * If the number of correctable bits is the same,
>> +  * bigger ecc_step has more reliability.
>> +  */
>> + if (corr > best_corr ||
>> + (corr == best_corr && setting->step > best_setting->step)) 
>> {
>> + best_corr = corr;
>> + best_setting = setting;
>> + best_ecc_bytes = ecc_bytes;
>> + }
>
> Same comment as earlier: you could probably skip a few entries by
> enforcing ordering in the ecc_settings array.


Assuming that step-size is descending order and strength is ascending order,
it may be possible by iterating backward.

But, I think the array should be terminated by zero or NULL.
How to find the last entry we want to start iterating from?

Assuming the array size is small, is this worthwhile?



>> + }
>> +
>> + if (!best_setting)
>> + return -ENOTSUPP;
>> +
>> + chip->ecc.size = best_setting->step;
>> + chip->ecc.strength = best_setting->strength;
>> + chip->ecc.bytes = best_ecc_bytes;
>> +
>> + return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(nand_try_to_maximize_ecc);
>> +
>>  /*
>>   * Check if the chip configuration meet the datasheet requirements.
>>
>> diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
>> index 2ae781e..394128f 100644
>> --- a/include/linux/mtd/nand.h
>> +++ b/include/linux/mtd/nand.h
>> @@ -486,6 +486,28 @@ static inline void nand_hw_control_init(struct 
>> nand_hw_control *nfc)
>>  }
>>
>>  /**
>> + * struct nand_ecc_setting - information of ECC step supported by ECC engine
>> + * @step: data bytes per ECC step
>> + * @bytes: ECC bytes per step
>> + */
>> 

Re: [PATCH v2] x86/efi: Correct ident mapping of efi old_map when kalsr enabled

2017-05-07 Thread Russ Anderson
On Sat, May 06, 2017 at 01:36:20AM +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 09:42:14PM +0100, Matt Fleming wrote:
> > (Including the folks from SGI since this was hit on a UV system)
> 
> Wasn't there a BIOS fix supplied at some point which obviated the need
> to boot with efi=old_map on SGI boxes?

Yes, and other fixes to get new and old mapping working (except
for UV1 hardware).  The kaslr patchset broke booting with old
mapping.  That is the issue Baoquan, Bhupesh, and legacy SGI
engineers are trying to fix.


For those that want a more detailed summary:

In early 2014 upstream EFI changed the mapping, which lead to setting 
EFI_OLD_MEMMAP on all UV systems.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a5d90c923bcfb9632d998ed06e9569216ad695f3

Later upstream fixes, plus a bios fix, got new mapping working.
Here are a couple of the fixes.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08914f436bdd2ed60923f49cbc402307aba20fe4
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/platform/uv/bios_uv.c?id=f72075c9eda8a43aeea2f9dbb8d187afd4a76f0b

This patch enabled new EFI mapping on UV2+ platforms (all but UV1).
Note this is not bios version checking, it is hardware platform checking.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/platform/efi/quirks.c?id=d394f2d9d8e1e7b4959819344baf67b5995da9b0

One of the fixes to get new map to work broke old map.  This patch fixed it.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/platform/uv/bios_uv.c?id=caef78b6cdeddf4ad364f95910bba6b43b8eb9bf

So upstream with recent bios works on UV2, UV3, and UV4 hardware platforms,
both old and new mapping, with new mapping being the default.

Thanks.
-- 
Russ Anderson,  Hawks 2 Linux Kernel Group Manager
HPE - Hewlett Packard Enterprise (formerly SGI)  r...@hpe.com  (r...@sgi.com)


Re: [Patch v3] x86/build: don't add -maccumulate-outgoing-args w/o compiler support

2017-05-07 Thread Josh Poimboeuf
On Fri, May 05, 2017 at 08:05:09PM -0700, Nick Desaulniers wrote:
> Clang does not support this machine dependent option.
> 
> Older versions of GCC (pre 3.0) may not support this option, added in
> 2000, but it's unlikely they can still compile the kernel.
> 
> Signed-off-by: Nick Desaulniers 
> ---
>  arch/x86/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 4430dd489620..5a0ac8411792 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -179,7 +179,7 @@ ifdef CONFIG_JUMP_LABEL
>  endif
>  
>  ifeq ($(ACCUMULATE_OUTGOING_ARGS), 1)
> - KBUILD_CFLAGS += -maccumulate-outgoing-args
> + KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args,)
>  endif
>  
>  # Stackpointer is addressed different for 32 bit and 64 bit x86

It would be useful to add a comment in the Makefile above the
KBUILD_CFLAGS line stating that clang doesn't support this flag.

-- 
Josh


Re: [PATCH v2] x86/efi: Correct ident mapping of efi old_map when kalsr enabled

2017-05-07 Thread Russ Anderson
On Sat, May 06, 2017 at 01:36:20AM +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 09:42:14PM +0100, Matt Fleming wrote:
> > (Including the folks from SGI since this was hit on a UV system)
> 
> Wasn't there a BIOS fix supplied at some point which obviated the need
> to boot with efi=old_map on SGI boxes?

Yes, and other fixes to get new and old mapping working (except
for UV1 hardware).  The kaslr patchset broke booting with old
mapping.  That is the issue Baoquan, Bhupesh, and legacy SGI
engineers are trying to fix.


For those that want a more detailed summary:

In early 2014 upstream EFI changed the mapping, which lead to setting 
EFI_OLD_MEMMAP on all UV systems.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a5d90c923bcfb9632d998ed06e9569216ad695f3

Later upstream fixes, plus a bios fix, got new mapping working.
Here are a couple of the fixes.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08914f436bdd2ed60923f49cbc402307aba20fe4
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/platform/uv/bios_uv.c?id=f72075c9eda8a43aeea2f9dbb8d187afd4a76f0b

This patch enabled new EFI mapping on UV2+ platforms (all but UV1).
Note this is not bios version checking, it is hardware platform checking.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/platform/efi/quirks.c?id=d394f2d9d8e1e7b4959819344baf67b5995da9b0

One of the fixes to get new map to work broke old map.  This patch fixed it.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/platform/uv/bios_uv.c?id=caef78b6cdeddf4ad364f95910bba6b43b8eb9bf

So upstream with recent bios works on UV2, UV3, and UV4 hardware platforms,
both old and new mapping, with new mapping being the default.

Thanks.
-- 
Russ Anderson,  Hawks 2 Linux Kernel Group Manager
HPE - Hewlett Packard Enterprise (formerly SGI)  r...@hpe.com  (r...@sgi.com)


Re: [Patch v3] x86/build: don't add -maccumulate-outgoing-args w/o compiler support

2017-05-07 Thread Josh Poimboeuf
On Fri, May 05, 2017 at 08:05:09PM -0700, Nick Desaulniers wrote:
> Clang does not support this machine dependent option.
> 
> Older versions of GCC (pre 3.0) may not support this option, added in
> 2000, but it's unlikely they can still compile the kernel.
> 
> Signed-off-by: Nick Desaulniers 
> ---
>  arch/x86/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 4430dd489620..5a0ac8411792 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -179,7 +179,7 @@ ifdef CONFIG_JUMP_LABEL
>  endif
>  
>  ifeq ($(ACCUMULATE_OUTGOING_ARGS), 1)
> - KBUILD_CFLAGS += -maccumulate-outgoing-args
> + KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args,)
>  endif
>  
>  # Stackpointer is addressed different for 32 bit and 64 bit x86

It would be useful to add a comment in the Makefile above the
KBUILD_CFLAGS line stating that clang doesn't support this flag.

-- 
Josh


RE: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-07 Thread Kani, Toshimitsu
> * Dan Williams  wrote:
> 
> > On Sat, May 6, 2017 at 2:46 AM, Ingo Molnar  wrote:
> > >
> > > * Dan Williams  wrote:
> > >
> > >> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu 
> wrote:
> > >> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> > >> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu
> 
> > >> >> wrote:
> > >> >  :
> > >> >> > > ---
> > >> >> > > Changes since the initial RFC:
> > >> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> > >> >> > > set_memory_wt(), etc. (Ingo)
> > >> >> >
> > >> >> > Sorry I should have said earlier, but I think the term "wt" is
> > >> >> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
> > >> >> > semantics, not WT semantics.
> > >> >>
> > >> >> The non-temporal stores do, but memcpy_wt() is using a combination
> of
> > >> >> non-temporal stores and explicit cache flushing.
> > >> >>
> > >> >> > How about using "nocache" as it's been
> > >> >> > used in __copy_user_nocache()?
> > >> >>
> > >> >> The difference in my mind is that the "_nocache" suffix indicates
> > >> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> > >> >> strictly arranges for caches not to contain dirty data upon
> > >> >> completion of the routine. For example, non-temporal stores on older
> > >> >> x86 cpus could potentially leave dirty data in the cache, so
> > >> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> > >> >
> > >> > I see.  I agree that its behavior is different from the existing one
> > >> > with "_nocache".   That said, I think "wt" or "write-through" generally
> > >> > means that writes allocate cachelines and keep them clean by writing
> to
> > >> > memory.  So, subsequent reads to the destination will hit the
> > >> > cachelines.  This is not the case with this interface.
> > >>
> > >> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> > >> comes along and is surprised that the cache is not warm for reads
> > >> after memcpy_wt(), at which point we can ask "why not just use plain
> > >> memcpy then?", or set the page-attributes to WT.
> > >
> > > Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal
> and that
> > > no cache line is left around afterwards (dirty or clean)?
> >
> > Yes, I think "flush" belongs in the name, and to make it easily
> > grep-able separate from _nocache we can call it _flushcache? An
> > efficient implementation will use _nocache / non-temporal stores
> > internally, but external consumers just care about the state of the
> > cache after the call.
> 
> _flushcache() works for me too.
> 

Works for me too.
Thanks,
-Toshi



RE: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-07 Thread Kani, Toshimitsu
> * Dan Williams  wrote:
> 
> > On Sat, May 6, 2017 at 2:46 AM, Ingo Molnar  wrote:
> > >
> > > * Dan Williams  wrote:
> > >
> > >> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu 
> wrote:
> > >> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> > >> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu
> 
> > >> >> wrote:
> > >> >  :
> > >> >> > > ---
> > >> >> > > Changes since the initial RFC:
> > >> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> > >> >> > > set_memory_wt(), etc. (Ingo)
> > >> >> >
> > >> >> > Sorry I should have said earlier, but I think the term "wt" is
> > >> >> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
> > >> >> > semantics, not WT semantics.
> > >> >>
> > >> >> The non-temporal stores do, but memcpy_wt() is using a combination
> of
> > >> >> non-temporal stores and explicit cache flushing.
> > >> >>
> > >> >> > How about using "nocache" as it's been
> > >> >> > used in __copy_user_nocache()?
> > >> >>
> > >> >> The difference in my mind is that the "_nocache" suffix indicates
> > >> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> > >> >> strictly arranges for caches not to contain dirty data upon
> > >> >> completion of the routine. For example, non-temporal stores on older
> > >> >> x86 cpus could potentially leave dirty data in the cache, so
> > >> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> > >> >
> > >> > I see.  I agree that its behavior is different from the existing one
> > >> > with "_nocache".   That said, I think "wt" or "write-through" generally
> > >> > means that writes allocate cachelines and keep them clean by writing
> to
> > >> > memory.  So, subsequent reads to the destination will hit the
> > >> > cachelines.  This is not the case with this interface.
> > >>
> > >> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> > >> comes along and is surprised that the cache is not warm for reads
> > >> after memcpy_wt(), at which point we can ask "why not just use plain
> > >> memcpy then?", or set the page-attributes to WT.
> > >
> > > Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal
> and that
> > > no cache line is left around afterwards (dirty or clean)?
> >
> > Yes, I think "flush" belongs in the name, and to make it easily
> > grep-able separate from _nocache we can call it _flushcache? An
> > efficient implementation will use _nocache / non-temporal stores
> > internally, but external consumers just care about the state of the
> > cache after the call.
> 
> _flushcache() works for me too.
> 

Works for me too.
Thanks,
-Toshi



Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

2017-05-07 Thread Naoya Horiguchi
On Tue, May 02, 2017 at 08:55:07PM +0200, Michal Hocko wrote:
> On Tue 02-05-17 16:59:30, Laurent Dufour wrote:
> > On 28/04/2017 15:48, Michal Hocko wrote:
> [...]
> > > This is getting quite hairy. What is the expected page count of the
> > > hwpoison page?
> 
> OK, so from the quick check of the hwpoison code it seems that the ref
> count will be > 1 (from get_hwpoison_page).
> 
> > > I guess we would need to update the VM_BUG_ON in the
> > > memcg uncharge code to ignore the page count of hwpoison pages if it can
> > > be arbitrary.
> > 
> > Based on the experiment I did, page count == 2 when isolate_lru_page()
> > succeeds, even in the case of a poisoned page.
> 
> that would make some sense to me. The page should have been already
> unmapped therefore but memory_failure increases the ref count and 1 is
> for isolate_lru_page().

# sorry for late reply, I was on holidays last week...

Right, and the refcount taken for memory_failure is not freed after
memory_failure() returns. unpoison_memory() does free the refcount.

> 
> > In my case I think this
> > is because the page is still used by the process which is calling madvise().
> > 
> > I'm wondering if I'm looking at the right place. May be the poisoned
> > page should remain attach to the memory_cgroup until no one is using it.
> > In that case this means that something should be done when the page is
> > off-lined... I've to dig further here.
> 
> No, AFAIU the page will not drop the reference count down to 0 in most
> cases. Maybe there are some scenarios where this can happen but I would
> expect that the poisoned page will be mapped and in use most of the time
> and won't drop down 0. And then we should really uncharge it because it
> will pin the memcg and make it unfreeable which doesn't seem to be what
> we want.  So does the following work reasonable? Andi, Johannes, what do
> you think? I cannot say I would be really comfortable touching hwpoison
> code as I really do not understand the workflow. Maybe we want to move
> this uncharge down to memory_failure() right before we report success?

memory_failure() can be called for any types of page (including slab or
any kernel/driver pages), and the reported problem seems happen only on
in-use user pages, so uncharging in delete_from_lru_cache() as done below
looks better to me.

> ---
> From 8bf0791bcf35996a859b6d33fb5494e5b53de49d Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Tue, 2 May 2017 20:32:24 +0200
> Subject: [PATCH] hwpoison, memcg: forcibly uncharge LRU pages
> 
> Laurent Dufour has noticed that hwpoinsoned pages are kept charged. In
> his particular case he has hit a bad_page("page still charged to cgroup")
> when onlining a hwpoison page.

> While this looks like something that shouldn't
> happen in the first place because onlining hwpages and returning them to
> the page allocator makes only little sense it shows a real problem.
> 
> hwpoison pages do not get freed usually so we do not uncharge them (at
> least not since 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")).
> Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol: take a css
> reference for each charged page")) as well and so the mem_cgroup and the
> associated state will never go away. Fix this leak by forcibly
> uncharging a LRU hwpoisoned page in delete_from_lru_cache(). We also
> have to tweak uncharge_list because it cannot rely on zero ref count
> for these pages.
> 
> Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")
> Reported-by: Laurent Dufour 
> Signed-off-by: Michal Hocko 

Reviewed-by: Naoya Horiguchi 

> ---
>  mm/memcontrol.c | 2 +-
>  mm/memory-failure.c | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 16c556ac103d..4cf26059adb1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5527,7 +5527,7 @@ static void uncharge_list(struct list_head *page_list)
>   next = page->lru.next;
>  
>   VM_BUG_ON_PAGE(PageLRU(page), page);
> - VM_BUG_ON_PAGE(page_count(page), page);
> + VM_BUG_ON_PAGE(!PageHWPoison(page) && page_count(page), page);
>  
>   if (!page->mem_cgroup)
>   continue;
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 8a6bd3a9eb1e..4497d9619bb4 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -541,6 +541,13 @@ static int delete_from_lru_cache(struct page *p)
>*/
>   ClearPageActive(p);
>   ClearPageUnevictable(p);
> +
> + /*
> +  * Poisoned page might never drop its ref count to 0 so we have 
> to
> +  * uncharge it manually from its memcg.
> +  */
> + mem_cgroup_uncharge(p);
> +
>   /*
>* drop the page count elevated by isolate_lru_page()
>  

Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

2017-05-07 Thread Naoya Horiguchi
On Tue, May 02, 2017 at 08:55:07PM +0200, Michal Hocko wrote:
> On Tue 02-05-17 16:59:30, Laurent Dufour wrote:
> > On 28/04/2017 15:48, Michal Hocko wrote:
> [...]
> > > This is getting quite hairy. What is the expected page count of the
> > > hwpoison page?
> 
> OK, so from the quick check of the hwpoison code it seems that the ref
> count will be > 1 (from get_hwpoison_page).
> 
> > > I guess we would need to update the VM_BUG_ON in the
> > > memcg uncharge code to ignore the page count of hwpoison pages if it can
> > > be arbitrary.
> > 
> > Based on the experiment I did, page count == 2 when isolate_lru_page()
> > succeeds, even in the case of a poisoned page.
> 
> that would make some sense to me. The page should have been already
> unmapped therefore but memory_failure increases the ref count and 1 is
> for isolate_lru_page().

# sorry for late reply, I was on holidays last week...

Right, and the refcount taken for memory_failure is not freed after
memory_failure() returns. unpoison_memory() does free the refcount.

> 
> > In my case I think this
> > is because the page is still used by the process which is calling madvise().
> > 
> > I'm wondering if I'm looking at the right place. May be the poisoned
> > page should remain attach to the memory_cgroup until no one is using it.
> > In that case this means that something should be done when the page is
> > off-lined... I've to dig further here.
> 
> No, AFAIU the page will not drop the reference count down to 0 in most
> cases. Maybe there are some scenarios where this can happen but I would
> expect that the poisoned page will be mapped and in use most of the time
> and won't drop down 0. And then we should really uncharge it because it
> will pin the memcg and make it unfreeable which doesn't seem to be what
> we want.  So does the following work reasonable? Andi, Johannes, what do
> you think? I cannot say I would be really comfortable touching hwpoison
> code as I really do not understand the workflow. Maybe we want to move
> this uncharge down to memory_failure() right before we report success?

memory_failure() can be called for any types of page (including slab or
any kernel/driver pages), and the reported problem seems happen only on
in-use user pages, so uncharging in delete_from_lru_cache() as done below
looks better to me.

> ---
> From 8bf0791bcf35996a859b6d33fb5494e5b53de49d Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Tue, 2 May 2017 20:32:24 +0200
> Subject: [PATCH] hwpoison, memcg: forcibly uncharge LRU pages
> 
> Laurent Dufour has noticed that hwpoinsoned pages are kept charged. In
> his particular case he has hit a bad_page("page still charged to cgroup")
> when onlining a hwpoison page.

> While this looks like something that shouldn't
> happen in the first place because onlining hwpages and returning them to
> the page allocator makes only little sense it shows a real problem.
> 
> hwpoison pages do not get freed usually so we do not uncharge them (at
> least not since 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")).
> Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol: take a css
> reference for each charged page")) as well and so the mem_cgroup and the
> associated state will never go away. Fix this leak by forcibly
> uncharging a LRU hwpoisoned page in delete_from_lru_cache(). We also
> have to tweak uncharge_list because it cannot rely on zero ref count
> for these pages.
> 
> Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")
> Reported-by: Laurent Dufour 
> Signed-off-by: Michal Hocko 

Reviewed-by: Naoya Horiguchi 

> ---
>  mm/memcontrol.c | 2 +-
>  mm/memory-failure.c | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 16c556ac103d..4cf26059adb1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5527,7 +5527,7 @@ static void uncharge_list(struct list_head *page_list)
>   next = page->lru.next;
>  
>   VM_BUG_ON_PAGE(PageLRU(page), page);
> - VM_BUG_ON_PAGE(page_count(page), page);
> + VM_BUG_ON_PAGE(!PageHWPoison(page) && page_count(page), page);
>  
>   if (!page->mem_cgroup)
>   continue;
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 8a6bd3a9eb1e..4497d9619bb4 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -541,6 +541,13 @@ static int delete_from_lru_cache(struct page *p)
>*/
>   ClearPageActive(p);
>   ClearPageUnevictable(p);
> +
> + /*
> +  * Poisoned page might never drop its ref count to 0 so we have 
> to
> +  * uncharge it manually from its memcg.
> +  */
> + mem_cgroup_uncharge(p);
> +
>   /*
>* drop the page count elevated by isolate_lru_page()
>*/
> -- 
> 2.11.0
> 
> -- 
> Michal Hocko
> SUSE Labs
> 


Re: Missing File REPORTING-BUGS In Linux Kernel.

2017-05-07 Thread Randy Dunlap
On 05/07/17 19:17, Anil Nair wrote:
> Hi,
> 
> Could not find the proper person hence putting it in general mailing
> list, A file named "REPORTING-BUGS" is missing in the current release
> of the linux kernel i.e. v4.11.
> 
> I found this is when using debian tools to compile a latest release
> using command,
> 
> "fakeroot make-kpkg --initrd --revision=1.0.AN kernel_image kernel_headers -j 
> 4"
> 
> The error i received was,
> 
> "install -p-o root -g root  -m  644 REPORTING-BUGS
>  
> /home/anilnair/linux-kernel/linux-stable/debian/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/usr/share/doc/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/
> install: cannot stat 'REPORTING-BUGS': No such file or directory
> debian/ruleset/targets/headers.mk:40: recipe for target
> 'debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty'
> failed
> make[1]: *** 
> [debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty]
> Error 1
> make[1]: Leaving directory '/home/anilnair/linux-kernel/linux-stable'
> debian/ruleset/local.mk:102: recipe for target 'kernel_headers' failed
> make: *** [kernel_headers] Error 2"
> 
> Though I could find the same file in linux kernel v4.8.6.
> 

Hi,

In v4.10, the file was moved (and renamed and edited) to
"Documentation/admin-guide/reporting-bugs.rst".

I suppose that some Debian install script needs to be updated.

thanks.
-- 
~Randy


Re: Missing File REPORTING-BUGS In Linux Kernel.

2017-05-07 Thread Randy Dunlap
On 05/07/17 19:17, Anil Nair wrote:
> Hi,
> 
> Could not find the proper person hence putting it in general mailing
> list, A file named "REPORTING-BUGS" is missing in the current release
> of the linux kernel i.e. v4.11.
> 
> I found this is when using debian tools to compile a latest release
> using command,
> 
> "fakeroot make-kpkg --initrd --revision=1.0.AN kernel_image kernel_headers -j 
> 4"
> 
> The error i received was,
> 
> "install -p-o root -g root  -m  644 REPORTING-BUGS
>  
> /home/anilnair/linux-kernel/linux-stable/debian/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/usr/share/doc/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/
> install: cannot stat 'REPORTING-BUGS': No such file or directory
> debian/ruleset/targets/headers.mk:40: recipe for target
> 'debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty'
> failed
> make[1]: *** 
> [debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty]
> Error 1
> make[1]: Leaving directory '/home/anilnair/linux-kernel/linux-stable'
> debian/ruleset/local.mk:102: recipe for target 'kernel_headers' failed
> make: *** [kernel_headers] Error 2"
> 
> Though I could find the same file in linux kernel v4.8.6.
> 

Hi,

In v4.10, the file was moved (and renamed and edited) to
"Documentation/admin-guide/reporting-bugs.rst".

I suppose that some Debian install script needs to be updated.

thanks.
-- 
~Randy


[PATCH v2] perf report: Make --branch-history work without callgraphs(-g) option in perf record

2017-05-07 Thread Jin Yao
perf record -b -g 
perf report --branch-history

This merges the LBRs with the callgraphs.

However it would be nice if it also works without callgraphs (-g)
set in perf record, so that only the LBRs are displayed.
But currently perf report errors in this case. For example,

perf record -b 
perf report --branch-history

Error:
Selected -g or --branch-history but no callchain data. Did
you call 'perf record' without -g?

This patch displays the LBRs only even if callgraphs(-g) is not
enabled in perf record.

Change log:
--
v2: According to Milian Wolff's comment, change the obsolete error
message. Now the error message is:

 ┌─Error:─┐
 │Selected -g or --branch-history.│
 │But no callchain or branch data.│
 │Did you call 'perf record' without -g or -b?│
 ││
 ││
 │Press any key...│
 └┘

When passing the last parameter to hists__fprintf,
changes "|" to "||".

hsts__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
  symbol_conf.use_callchain ||
  symbol_conf.show_branchflag_count);

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-report.c | 12 +++-
 tools/perf/util/callchain.c |  7 ---
 tools/perf/util/hist.c  |  2 ++
 tools/perf/util/machine.c   | 13 -
 4 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 22478ff..4ceb2d2 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -259,10 +259,11 @@ static int report__setup_sample_type(struct report *rep)
"'perf record' without -g?\n");
return -EINVAL;
}
-   if (symbol_conf.use_callchain) {
-   ui__error("Selected -g or --branch-history but no "
- "callchain data. Did\n"
- "you call 'perf record' without -g?\n");
+   if (symbol_conf.use_callchain &&
+   !symbol_conf.show_branchflag_count) {
+   ui__error("Selected -g or --branch-history.\n"
+ "But no callchain or branch data.\n"
+ "Did you call 'perf record' without -g or 
-b?\n");
return -1;
}
} else if (!callchain_param.enabled &&
@@ -397,7 +398,8 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist 
*evlist,
 
hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
-  symbol_conf.use_callchain);
+  symbol_conf.use_callchain ||
+  symbol_conf.show_branchflag_count);
fprintf(stdout, "\n\n");
}
 
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 81fc29a..08d3abf 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -993,11 +993,11 @@ int sample__resolve_callchain(struct perf_sample *sample,
  struct perf_evsel *evsel, struct addr_location 
*al,
  int max_stack)
 {
-   if (sample->callchain == NULL)
+   if (sample->callchain == NULL && !symbol_conf.show_branchflag_count)
return 0;
 
if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain ||
-   perf_hpp_list.parent) {
+   perf_hpp_list.parent || symbol_conf.show_branchflag_count) {
return thread__resolve_callchain(al->thread, cursor, evsel, 
sample,
 parent, al, max_stack);
}
@@ -1006,7 +1006,8 @@ int sample__resolve_callchain(struct perf_sample *sample,
 
 int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample 
*sample)
 {
-   if (!symbol_conf.use_callchain || sample->callchain == NULL)
+   if ((!symbol_conf.use_callchain || sample->callchain == NULL) &&
+   !symbol_conf.show_branchflag_count)
return 0;
return callchain_append(he->callchain, _cursor, 
sample->period);
 }
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index cf0186a..8b045a5 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1762,6 +1762,8 @@ void perf_evsel__output_resort(struct perf_evsel *evsel, 
struct ui_progress *pro
else
use_callchain = symbol_conf.use_callchain;
 
+   use_callchain |= symbol_conf.show_branchflag_count;
+
output_resort(evsel__hists(evsel), prog, 

[PATCH v2] perf report: Make --branch-history work without callgraphs(-g) option in perf record

2017-05-07 Thread Jin Yao
perf record -b -g 
perf report --branch-history

This merges the LBRs with the callgraphs.

However it would be nice if it also works without callgraphs (-g)
set in perf record, so that only the LBRs are displayed.
But currently perf report errors in this case. For example,

perf record -b 
perf report --branch-history

Error:
Selected -g or --branch-history but no callchain data. Did
you call 'perf record' without -g?

This patch displays the LBRs only even if callgraphs(-g) is not
enabled in perf record.

Change log:
--
v2: According to Milian Wolff's comment, change the obsolete error
message. Now the error message is:

 ┌─Error:─┐
 │Selected -g or --branch-history.│
 │But no callchain or branch data.│
 │Did you call 'perf record' without -g or -b?│
 ││
 ││
 │Press any key...│
 └┘

When passing the last parameter to hists__fprintf,
changes "|" to "||".

hsts__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
  symbol_conf.use_callchain ||
  symbol_conf.show_branchflag_count);

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-report.c | 12 +++-
 tools/perf/util/callchain.c |  7 ---
 tools/perf/util/hist.c  |  2 ++
 tools/perf/util/machine.c   | 13 -
 4 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 22478ff..4ceb2d2 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -259,10 +259,11 @@ static int report__setup_sample_type(struct report *rep)
"'perf record' without -g?\n");
return -EINVAL;
}
-   if (symbol_conf.use_callchain) {
-   ui__error("Selected -g or --branch-history but no "
- "callchain data. Did\n"
- "you call 'perf record' without -g?\n");
+   if (symbol_conf.use_callchain &&
+   !symbol_conf.show_branchflag_count) {
+   ui__error("Selected -g or --branch-history.\n"
+ "But no callchain or branch data.\n"
+ "Did you call 'perf record' without -g or 
-b?\n");
return -1;
}
} else if (!callchain_param.enabled &&
@@ -397,7 +398,8 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist 
*evlist,
 
hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
-  symbol_conf.use_callchain);
+  symbol_conf.use_callchain ||
+  symbol_conf.show_branchflag_count);
fprintf(stdout, "\n\n");
}
 
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 81fc29a..08d3abf 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -993,11 +993,11 @@ int sample__resolve_callchain(struct perf_sample *sample,
  struct perf_evsel *evsel, struct addr_location 
*al,
  int max_stack)
 {
-   if (sample->callchain == NULL)
+   if (sample->callchain == NULL && !symbol_conf.show_branchflag_count)
return 0;
 
if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain ||
-   perf_hpp_list.parent) {
+   perf_hpp_list.parent || symbol_conf.show_branchflag_count) {
return thread__resolve_callchain(al->thread, cursor, evsel, 
sample,
 parent, al, max_stack);
}
@@ -1006,7 +1006,8 @@ int sample__resolve_callchain(struct perf_sample *sample,
 
 int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample 
*sample)
 {
-   if (!symbol_conf.use_callchain || sample->callchain == NULL)
+   if ((!symbol_conf.use_callchain || sample->callchain == NULL) &&
+   !symbol_conf.show_branchflag_count)
return 0;
return callchain_append(he->callchain, _cursor, 
sample->period);
 }
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index cf0186a..8b045a5 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1762,6 +1762,8 @@ void perf_evsel__output_resort(struct perf_evsel *evsel, 
struct ui_progress *pro
else
use_callchain = symbol_conf.use_callchain;
 
+   use_callchain |= symbol_conf.show_branchflag_count;
+
output_resort(evsel__hists(evsel), prog, use_callchain, NULL);
 }
 

Re: [PATCH v3] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds

2017-05-07 Thread Baoquan He
On 05/04/17 at 09:25am, Thomas Garnier wrote:

> > I think this needs a "Fixes:" tag and Cc: .

Sorry for late response, should I resend with them?

> 
> Agreed.
> 
> >
> > Other than that:
> >
> > Reviewed-by: Dan Williams 
> 
> Thanks again!
> 
> Reviewed-by: Thomas Garnier 
> -- 
> Thomas


Re: [PATCH v3] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds

2017-05-07 Thread Baoquan He
On 05/04/17 at 09:25am, Thomas Garnier wrote:

> > I think this needs a "Fixes:" tag and Cc: .

Sorry for late response, should I resend with them?

> 
> Agreed.
> 
> >
> > Other than that:
> >
> > Reviewed-by: Dan Williams 
> 
> Thanks again!
> 
> Reviewed-by: Thomas Garnier 
> -- 
> Thomas


Missing File REPORTING-BUGS In Linux Kernel.

2017-05-07 Thread Anil Nair
Hi,

Could not find the proper person hence putting it in general mailing
list, A file named "REPORTING-BUGS" is missing in the current release
of the linux kernel i.e. v4.11.

I found this is when using debian tools to compile a latest release
using command,

"fakeroot make-kpkg --initrd --revision=1.0.AN kernel_image kernel_headers -j 4"

The error i received was,

"install -p-o root -g root  -m  644 REPORTING-BUGS
 
/home/anilnair/linux-kernel/linux-stable/debian/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/usr/share/doc/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/
install: cannot stat 'REPORTING-BUGS': No such file or directory
debian/ruleset/targets/headers.mk:40: recipe for target
'debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty'
failed
make[1]: *** 
[debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty]
Error 1
make[1]: Leaving directory '/home/anilnair/linux-kernel/linux-stable'
debian/ruleset/local.mk:102: recipe for target 'kernel_headers' failed
make: *** [kernel_headers] Error 2"

Though I could find the same file in linux kernel v4.8.6.

-- 
--
Regards,
Anil Nair


Missing File REPORTING-BUGS In Linux Kernel.

2017-05-07 Thread Anil Nair
Hi,

Could not find the proper person hence putting it in general mailing
list, A file named "REPORTING-BUGS" is missing in the current release
of the linux kernel i.e. v4.11.

I found this is when using debian tools to compile a latest release
using command,

"fakeroot make-kpkg --initrd --revision=1.0.AN kernel_image kernel_headers -j 4"

The error i received was,

"install -p-o root -g root  -m  644 REPORTING-BUGS
 
/home/anilnair/linux-kernel/linux-stable/debian/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/usr/share/doc/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty/
install: cannot stat 'REPORTING-BUGS': No such file or directory
debian/ruleset/targets/headers.mk:40: recipe for target
'debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty'
failed
make[1]: *** 
[debian/stamp/install/linux-headers-4.11.0anilnair-07650-ga1be8ed-dirty]
Error 1
make[1]: Leaving directory '/home/anilnair/linux-kernel/linux-stable'
debian/ruleset/local.mk:102: recipe for target 'kernel_headers' failed
make: *** [kernel_headers] Error 2"

Though I could find the same file in linux kernel v4.8.6.

-- 
--
Regards,
Anil Nair


Re: [PATCH] ceph: Check that the new inode size is within limits in ceph_fallocate()

2017-05-07 Thread Yan, Zheng
On Sat, May 6, 2017 at 1:28 AM, Luis Henriques  wrote:
> Currently the ceph client doesn't respect the rlimit in fallocate.  This
> means that a user can allocate a file with size > RLIMIT_FSIZE.  This
> patch adds the call to inode_newsize_ok() to verify filesystem limits and
> ulimits.  This should make ceph successfully run xfstest generic/228.
>
> Signed-off-by: Luis Henriques 
> ---
>  fs/ceph/file.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index 26cc95421cca..bc5809d4d2d4 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -1636,8 +1636,12 @@ static long ceph_fallocate(struct file *file, int mode,
> }
>
> size = i_size_read(inode);
> -   if (!(mode & FALLOC_FL_KEEP_SIZE))
> +   if (!(mode & FALLOC_FL_KEEP_SIZE)) {
> endoff = offset + length;
> +   ret = inode_newsize_ok(inode, endoff);
> +   if (ret)
> +   goto unlock;
> +   }
>
> if (fi->fmode & CEPH_FILE_MODE_LAZY)
> want = CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO;
> --

Applied, thanks

Yan, Zheng

> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ceph: Check that the new inode size is within limits in ceph_fallocate()

2017-05-07 Thread Yan, Zheng
On Sat, May 6, 2017 at 1:28 AM, Luis Henriques  wrote:
> Currently the ceph client doesn't respect the rlimit in fallocate.  This
> means that a user can allocate a file with size > RLIMIT_FSIZE.  This
> patch adds the call to inode_newsize_ok() to verify filesystem limits and
> ulimits.  This should make ceph successfully run xfstest generic/228.
>
> Signed-off-by: Luis Henriques 
> ---
>  fs/ceph/file.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index 26cc95421cca..bc5809d4d2d4 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -1636,8 +1636,12 @@ static long ceph_fallocate(struct file *file, int mode,
> }
>
> size = i_size_read(inode);
> -   if (!(mode & FALLOC_FL_KEEP_SIZE))
> +   if (!(mode & FALLOC_FL_KEEP_SIZE)) {
> endoff = offset + length;
> +   ret = inode_newsize_ok(inode, endoff);
> +   if (ret)
> +   goto unlock;
> +   }
>
> if (fi->fmode & CEPH_FILE_MODE_LAZY)
> want = CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO;
> --

Applied, thanks

Yan, Zheng

> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] lightnvm: remove unused rq parameter of nvme_nvm_rqtocmd() to kill warning

2017-05-07 Thread Jens Axboe
On 05/07/2017 08:14 AM, Geert Uytterhoeven wrote:
> With gcc 4.1.2:

Wow, that's ancient.

> drivers/nvme/host/lightnvm.c: In function ‘nvme_nvm_submit_io’:
> drivers/nvme/host/lightnvm.c:498: warning: ‘rq’ is used uninitialized in 
> this function
> 
> Indeed, since commit 2e13f33a2464fc3a ("lightnvm: create cmd before
> allocating request"), the request is passed to nvme_nvm_rqtocmd() before
> it is allocated.
> 
> Fortunately, as of commit 91276162de9476b8 ("lightnvm: refactor end_io
> functions for sync"), nvme_nvm_rqtocmd () no longer uses the passed
> request, so this warning is a false positive.
> 
> Drop the unused parameter to clean up the code and kill the warning.

Thanks, applied.

-- 
Jens Axboe



Re: [PATCH] lightnvm: remove unused rq parameter of nvme_nvm_rqtocmd() to kill warning

2017-05-07 Thread Jens Axboe
On 05/07/2017 08:14 AM, Geert Uytterhoeven wrote:
> With gcc 4.1.2:

Wow, that's ancient.

> drivers/nvme/host/lightnvm.c: In function ‘nvme_nvm_submit_io’:
> drivers/nvme/host/lightnvm.c:498: warning: ‘rq’ is used uninitialized in 
> this function
> 
> Indeed, since commit 2e13f33a2464fc3a ("lightnvm: create cmd before
> allocating request"), the request is passed to nvme_nvm_rqtocmd() before
> it is allocated.
> 
> Fortunately, as of commit 91276162de9476b8 ("lightnvm: refactor end_io
> functions for sync"), nvme_nvm_rqtocmd () no longer uses the passed
> request, so this warning is a false positive.
> 
> Drop the unused parameter to clean up the code and kill the warning.

Thanks, applied.

-- 
Jens Axboe



Re: [PATCH] drm/i915: Make vblank evade warnings optional

2017-05-07 Thread Jens Axboe
On 05/07/2017 11:56 AM, Daniel Vetter wrote:
> On Sun, May 7, 2017 at 7:46 PM, Jens Axboe  wrote:
>> On 05/07/2017 11:12 AM, ville.syrj...@linux.intel.com wrote:
>>> From: Ville Syrjälä 
>>>
>>> Add a new Kconfig option to enable/disable the extra warnings
>>> from the vblank evade code. For now we'll keep the warning
>>> about an actually missed vblank always enabled as that can have
>>> an actual user visible impact. But if we miss the deadline
>>> othrwise there's no real need to bother the user with that.
>>> We'll want these warnings enabled during development however
>>> so that we can catch regressions.
>>>
>>> Based on the reports it looks like this is still very easy
>>> to hit on SKL, so we have more work ahead of us to optimize
>>> the crtiical section further.
>>
>> Shouldn't it just be a debug printk or something instead, so that normal
>> people don't see it, but the folks that turn on debugging can get the
>> info they need? Seems silly to add a kconfig option for this.
> 
> I guess we could keep it as debug for users, but we want to make this
> a hard failure on our CI machines. Kconfig knob is the easiest to roll
> out to all machines.

Wouldn't a module parameter be more useful then, as an opt-in
to catch these violations.

Nobody is going to know wtf to set this kconfig option to.

-- 
Jens Axboe



Re: [PATCH] drm/i915: Make vblank evade warnings optional

2017-05-07 Thread Jens Axboe
On 05/07/2017 11:56 AM, Daniel Vetter wrote:
> On Sun, May 7, 2017 at 7:46 PM, Jens Axboe  wrote:
>> On 05/07/2017 11:12 AM, ville.syrj...@linux.intel.com wrote:
>>> From: Ville Syrjälä 
>>>
>>> Add a new Kconfig option to enable/disable the extra warnings
>>> from the vblank evade code. For now we'll keep the warning
>>> about an actually missed vblank always enabled as that can have
>>> an actual user visible impact. But if we miss the deadline
>>> othrwise there's no real need to bother the user with that.
>>> We'll want these warnings enabled during development however
>>> so that we can catch regressions.
>>>
>>> Based on the reports it looks like this is still very easy
>>> to hit on SKL, so we have more work ahead of us to optimize
>>> the crtiical section further.
>>
>> Shouldn't it just be a debug printk or something instead, so that normal
>> people don't see it, but the folks that turn on debugging can get the
>> info they need? Seems silly to add a kconfig option for this.
> 
> I guess we could keep it as debug for users, but we want to make this
> a hard failure on our CI machines. Kconfig knob is the easiest to roll
> out to all machines.

Wouldn't a module parameter be more useful then, as an opt-in
to catch these violations.

Nobody is going to know wtf to set this kconfig option to.

-- 
Jens Axboe



Re: [PATCH v2] block/mq: fix potential deadlock during cpu hotplug

2017-05-07 Thread Jens Axboe
On 05/07/2017 01:14 AM, Wanpeng Li wrote:
> From: Wanpeng Li 
> 
> This can be triggered by hot-unplug one cpu.
> 
> ==
>  [ INFO: possible circular locking dependency detected ]
>  4.11.0+ #17 Not tainted
>  ---
>  step_after_susp/2640 is trying to acquire lock:
>   (all_q_mutex){+.+...}, at: [] 
> blk_mq_queue_reinit_work+0x18/0x110
> 
>  but task is already holding lock:
>   (cpu_hotplug.lock){+.+.+.}, at: [] 
> cpu_hotplug_begin+0x7f/0xe0
> 
>  which lock already depends on the new lock.
> 
> 
>  the existing dependency chain (in reverse order) is:
> 
>  -> #1 (cpu_hotplug.lock){+.+.+.}:
> lock_acquire+0x11c/0x230
> __mutex_lock+0x92/0x990
> mutex_lock_nested+0x1b/0x20
> get_online_cpus+0x64/0x80
> blk_mq_init_allocated_queue+0x3a0/0x4e0
> blk_mq_init_queue+0x3a/0x60
> loop_add+0xe5/0x280
> loop_init+0x124/0x177
> do_one_initcall+0x53/0x1c0
> kernel_init_freeable+0x1e3/0x27f
> kernel_init+0xe/0x100
> ret_from_fork+0x31/0x40
> 
>  -> #0 (all_q_mutex){+.+...}:
> __lock_acquire+0x189a/0x18a0
> lock_acquire+0x11c/0x230
> __mutex_lock+0x92/0x990
> mutex_lock_nested+0x1b/0x20
> blk_mq_queue_reinit_work+0x18/0x110
> blk_mq_queue_reinit_dead+0x1c/0x20
> cpuhp_invoke_callback+0x1f2/0x810
> cpuhp_down_callbacks+0x42/0x80
> _cpu_down+0xb2/0xe0
> freeze_secondary_cpus+0xb6/0x390
> suspend_devices_and_enter+0x3b3/0xa40
> pm_suspend+0x129/0x490
> state_store+0x82/0xf0
> kobj_attr_store+0xf/0x20
> sysfs_kf_write+0x45/0x60
> kernfs_fop_write+0x135/0x1c0
> __vfs_write+0x37/0x160
> vfs_write+0xcd/0x1d0
> SyS_write+0x58/0xc0
> do_syscall_64+0x8f/0x710
> return_from_SYSCALL_64+0x0/0x7a
> 
>  other info that might help us debug this:
> 
>   Possible unsafe locking scenario:
> 
> CPU0CPU1
> 
>lock(cpu_hotplug.lock);
> lock(all_q_mutex);
> lock(cpu_hotplug.lock);
>lock(all_q_mutex);
> 
>   *** DEADLOCK ***
> 
>  8 locks held by step_after_susp/2640:
>   #0:  (sb_writers#6){.+.+.+}, at: [] vfs_write+0x1ad/0x1d0
>   #1:  (>mutex){+.+.+.}, at: [] 
> kernfs_fop_write+0x101/0x1c0
>   #2:  (s_active#166){.+.+.+}, at: [] 
> kernfs_fop_write+0x109/0x1c0
>   #3:  (pm_mutex){+.+...}, at: [] pm_suspend+0x21d/0x490
>   #4:  (acpi_scan_lock){+.+.+.}, at: [] 
> acpi_scan_lock_acquire+0x17/0x20
>   #5:  (cpu_add_remove_lock){+.+.+.}, at: [] 
> freeze_secondary_cpus+0x27/0x390
>   #6:  (cpu_hotplug.dep_map){++}, at: [] 
> cpu_hotplug_begin+0x5/0xe0
>   #7:  (cpu_hotplug.lock){+.+.+.}, at: [] 
> cpu_hotplug_begin+0x7f/0xe0
> 
>  stack backtrace:
>  CPU: 3 PID: 2640 Comm: step_after_susp Not tainted 4.11.0+ #17
>  Hardware name: Dell Inc. OptiPlex 7040/0JCTF8, BIOS 1.4.9 09/12/2016
>  Call Trace:
>   dump_stack+0x99/0xce
>   print_circular_bug+0x1fa/0x270
>   __lock_acquire+0x189a/0x18a0
>   lock_acquire+0x11c/0x230
>   ? lock_acquire+0x11c/0x230
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   __mutex_lock+0x92/0x990
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   ? kmem_cache_free+0x2cb/0x330
>   ? anon_transport_class_unregister+0x20/0x20
>   ? blk_mq_queue_reinit_work+0x110/0x110
>   mutex_lock_nested+0x1b/0x20
>   ? mutex_lock_nested+0x1b/0x20
>   blk_mq_queue_reinit_work+0x18/0x110
>   blk_mq_queue_reinit_dead+0x1c/0x20
>   cpuhp_invoke_callback+0x1f2/0x810
>   ? __flow_cache_shrink+0x160/0x160
>   cpuhp_down_callbacks+0x42/0x80
>   _cpu_down+0xb2/0xe0
>   freeze_secondary_cpus+0xb6/0x390
>   suspend_devices_and_enter+0x3b3/0xa40
>   ? rcu_read_lock_sched_held+0x79/0x80
>   pm_suspend+0x129/0x490
>   state_store+0x82/0xf0
>   kobj_attr_store+0xf/0x20
>   sysfs_kf_write+0x45/0x60
>   kernfs_fop_write+0x135/0x1c0
>   __vfs_write+0x37/0x160
>   ? rcu_read_lock_sched_held+0x79/0x80
>   ? rcu_sync_lockdep_assert+0x2f/0x60
>   ? __sb_start_write+0xd9/0x1c0
>   ? vfs_write+0x1ad/0x1d0
>   vfs_write+0xcd/0x1d0
>   SyS_write+0x58/0xc0
>   ? rcu_read_lock_sched_held+0x79/0x80
>   do_syscall_64+0x8f/0x710
>   ? trace_hardirqs_on_thunk+0x1a/0x1c
>   entry_SYSCALL64_slow_path+0x25/0x25
> 
> The cpu hotplug path will hold cpu_hotplug.lock and then reinit all exiting 
> queues for blk mq w/ all_q_mutex, however, blk_mq_init_allocated_queue() will 
> contend these two locks in the inversion order. This is due to commit 
> eabe06595d62
> (blk/mq: Cure cpu hotplug lock inversion), it fixes a cpu hotplug lock 
> inversion 
> issue because of hotplug rework, however the hotplug rework is still 
> work-in-progress 
> and lives in a -tip branch and mainline cannot yet trigger that splat. The 
> 

Re: [PATCH v2] block/mq: fix potential deadlock during cpu hotplug

2017-05-07 Thread Jens Axboe
On 05/07/2017 01:14 AM, Wanpeng Li wrote:
> From: Wanpeng Li 
> 
> This can be triggered by hot-unplug one cpu.
> 
> ==
>  [ INFO: possible circular locking dependency detected ]
>  4.11.0+ #17 Not tainted
>  ---
>  step_after_susp/2640 is trying to acquire lock:
>   (all_q_mutex){+.+...}, at: [] 
> blk_mq_queue_reinit_work+0x18/0x110
> 
>  but task is already holding lock:
>   (cpu_hotplug.lock){+.+.+.}, at: [] 
> cpu_hotplug_begin+0x7f/0xe0
> 
>  which lock already depends on the new lock.
> 
> 
>  the existing dependency chain (in reverse order) is:
> 
>  -> #1 (cpu_hotplug.lock){+.+.+.}:
> lock_acquire+0x11c/0x230
> __mutex_lock+0x92/0x990
> mutex_lock_nested+0x1b/0x20
> get_online_cpus+0x64/0x80
> blk_mq_init_allocated_queue+0x3a0/0x4e0
> blk_mq_init_queue+0x3a/0x60
> loop_add+0xe5/0x280
> loop_init+0x124/0x177
> do_one_initcall+0x53/0x1c0
> kernel_init_freeable+0x1e3/0x27f
> kernel_init+0xe/0x100
> ret_from_fork+0x31/0x40
> 
>  -> #0 (all_q_mutex){+.+...}:
> __lock_acquire+0x189a/0x18a0
> lock_acquire+0x11c/0x230
> __mutex_lock+0x92/0x990
> mutex_lock_nested+0x1b/0x20
> blk_mq_queue_reinit_work+0x18/0x110
> blk_mq_queue_reinit_dead+0x1c/0x20
> cpuhp_invoke_callback+0x1f2/0x810
> cpuhp_down_callbacks+0x42/0x80
> _cpu_down+0xb2/0xe0
> freeze_secondary_cpus+0xb6/0x390
> suspend_devices_and_enter+0x3b3/0xa40
> pm_suspend+0x129/0x490
> state_store+0x82/0xf0
> kobj_attr_store+0xf/0x20
> sysfs_kf_write+0x45/0x60
> kernfs_fop_write+0x135/0x1c0
> __vfs_write+0x37/0x160
> vfs_write+0xcd/0x1d0
> SyS_write+0x58/0xc0
> do_syscall_64+0x8f/0x710
> return_from_SYSCALL_64+0x0/0x7a
> 
>  other info that might help us debug this:
> 
>   Possible unsafe locking scenario:
> 
> CPU0CPU1
> 
>lock(cpu_hotplug.lock);
> lock(all_q_mutex);
> lock(cpu_hotplug.lock);
>lock(all_q_mutex);
> 
>   *** DEADLOCK ***
> 
>  8 locks held by step_after_susp/2640:
>   #0:  (sb_writers#6){.+.+.+}, at: [] vfs_write+0x1ad/0x1d0
>   #1:  (>mutex){+.+.+.}, at: [] 
> kernfs_fop_write+0x101/0x1c0
>   #2:  (s_active#166){.+.+.+}, at: [] 
> kernfs_fop_write+0x109/0x1c0
>   #3:  (pm_mutex){+.+...}, at: [] pm_suspend+0x21d/0x490
>   #4:  (acpi_scan_lock){+.+.+.}, at: [] 
> acpi_scan_lock_acquire+0x17/0x20
>   #5:  (cpu_add_remove_lock){+.+.+.}, at: [] 
> freeze_secondary_cpus+0x27/0x390
>   #6:  (cpu_hotplug.dep_map){++}, at: [] 
> cpu_hotplug_begin+0x5/0xe0
>   #7:  (cpu_hotplug.lock){+.+.+.}, at: [] 
> cpu_hotplug_begin+0x7f/0xe0
> 
>  stack backtrace:
>  CPU: 3 PID: 2640 Comm: step_after_susp Not tainted 4.11.0+ #17
>  Hardware name: Dell Inc. OptiPlex 7040/0JCTF8, BIOS 1.4.9 09/12/2016
>  Call Trace:
>   dump_stack+0x99/0xce
>   print_circular_bug+0x1fa/0x270
>   __lock_acquire+0x189a/0x18a0
>   lock_acquire+0x11c/0x230
>   ? lock_acquire+0x11c/0x230
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   __mutex_lock+0x92/0x990
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   ? kmem_cache_free+0x2cb/0x330
>   ? anon_transport_class_unregister+0x20/0x20
>   ? blk_mq_queue_reinit_work+0x110/0x110
>   mutex_lock_nested+0x1b/0x20
>   ? mutex_lock_nested+0x1b/0x20
>   blk_mq_queue_reinit_work+0x18/0x110
>   blk_mq_queue_reinit_dead+0x1c/0x20
>   cpuhp_invoke_callback+0x1f2/0x810
>   ? __flow_cache_shrink+0x160/0x160
>   cpuhp_down_callbacks+0x42/0x80
>   _cpu_down+0xb2/0xe0
>   freeze_secondary_cpus+0xb6/0x390
>   suspend_devices_and_enter+0x3b3/0xa40
>   ? rcu_read_lock_sched_held+0x79/0x80
>   pm_suspend+0x129/0x490
>   state_store+0x82/0xf0
>   kobj_attr_store+0xf/0x20
>   sysfs_kf_write+0x45/0x60
>   kernfs_fop_write+0x135/0x1c0
>   __vfs_write+0x37/0x160
>   ? rcu_read_lock_sched_held+0x79/0x80
>   ? rcu_sync_lockdep_assert+0x2f/0x60
>   ? __sb_start_write+0xd9/0x1c0
>   ? vfs_write+0x1ad/0x1d0
>   vfs_write+0xcd/0x1d0
>   SyS_write+0x58/0xc0
>   ? rcu_read_lock_sched_held+0x79/0x80
>   do_syscall_64+0x8f/0x710
>   ? trace_hardirqs_on_thunk+0x1a/0x1c
>   entry_SYSCALL64_slow_path+0x25/0x25
> 
> The cpu hotplug path will hold cpu_hotplug.lock and then reinit all exiting 
> queues for blk mq w/ all_q_mutex, however, blk_mq_init_allocated_queue() will 
> contend these two locks in the inversion order. This is due to commit 
> eabe06595d62
> (blk/mq: Cure cpu hotplug lock inversion), it fixes a cpu hotplug lock 
> inversion 
> issue because of hotplug rework, however the hotplug rework is still 
> work-in-progress 
> and lives in a -tip branch and mainline cannot yet trigger that splat. The 
> commit 
> breaks the linus's 

Re: [PATCH v2] x86/efi: Correct ident mapping of efi old_map when kalsr enabled

2017-05-07 Thread Baoquan He
Thanks for explaining, Bhupesh. 

BIOS issue of SGI uv1 is still not fixed. There's a quirk for uv1 to
use efi old map:

void __init efi_apply_memmap_quirks(void)
{
...
...
/* UV2+ BIOS has a fix for this issue.  UV1 still needs the quirk. */
if (dmi_check_system(sgi_uv1_dmi))
set_bit(EFI_OLD_MEMMAP, );
}

And because of some reasons, redhat also need efi old_map now.

Hi Matt,

This v2 patch works on my kvm guest, however there's still problem on
SGI system. I will post v3 later after it's handled. So nack this v2
patch.

Thanks
Baoquan

On 05/08/17 at 12:07am, Bhupesh Sharma wrote:
> On Sat, May 6, 2017 at 5:06 AM, Borislav Petkov  wrote:
> > On Fri, May 05, 2017 at 09:42:14PM +0100, Matt Fleming wrote:
> >> (Including the folks from SGI since this was hit on a UV system)
> >
> > Wasn't there a BIOS fix supplied at some point which obviated the need
> > to boot with efi=old_map on SGI boxes?
> >
> 
> AFAICR, the bios fixes were provided only for SGI boxes with BIOS
> version greater than or equal to UV2 (so upstream with recent bios
> works on UV2, UV3, and UV4 hardware platforms, both with old and new
> mapping, with new mapping being the default), however the UV1
> platforms still use efi=old_map
> 
> Also as mentioned above since commit
> caef78b6cdeddf4ad364f95910bba6b43b8eb9bf fixed the efi=old_map support
> on UV systems even with new bios, they should ideally all boot up
> properly in upstream both with 'nokaslr' and without 'nokaslr' in the
> bootargs when efi=old_map is used.
> 
> Regards,
> Bhupesh


Re: [PATCH v2] x86/efi: Correct ident mapping of efi old_map when kalsr enabled

2017-05-07 Thread Baoquan He
Thanks for explaining, Bhupesh. 

BIOS issue of SGI uv1 is still not fixed. There's a quirk for uv1 to
use efi old map:

void __init efi_apply_memmap_quirks(void)
{
...
...
/* UV2+ BIOS has a fix for this issue.  UV1 still needs the quirk. */
if (dmi_check_system(sgi_uv1_dmi))
set_bit(EFI_OLD_MEMMAP, );
}

And because of some reasons, redhat also need efi old_map now.

Hi Matt,

This v2 patch works on my kvm guest, however there's still problem on
SGI system. I will post v3 later after it's handled. So nack this v2
patch.

Thanks
Baoquan

On 05/08/17 at 12:07am, Bhupesh Sharma wrote:
> On Sat, May 6, 2017 at 5:06 AM, Borislav Petkov  wrote:
> > On Fri, May 05, 2017 at 09:42:14PM +0100, Matt Fleming wrote:
> >> (Including the folks from SGI since this was hit on a UV system)
> >
> > Wasn't there a BIOS fix supplied at some point which obviated the need
> > to boot with efi=old_map on SGI boxes?
> >
> 
> AFAICR, the bios fixes were provided only for SGI boxes with BIOS
> version greater than or equal to UV2 (so upstream with recent bios
> works on UV2, UV3, and UV4 hardware platforms, both with old and new
> mapping, with new mapping being the default), however the UV1
> platforms still use efi=old_map
> 
> Also as mentioned above since commit
> caef78b6cdeddf4ad364f95910bba6b43b8eb9bf fixed the efi=old_map support
> on UV systems even with new bios, they should ideally all boot up
> properly in upstream both with 'nokaslr' and without 'nokaslr' in the
> bootargs when efi=old_map is used.
> 
> Regards,
> Bhupesh


[PATCH 2/2] apparmorfs: Use seq_putc() in two functions

2017-05-07 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sun, 7 May 2017 13:50:28 +0200

Two single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 security/apparmor/apparmorfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index b4d83e0bc651..41e427a4f051 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -494,7 +494,7 @@ static int aa_fs_seq_hash_show(struct seq_file *seq, void 
*v)
if (profile->hash) {
for (i = 0; i < size; i++)
seq_printf(seq, "%.2x", profile->hash[i]);
-   seq_puts(seq, "\n");
+   seq_putc(seq, '\n');
}
aa_put_profile(profile);
 
@@ -602,7 +602,7 @@ static int aa_fs_seq_raw_hash_show(struct seq_file *seq, 
void *v)
if (profile->rawdata->hash) {
for (i = 0; i < size; i++)
seq_printf(seq, "%.2x", profile->rawdata->hash[i]);
-   seq_puts(seq, "\n");
+   seq_putc(seq, '\n');
}
aa_put_profile(profile);
 
-- 
2.12.2



[PATCH 1/2] apparmorfs: Combine two function calls into one in aa_fs_seq_raw_abi_show()

2017-05-07 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sun, 7 May 2017 13:43:50 +0200

A bit of data was put into a sequence by two separate function calls.
Print the same data by a single function call instead.

Signed-off-by: Markus Elfring 
---
 security/apparmor/apparmorfs.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index 4f6ac9dbc65d..b4d83e0bc651 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -572,10 +572,9 @@ static int aa_fs_seq_raw_abi_show(struct seq_file *seq, 
void *v)
struct aa_proxy *proxy = seq->private;
struct aa_profile *profile = aa_get_profile_rcu(>profile);
 
-   if (profile->rawdata->abi) {
-   seq_printf(seq, "v%d", profile->rawdata->abi);
-   seq_puts(seq, "\n");
-   }
+   if (profile->rawdata->abi)
+   seq_printf(seq, "v%d\n", profile->rawdata->abi);
+
aa_put_profile(profile);
 
return 0;
-- 
2.12.2



[PATCH 2/2] apparmorfs: Use seq_putc() in two functions

2017-05-07 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sun, 7 May 2017 13:50:28 +0200

Two single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 security/apparmor/apparmorfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index b4d83e0bc651..41e427a4f051 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -494,7 +494,7 @@ static int aa_fs_seq_hash_show(struct seq_file *seq, void 
*v)
if (profile->hash) {
for (i = 0; i < size; i++)
seq_printf(seq, "%.2x", profile->hash[i]);
-   seq_puts(seq, "\n");
+   seq_putc(seq, '\n');
}
aa_put_profile(profile);
 
@@ -602,7 +602,7 @@ static int aa_fs_seq_raw_hash_show(struct seq_file *seq, 
void *v)
if (profile->rawdata->hash) {
for (i = 0; i < size; i++)
seq_printf(seq, "%.2x", profile->rawdata->hash[i]);
-   seq_puts(seq, "\n");
+   seq_putc(seq, '\n');
}
aa_put_profile(profile);
 
-- 
2.12.2



[PATCH 1/2] apparmorfs: Combine two function calls into one in aa_fs_seq_raw_abi_show()

2017-05-07 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sun, 7 May 2017 13:43:50 +0200

A bit of data was put into a sequence by two separate function calls.
Print the same data by a single function call instead.

Signed-off-by: Markus Elfring 
---
 security/apparmor/apparmorfs.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index 4f6ac9dbc65d..b4d83e0bc651 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -572,10 +572,9 @@ static int aa_fs_seq_raw_abi_show(struct seq_file *seq, 
void *v)
struct aa_proxy *proxy = seq->private;
struct aa_profile *profile = aa_get_profile_rcu(>profile);
 
-   if (profile->rawdata->abi) {
-   seq_printf(seq, "v%d", profile->rawdata->abi);
-   seq_puts(seq, "\n");
-   }
+   if (profile->rawdata->abi)
+   seq_printf(seq, "v%d\n", profile->rawdata->abi);
+
aa_put_profile(profile);
 
return 0;
-- 
2.12.2



[PATCH 0/2] apparmorfs: Fine-tuning for three function implementations

2017-05-07 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sun, 7 May 2017 13:58:08 +0200

Two update suggestions were taken into account
from static source code analysis.

Markus Elfring (2):
  Combine two function calls into one in aa_fs_seq_raw_abi_show()
  Use seq_putc() in two functions

 security/apparmor/apparmorfs.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

-- 
2.12.2



[PATCH 0/2] apparmorfs: Fine-tuning for three function implementations

2017-05-07 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sun, 7 May 2017 13:58:08 +0200

Two update suggestions were taken into account
from static source code analysis.

Markus Elfring (2):
  Combine two function calls into one in aa_fs_seq_raw_abi_show()
  Use seq_putc() in two functions

 security/apparmor/apparmorfs.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

-- 
2.12.2



[lkp-robot] [generic_file_read_iter()] 5ecda13711: BUG:KASAN:stack-out-of-bounds

2017-05-07 Thread kernel test robot

FYI, we noticed the following commit:

commit: 5ecda13711b3bd4a750b5740897bf13d1720de7c ("generic_file_read_iter(): 
make use of iov_iter_revert()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: ocfs2test
with following parameters:

disk: 1HDD
test: test-backup_super



on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4G

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


+--+++
|  | 639a93a521 
| 5ecda13711 |
+--+++
| boot_successes   | 4  
| 0  |
| boot_failures| 4  
| 8  |
| invoked_oom-killer:gfp_mask=0x   | 4  
| 4  |
| Mem-Info | 4  
| 4  |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 4  
| 4  |
| BUG:KASAN:stack-out-of-bounds| 0  
| 4  |
+--+++



[  175.170846] BUG: KASAN: stack-out-of-bounds in iov_iter_revert+0x329/0x38b 
at addr 880078647c78
[  175.170846] BUG: KASAN: stack-out-of-bounds in iov_iter_revert+0x329/0x38b 
at addr 880078647c78
[  175.174119] Read of size 8 by task mkfs.ocfs2/9842
[  175.174119] Read of size 8 by task mkfs.ocfs2/9842
[  175.175859] page:ea0001e191c0 count:0 mapcount:0 mapping:  
(null) index:0x1
[  175.175859] page:ea0001e191c0 count:0 mapcount:0 mapping:  
(null) index:0x1
[  175.179119] flags: 0x4000()
[  175.179119] flags: 0x4000()
[  175.180524] raw: 4000  0001 

[  175.180524] raw: 4000  0001 

[  175.183572] raw:  dead0200  

[  175.183572] raw:  dead0200  

[  175.186246] page dumped because: kasan: bad access detected
[  175.186246] page dumped because: kasan: bad access detected
[  175.188352] CPU: 0 PID: 9842 Comm: mkfs.ocfs2 Not tainted 
4.11.0-rc7-00010-g5ecda13 #2
[  175.188352] CPU: 0 PID: 9842 Comm: mkfs.ocfs2 Not tainted 
4.11.0-rc7-00010-g5ecda13 #2
[  175.191815] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.9.3-20161025_171302-gandalf 04/01/2014
[  175.191815] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.9.3-20161025_171302-gandalf 04/01/2014
[  175.195549] Call Trace:
[  175.195549] Call Trace:
[  175.196508]  show_stack+0x6b/0x6e
[  175.196508]  show_stack+0x6b/0x6e
[  175.198026]  dump_stack+0x19/0x1b
[  175.198026]  dump_stack+0x19/0x1b
[  175.199362]  kasan_report+0x49b/0x5ba
[  175.199362]  kasan_report+0x49b/0x5ba
[  175.200687]  ? iov_iter_revert+0x329/0x38b
[  175.200687]  ? iov_iter_revert+0x329/0x38b
[  175.202208]  ? ftrace_likely_update+0x245/0x267
[  175.202208]  ? ftrace_likely_update+0x245/0x267
[  175.203797]  __asan_load8+0x64/0x66
[  175.203797]  __asan_load8+0x64/0x66
[  175.205257]  iov_iter_revert+0x329/0x38b
[  175.205257]  iov_iter_revert+0x329/0x38b
[  175.206703]  generic_file_read_iter+0xe8b/0xeab
[  175.206703]  generic_file_read_iter+0xe8b/0xeab
[  175.208287]  ? iov_iter_init+0xc0/0xd5
[  175.208287]  ? iov_iter_init+0xc0/0xd5
[  175.209620]  ? import_single_range+0x23e/0x272
[  175.209620]  ? import_single_range+0x23e/0x272
[  175.211225]  blkdev_read_iter+0xd8/0xe3
[  175.211225]  blkdev_read_iter+0xd8/0xe3
[  175.212754]  aio_read+0x251/0x2b2
[  175.212754]  aio_read+0x251/0x2b2
[  175.214095]  ? inc_slabs_node+0x38/0x56
[  175.214095]  ? inc_slabs_node+0x38/0x56
[  175.215420]  ? aio_ret+0x40/0x40
[  175.215420]  ? aio_ret+0x40/0x40
[  175.216629]  ? ftrace_likely_update+0x245/0x267
[  175.216629]  ? ftrace_likely_update+0x245/0x267
[  175.218348]  ? ftrace_likely_update+0x245/0x267
[  175.218348]  ? ftrace_likely_update+0x245/0x267
[  175.219937]  ? __asan_loadN+0xf/0x11
[  175.219937]  ? __asan_loadN+0xf/0x11
[  175.221193]  ? ___might_sleep+0x9a/0x233
[  175.221193]  ? ___might_sleep+0x9a/0x233
[  175.222755]  ? __might_sleep+0x16a/0x179
[  175.222755]  ? __might_sleep+0x16a/0x179
[  175.224220]  ? ftrace_likely_update+0x245/0x267
[  175.224220]  ? ftrace_likely_update+0x245/0x267
[  175.225714]  do_io_submit+0xb79/0xcec
[  175.225714]  do_io_submit+0xb79/0xcec
[  175.227109]  ? do_io_submit+0xb79/0xcec
[  175.227109]  ? do_io_submit+0xb79/0xcec
[  175.228580]  ? aio_write+0x383/0x383
[  175.228580]  ? aio_write+0x383/0x383
[  

[lkp-robot] [generic_file_read_iter()] 5ecda13711: BUG:KASAN:stack-out-of-bounds

2017-05-07 Thread kernel test robot

FYI, we noticed the following commit:

commit: 5ecda13711b3bd4a750b5740897bf13d1720de7c ("generic_file_read_iter(): 
make use of iov_iter_revert()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: ocfs2test
with following parameters:

disk: 1HDD
test: test-backup_super



on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4G

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


+--+++
|  | 639a93a521 
| 5ecda13711 |
+--+++
| boot_successes   | 4  
| 0  |
| boot_failures| 4  
| 8  |
| invoked_oom-killer:gfp_mask=0x   | 4  
| 4  |
| Mem-Info | 4  
| 4  |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 4  
| 4  |
| BUG:KASAN:stack-out-of-bounds| 0  
| 4  |
+--+++



[  175.170846] BUG: KASAN: stack-out-of-bounds in iov_iter_revert+0x329/0x38b 
at addr 880078647c78
[  175.170846] BUG: KASAN: stack-out-of-bounds in iov_iter_revert+0x329/0x38b 
at addr 880078647c78
[  175.174119] Read of size 8 by task mkfs.ocfs2/9842
[  175.174119] Read of size 8 by task mkfs.ocfs2/9842
[  175.175859] page:ea0001e191c0 count:0 mapcount:0 mapping:  
(null) index:0x1
[  175.175859] page:ea0001e191c0 count:0 mapcount:0 mapping:  
(null) index:0x1
[  175.179119] flags: 0x4000()
[  175.179119] flags: 0x4000()
[  175.180524] raw: 4000  0001 

[  175.180524] raw: 4000  0001 

[  175.183572] raw:  dead0200  

[  175.183572] raw:  dead0200  

[  175.186246] page dumped because: kasan: bad access detected
[  175.186246] page dumped because: kasan: bad access detected
[  175.188352] CPU: 0 PID: 9842 Comm: mkfs.ocfs2 Not tainted 
4.11.0-rc7-00010-g5ecda13 #2
[  175.188352] CPU: 0 PID: 9842 Comm: mkfs.ocfs2 Not tainted 
4.11.0-rc7-00010-g5ecda13 #2
[  175.191815] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.9.3-20161025_171302-gandalf 04/01/2014
[  175.191815] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.9.3-20161025_171302-gandalf 04/01/2014
[  175.195549] Call Trace:
[  175.195549] Call Trace:
[  175.196508]  show_stack+0x6b/0x6e
[  175.196508]  show_stack+0x6b/0x6e
[  175.198026]  dump_stack+0x19/0x1b
[  175.198026]  dump_stack+0x19/0x1b
[  175.199362]  kasan_report+0x49b/0x5ba
[  175.199362]  kasan_report+0x49b/0x5ba
[  175.200687]  ? iov_iter_revert+0x329/0x38b
[  175.200687]  ? iov_iter_revert+0x329/0x38b
[  175.202208]  ? ftrace_likely_update+0x245/0x267
[  175.202208]  ? ftrace_likely_update+0x245/0x267
[  175.203797]  __asan_load8+0x64/0x66
[  175.203797]  __asan_load8+0x64/0x66
[  175.205257]  iov_iter_revert+0x329/0x38b
[  175.205257]  iov_iter_revert+0x329/0x38b
[  175.206703]  generic_file_read_iter+0xe8b/0xeab
[  175.206703]  generic_file_read_iter+0xe8b/0xeab
[  175.208287]  ? iov_iter_init+0xc0/0xd5
[  175.208287]  ? iov_iter_init+0xc0/0xd5
[  175.209620]  ? import_single_range+0x23e/0x272
[  175.209620]  ? import_single_range+0x23e/0x272
[  175.211225]  blkdev_read_iter+0xd8/0xe3
[  175.211225]  blkdev_read_iter+0xd8/0xe3
[  175.212754]  aio_read+0x251/0x2b2
[  175.212754]  aio_read+0x251/0x2b2
[  175.214095]  ? inc_slabs_node+0x38/0x56
[  175.214095]  ? inc_slabs_node+0x38/0x56
[  175.215420]  ? aio_ret+0x40/0x40
[  175.215420]  ? aio_ret+0x40/0x40
[  175.216629]  ? ftrace_likely_update+0x245/0x267
[  175.216629]  ? ftrace_likely_update+0x245/0x267
[  175.218348]  ? ftrace_likely_update+0x245/0x267
[  175.218348]  ? ftrace_likely_update+0x245/0x267
[  175.219937]  ? __asan_loadN+0xf/0x11
[  175.219937]  ? __asan_loadN+0xf/0x11
[  175.221193]  ? ___might_sleep+0x9a/0x233
[  175.221193]  ? ___might_sleep+0x9a/0x233
[  175.222755]  ? __might_sleep+0x16a/0x179
[  175.222755]  ? __might_sleep+0x16a/0x179
[  175.224220]  ? ftrace_likely_update+0x245/0x267
[  175.224220]  ? ftrace_likely_update+0x245/0x267
[  175.225714]  do_io_submit+0xb79/0xcec
[  175.225714]  do_io_submit+0xb79/0xcec
[  175.227109]  ? do_io_submit+0xb79/0xcec
[  175.227109]  ? do_io_submit+0xb79/0xcec
[  175.228580]  ? aio_write+0x383/0x383
[  175.228580]  ? aio_write+0x383/0x383
[  

RE: [PATCH V3] cpuidle: check dev before usage in cpuidle_use_deepest_state

2017-05-07 Thread Li, Fei
> >> Reviewed-by: Andy Shevchenko 
> >> Reviewed-by: Koul, Vinod 
> 
> > A previous version of this has been applied and I don't see any differences
> > in the code changes.
> 
> AFAIR only tags (above) had been extended in v3.

Thanks.
Yes, for v3 no change in code, and only tags "Reviewed-by" added.

> 
> --
> With Best Regards,
> Andy Shevchenko

Best Regards,
Fei


RE: [PATCH V3] cpuidle: check dev before usage in cpuidle_use_deepest_state

2017-05-07 Thread Li, Fei
> >> Reviewed-by: Andy Shevchenko 
> >> Reviewed-by: Koul, Vinod 
> 
> > A previous version of this has been applied and I don't see any differences
> > in the code changes.
> 
> AFAIR only tags (above) had been extended in v3.

Thanks.
Yes, for v3 no change in code, and only tags "Reviewed-by" added.

> 
> --
> With Best Regards,
> Andy Shevchenko

Best Regards,
Fei


Re: [PATCH v6 2/7] perf/x86/intel: Record branch type

2017-05-07 Thread Jin, Yao



On 4/24/2017 8:47 AM, Jin, Yao wrote:



On 4/23/2017 9:55 PM, Jiri Olsa wrote:

On Thu, Apr 20, 2017 at 08:07:50PM +0800, Jin Yao wrote:

SNIP


  +#define X86_BR_TYPE_MAP_MAX16
+
+static int
+common_branch_type(int type)
+{
+int i, mask;
+const int branch_map[X86_BR_TYPE_MAP_MAX] = {
+PERF_BR_CALL,/* X86_BR_CALL */
+PERF_BR_RET,/* X86_BR_RET */
+PERF_BR_SYSCALL,/* X86_BR_SYSCALL */
+PERF_BR_SYSRET,/* X86_BR_SYSRET */
+PERF_BR_INT,/* X86_BR_INT */
+PERF_BR_IRET,/* X86_BR_IRET */
+PERF_BR_JCC,/* X86_BR_JCC */
+PERF_BR_JMP,/* X86_BR_JMP */
+PERF_BR_IRQ,/* X86_BR_IRQ */
+PERF_BR_IND_CALL,/* X86_BR_IND_CALL */
+PERF_BR_NONE,/* X86_BR_ABORT */
+PERF_BR_NONE,/* X86_BR_IN_TX */
+PERF_BR_NONE,/* X86_BR_NO_TX */
+PERF_BR_CALL,/* X86_BR_ZERO_CALL */
+PERF_BR_NONE,/* X86_BR_CALL_STACK */
+PERF_BR_IND_JMP,/* X86_BR_IND_JMP */
+};
+
+type >>= 2; /* skip X86_BR_USER and X86_BR_KERNEL */
+mask = ~(~0 << 1);

is that a fancy way to get 1 into the mask? what do I miss?


+
+for (i = 0; i < X86_BR_TYPE_MAP_MAX; i++) {
+if (type & mask)
+return branch_map[i];

I wonder some bit search would be faster in here, but maybe not big deal

jirka


I just think the branch_map[] doesn't contain many entries (16 entries 
here), so maybe checking 1 bit one time should be acceptable. I just 
want to keep the code simple.


But if the number of entries is more (e.g. 64), maybe it'd better 
check 2 or 4 bits one time.


Thanks
Jin Yao



Hi,

Is this explanation OK? Since for tools part, it's Acked-by: Jiri Olsa. 
I just want to know  if the kernel part is OK either?


Thanks
Jin Yao



Re: [PATCH v6 2/7] perf/x86/intel: Record branch type

2017-05-07 Thread Jin, Yao



On 4/24/2017 8:47 AM, Jin, Yao wrote:



On 4/23/2017 9:55 PM, Jiri Olsa wrote:

On Thu, Apr 20, 2017 at 08:07:50PM +0800, Jin Yao wrote:

SNIP


  +#define X86_BR_TYPE_MAP_MAX16
+
+static int
+common_branch_type(int type)
+{
+int i, mask;
+const int branch_map[X86_BR_TYPE_MAP_MAX] = {
+PERF_BR_CALL,/* X86_BR_CALL */
+PERF_BR_RET,/* X86_BR_RET */
+PERF_BR_SYSCALL,/* X86_BR_SYSCALL */
+PERF_BR_SYSRET,/* X86_BR_SYSRET */
+PERF_BR_INT,/* X86_BR_INT */
+PERF_BR_IRET,/* X86_BR_IRET */
+PERF_BR_JCC,/* X86_BR_JCC */
+PERF_BR_JMP,/* X86_BR_JMP */
+PERF_BR_IRQ,/* X86_BR_IRQ */
+PERF_BR_IND_CALL,/* X86_BR_IND_CALL */
+PERF_BR_NONE,/* X86_BR_ABORT */
+PERF_BR_NONE,/* X86_BR_IN_TX */
+PERF_BR_NONE,/* X86_BR_NO_TX */
+PERF_BR_CALL,/* X86_BR_ZERO_CALL */
+PERF_BR_NONE,/* X86_BR_CALL_STACK */
+PERF_BR_IND_JMP,/* X86_BR_IND_JMP */
+};
+
+type >>= 2; /* skip X86_BR_USER and X86_BR_KERNEL */
+mask = ~(~0 << 1);

is that a fancy way to get 1 into the mask? what do I miss?


+
+for (i = 0; i < X86_BR_TYPE_MAP_MAX; i++) {
+if (type & mask)
+return branch_map[i];

I wonder some bit search would be faster in here, but maybe not big deal

jirka


I just think the branch_map[] doesn't contain many entries (16 entries 
here), so maybe checking 1 bit one time should be acceptable. I just 
want to keep the code simple.


But if the number of entries is more (e.g. 64), maybe it'd better 
check 2 or 4 bits one time.


Thanks
Jin Yao



Hi,

Is this explanation OK? Since for tools part, it's Acked-by: Jiri Olsa. 
I just want to know  if the kernel part is OK either?


Thanks
Jin Yao



[PATCH] powerpc/mm: Use seq_putc() in two functions

2017-05-07 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sun, 7 May 2017 16:32:04 +0200

Two single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 arch/powerpc/mm/dump_hashpagetable.c   | 2 +-
 arch/powerpc/mm/dump_linuxpagetables.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/dump_hashpagetable.c 
b/arch/powerpc/mm/dump_hashpagetable.c
index c6b900f54c07..31d248c82f95 100644
--- a/arch/powerpc/mm/dump_hashpagetable.c
+++ b/arch/powerpc/mm/dump_hashpagetable.c
@@ -205,7 +205,7 @@ static void dump_hpte_info(struct pg_state *st, unsigned 
long ea, u64 v, u64 r,
aps_index = calculate_pagesize(st, aps, "actual");
if (aps_index != 2)
seq_printf(st->seq, "LP enc: %lx", lp);
-   seq_puts(st->seq, "\n");
+   seq_putc(st->seq, '\n');
 }
 
 
diff --git a/arch/powerpc/mm/dump_linuxpagetables.c 
b/arch/powerpc/mm/dump_linuxpagetables.c
index d659345a98d6..5c08de16339d 100644
--- a/arch/powerpc/mm/dump_linuxpagetables.c
+++ b/arch/powerpc/mm/dump_linuxpagetables.c
@@ -349,7 +349,7 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
  st->current_flags,
  pg_level[st->level].num);
 
-   seq_puts(st->seq, "\n");
+   seq_putc(st->seq, '\n');
}
 
/*
-- 
2.12.2



[PATCH] powerpc/mm: Use seq_putc() in two functions

2017-05-07 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sun, 7 May 2017 16:32:04 +0200

Two single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 arch/powerpc/mm/dump_hashpagetable.c   | 2 +-
 arch/powerpc/mm/dump_linuxpagetables.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/dump_hashpagetable.c 
b/arch/powerpc/mm/dump_hashpagetable.c
index c6b900f54c07..31d248c82f95 100644
--- a/arch/powerpc/mm/dump_hashpagetable.c
+++ b/arch/powerpc/mm/dump_hashpagetable.c
@@ -205,7 +205,7 @@ static void dump_hpte_info(struct pg_state *st, unsigned 
long ea, u64 v, u64 r,
aps_index = calculate_pagesize(st, aps, "actual");
if (aps_index != 2)
seq_printf(st->seq, "LP enc: %lx", lp);
-   seq_puts(st->seq, "\n");
+   seq_putc(st->seq, '\n');
 }
 
 
diff --git a/arch/powerpc/mm/dump_linuxpagetables.c 
b/arch/powerpc/mm/dump_linuxpagetables.c
index d659345a98d6..5c08de16339d 100644
--- a/arch/powerpc/mm/dump_linuxpagetables.c
+++ b/arch/powerpc/mm/dump_linuxpagetables.c
@@ -349,7 +349,7 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
  st->current_flags,
  pg_level[st->level].num);
 
-   seq_puts(st->seq, "\n");
+   seq_putc(st->seq, '\n');
}
 
/*
-- 
2.12.2



Re: [PATCH 1/3] drm: fourcc byteorder: drop DRM_FORMAT_BIG_ENDIAN

2017-05-07 Thread Michel Dänzer
On 03/05/17 06:24 PM, Gerd Hoffmann wrote:
>   Hi,
> 
>>> R600+ supports bigendian framebuffer formats, so no byteswapping on
>>> access is needed.  Not sure whenever that includes 16bpp formats or
>>> whenever this is limited to the 8 bit-per-color formats [...]
>>
>> It includes 16bpp. Looking at
>> drivers/gpu/drm/radeon/atombios_crtc.c:dce4_crtc_do_set_base(), it sets
>> up byte-swapping for all multi-byte formats, so it effectively treats
>> all those formats as if they had DRM_FORMAT_BIG_ENDIAN set.
> 
>> If the radeon (and amdgpu) driver were to be changed to use
>> drm_mode_legacy_fb_format_he for >= R600, that must also handle 16 bpp,
>> which requires DRM_FORMAT_BIG_ENDIAN. So I still don't see how that can
>> be removed or even deprecated.
> 
> Ok.
> 
> Dropped patch #1.
> 
> Updated patch #2 to include all formats returned by
> drm_mode_legacy_fb_format, and also renamed them to DRM_FORMAT_HOST_*.
> 
> Question is how to go forward with patch #3.  I'd prefer to not add
> drm_mode_legacy_fb_format_he if possible.  Is there a chance to adapt
> the radeon and nvidia drivers to a fixed drm_mode_legacy_fb_format
> function (returning be formats on be) without invasive changes?  Given
> they both treat formats as if they had DRM_FORMAT_BIG_ENDIAN set this
> could (with the help of the extended patch #2) be a simple
> s/DRM_FORMAT_/DRM_FORMAT_HOST/ at the right places ...

For radeon this doesn't work with pre-R600 GPUs, which only support
little endian formats for display.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 1/3] drm: fourcc byteorder: drop DRM_FORMAT_BIG_ENDIAN

2017-05-07 Thread Michel Dänzer
On 03/05/17 06:24 PM, Gerd Hoffmann wrote:
>   Hi,
> 
>>> R600+ supports bigendian framebuffer formats, so no byteswapping on
>>> access is needed.  Not sure whenever that includes 16bpp formats or
>>> whenever this is limited to the 8 bit-per-color formats [...]
>>
>> It includes 16bpp. Looking at
>> drivers/gpu/drm/radeon/atombios_crtc.c:dce4_crtc_do_set_base(), it sets
>> up byte-swapping for all multi-byte formats, so it effectively treats
>> all those formats as if they had DRM_FORMAT_BIG_ENDIAN set.
> 
>> If the radeon (and amdgpu) driver were to be changed to use
>> drm_mode_legacy_fb_format_he for >= R600, that must also handle 16 bpp,
>> which requires DRM_FORMAT_BIG_ENDIAN. So I still don't see how that can
>> be removed or even deprecated.
> 
> Ok.
> 
> Dropped patch #1.
> 
> Updated patch #2 to include all formats returned by
> drm_mode_legacy_fb_format, and also renamed them to DRM_FORMAT_HOST_*.
> 
> Question is how to go forward with patch #3.  I'd prefer to not add
> drm_mode_legacy_fb_format_he if possible.  Is there a chance to adapt
> the radeon and nvidia drivers to a fixed drm_mode_legacy_fb_format
> function (returning be formats on be) without invasive changes?  Given
> they both treat formats as if they had DRM_FORMAT_BIG_ENDIAN set this
> could (with the help of the extended patch #2) be a simple
> s/DRM_FORMAT_/DRM_FORMAT_HOST/ at the right places ...

For radeon this doesn't work with pre-R600 GPUs, which only support
little endian formats for display.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[PATCH] [media] pvrusb2: remove redundant check on cnt > 8

2017-05-07 Thread Colin King
From: Colin Ian King 

The 2nd check of cnt > 8 is redundant as cnt is already checked
and thresholded to a maximum of 8 a few statements earlier.
Remove this redundant 2nd check.

Detected by CoverityScan, CID#114281 ("Logically dead code")

Signed-off-by: Colin Ian King 
---
 drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c 
b/drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c
index f727b54a53c6..20a52b785fff 100644
--- a/drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c
+++ b/drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c
@@ -488,7 +488,7 @@ static int pvr2_i2c_xfer(struct i2c_adapter *i2c_adap,
if ((ret > 0) || !(msgs[idx].flags & I2C_M_RD)) {
if (cnt > 8) cnt = 8;
printk(KERN_CONT " [");
-   for (offs = 0; offs < (cnt>8?8:cnt); offs++) {
+   for (offs = 0; offs < cnt; offs++) {
if (offs) printk(KERN_CONT " ");
printk(KERN_CONT 
"%02x",msgs[idx].buf[offs]);
}
-- 
2.11.0



Re: [PATCH 2/2] apparmorfs: Use seq_putc() in two functions

2017-05-07 Thread John Johansen
On 05/07/2017 05:03 AM, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Sun, 7 May 2017 13:50:28 +0200
> 
> Two single characters (line breaks) should be put into a sequence.
> Thus use the corresponding function "seq_putc".
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 

Acked-by: John Johansen 

I'll pull it into my tree for my next push


> ---
>  security/apparmor/apparmorfs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
> index b4d83e0bc651..41e427a4f051 100644
> --- a/security/apparmor/apparmorfs.c
> +++ b/security/apparmor/apparmorfs.c
> @@ -494,7 +494,7 @@ static int aa_fs_seq_hash_show(struct seq_file *seq, void 
> *v)
>   if (profile->hash) {
>   for (i = 0; i < size; i++)
>   seq_printf(seq, "%.2x", profile->hash[i]);
> - seq_puts(seq, "\n");
> + seq_putc(seq, '\n');
>   }
>   aa_put_profile(profile);
>  
> @@ -602,7 +602,7 @@ static int aa_fs_seq_raw_hash_show(struct seq_file *seq, 
> void *v)
>   if (profile->rawdata->hash) {
>   for (i = 0; i < size; i++)
>   seq_printf(seq, "%.2x", profile->rawdata->hash[i]);
> - seq_puts(seq, "\n");
> + seq_putc(seq, '\n');
>   }
>   aa_put_profile(profile);
>  
> 



  1   2   3   4   5   >