Re: MFC: different h264 profile and level output the same size encoded result

2016-08-28 Thread Randy Li



On 08/29/2016 01:49 PM, Andrzej Hajda wrote:
> Hi,
>
> On 08/27/2016 11:55 AM, Randy Li wrote:
>> Hi:
>>
>> It has been reported to me that setting the profile, level and bitrate
>> through the v4l2 extra controls does not make the encoded result
>> different. I tried it recently, and it is true. Although the h264 parser
>> reports that a different h264 profile and level have been applied,
>> the size is the same.
>>
>> You may try this in GStreamer.
>>
>> gst-launch-1.0 -v \
>> videotestsrc num-buffers=500 ! video/x-raw, width=1920,height=1080 ! \
>> videoconvert ! \
>> v4l2video4h264enc extra-controls="controls,h264_profile=1,video_bitrate=100;" ! \
>> h264parse ! matroskamux ! filesink location=/tmp/1.mkv
>>
>> Is there any way to reduce the size of MFC encoded data?
>
> There is a control called rc_enable (rate control enable); it must be set
> to one if you want to control the bitrate.
> This control confuses many users. I guess it cannot be removed as it
> is already part of the UAPI, but enabling it internally in the driver
> if the user sets bitrate, profile, etc., would make it saner.

I see, thank you so much.
Someone told me that "frame_level_rate_control_enable=1" in
extra-controls="encode,h264_level=10,h264_profile=4,frame_level_rate_control_enable=1,video_bitrate=2097152"
would also make it work.
But now I really know there is a switch that needs to be turned on.

> Regards
> Andrzej




--
Randy Li
The third produce department



Re: MFC: different h264 profile and level output the same size encoded result

2016-08-28 Thread Andrzej Hajda
Hi,

On 08/27/2016 11:55 AM, Randy Li wrote:
> Hi:
>
> It has been reported to me that setting the profile, level and bitrate
> through the v4l2 extra controls does not make the encoded result
> different. I tried it recently, and it is true. Although the h264 parser
> reports that a different h264 profile and level have been applied,
> the size is the same.
>
> You may try this in GStreamer.
>
> gst-launch-1.0 -v \
> videotestsrc num-buffers=500 ! video/x-raw, width=1920,height=1080 ! \
> videoconvert ! \
> v4l2video4h264enc extra-controls="controls,h264_profile=1,video_bitrate=100;" ! \
> h264parse ! matroskamux ! filesink location=/tmp/1.mkv
>
> Is there any way to reduce the size of MFC encoded data?
>

There is a control called rc_enable (rate control enable); it must be set
to one if you want to control the bitrate.
This control confuses many users. I guess it cannot be removed as it
is already part of the UAPI, but enabling it internally in the driver
if the user sets bitrate, profile, etc., would make it saner.

Regards
Andrzej



[GIT] Networking

2016-08-28 Thread David Miller

1) Segregate namespaces properly in conntrack dumps, from Liping
   Zhang.

2) tcp listener refcount fix in netfilter tproxy, from Eric
   Dumazet.

3) Fix timeouts in qed driver due to xmit_more, from Yuval Mintz.

4) Fix use-after-free in tcp_xmit_retransmit_queue().

5) Userspace header fixups (use of __u32, missing includes, etc.)
   from Mikko Rapeli.

6) Further refinements to fragmentation wrt. gso and tunnels, from
   Shmulik Ladkani.

7) Trigger poll correctly for zero length UDP packets, from Eric
   Dumazet.

8) TCP window scaling fix, also from Eric Dumazet.

9) SLAB_DESTROY_BY_RCU is not relevant any more for UDP sockets.

10) Module refcount leak in qdisc_create_dflt(), from Eric Dumazet.

11) Fix deadlock in cp_rx_poll() of 8139cp driver, from Gao Feng.

12) Memory leak in rhashtable's alloc_bucket_locks(), from Eric
    Dumazet.

13) Add new device ID to alx driver, from Owen Lin.

Please pull, thanks a lot!

The following changes since commit 184ca823481c99dadd7d946e5afd4bb921eab30d:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2016-08-17 17:26:58 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to b99b43bb4bdf1d361f7487cf03d803082bbf9101:

  Add Killer E2500 device ID in alx driver. (2016-08-29 00:23:50 -0400)


Alexander Duyck (1):
  ixgbe: Do not clear RAR entry when clearing VMDq for SAN MAC

Amir Vadai (1):
  net/mlx5: Update last-use statistics for flow rules

Andrew Rybchenko (1):
  sfc: fix potential stack corruption from running past stat bitmask

Anjali Singhai Jain (1):
  i40e: Change some init flow for the client

Colin Ian King (2):
  net: tehuti: fix typo: "eneble" -> "enable"
  net: hns: dereference ppe_cb->ppe_common_cb if it is non-null

Daniel Borkmann (1):
  Bluetooth: split sk_filter in l2cap_sock_recv_cb

Daniel Romell (1):
  net: xilinx: emaclite: Fallback to random MAC address.

David Ahern (1):
  net: diag: Fix refcnt leak in error path destroying socket

David Daney (1):
  net: thunderx: Fix OOPs with ethtool --register-dump

David S. Miller (5):
  Merge git://git.kernel.org/.../pablo/nf
  Merge branch 'kaweth-oopses'
  Merge branch 'mlx5-fixes'
  Merge branch 'for-upstream' of git://git.kernel.org/.../bluetooth/bluetooth
  Merge branch 'mlx5-series'

Eran Ben Elisha (2):
  net/mlx5e: Fix ethtool -g/G rx ring parameter report with striding RQ
  net/mlx5: Add error prints when validate ETS failed

Eric Dumazet (7):
  netfilter: tproxy: properly refcount tcp listeners
  tcp: fix use after free in tcp_xmit_retransmit_queue()
  udp: fix poll() issue with zero sized packets
  tcp: properly scale window in tcp_v[46]_reqsk_send_ack()
  udp: get rid of SLAB_DESTROY_BY_RCU allocations
  qdisc: fix a module refcount leak in qdisc_create_dflt()
  rhashtable: fix a memory leak in alloc_bucket_locks()

Fabio Estevam (1):
  net: lpc_eth: Check clk_prepare_enable() error

Florian Fainelli (2):
  net: dsa: bcm_sf2: Fix race condition while unmasking interrupts
  Documentation: networking: dsa: Remove platform device TODO

Frederic Dalleau (1):
  Bluetooth: Fix memory leak at end of hci requests

Gao Feng (2):
  l2tp: Fix the connect status check in pppol2tp_getname
  8139cp: Fix one possible deadloop in cp_rx_poll

Hadar Hen Zion (2):
  net/mlx5e: Use correct flow dissector key on flower offloading
  net/mlx5e: Retrieve the switchdev id from the firmware only once

Hariprasad Shenai (1):
  cxgb4: Fixes resource allocation for ULD's in kdump kernel

Ido Schimmel (1):
  mlxsw: spectrum: Add missing flood to router port

Jamal Hadi Salim (1):
  net sched: fix encoding to use real length

Jamie Lentin (1):
  net: mv88e6xxx: Fix ingress rate removal for mv6131 chips

Jiri Pirko (2):
  mlxsw: spectrum_buffers: Fix pool value handling in mlxsw_sp_sb_tc_pool_bind_set
  team: loadbalance: push lacpdus to exact delivery

Kamal Heib (1):
  net/mlx5e: Fix memory leak if refreshing TIRs fails

Lance Richardson (1):
  sctp: fix overrun in sctp_diag_dump_one()

Liping Zhang (5):
  netfilter: conntrack: do not dump other netns's conntrack entries via proc
  netfilter: nfnetlink_log: add "nf-logger-3-1" module alias name
  netfilter: nfnetlink_acct: report overquota to the right netns
  netfilter: nfnetlink_acct: fix race between nfacct del and xt_nfacct destroy
  netfilter: cttimeout: fix use after free error when delete netns

Luiz Augusto von Dentz (2):
  Bluetooth: Fix bt_sock_recvmsg when MSG_TRUNC is not set
  Bluetooth: Fix hci_sock_recvmsg when MSG_TRUNC is not set

Maor Gottlieb (1):
  net/mlx5: Increase number of ethtool steering priorities

Marcelo Ricardo Leitner (1):
  sctp: linearize early if it's not GSO

Mike 

[GIT PULL] platform-drivers-x86 for 4.8-4

2016-08-28 Thread Darren Hart
Hi Linus,

The following changes since commit 3eab887a55424fc2c27553b7bfe32330df83f7b8:

  Linux 4.8-rc4 (2016-08-28 15:04:33 -0700)

are available in the git repository at:

  git://git.infradead.org/users/dvhart/linux-platform-drivers-x86.git tags/platform-drivers-x86-v4.8-4

for you to fetch changes up to da43bf0c21e57fff0221da5de0a9a388ec0d27cd:

  intel_pmic_gpio: Make explicitly non-modular (2016-08-28 22:31:52 -0700)

Thanks,

Darren Hart
Intel Open Source Technology Center


platform-drivers-x86 for 4.8-4

Remove module related code from two drivers that are only configurable as
built-in.

intel_pmic_gpio:
 - Make explicitly non-modular

platform/olpc:
 - Make ec explicitly non-modular


Paul Gortmaker (2):
  platform/olpc: Make ec explicitly non-modular
  intel_pmic_gpio: Make explicitly non-modular

 drivers/platform/olpc/olpc-ec.c| 8 +++-
 drivers/platform/x86/intel_pmic_gpio.c | 8 ++--
 2 files changed, 5 insertions(+), 11 deletions(-)

-- 
Darren Hart
Intel Open Source Technology Center


Re: [PATCH] mm: Use zonelist name instead of using hardcoded index

2016-08-28 Thread Anshuman Khandual
On 08/26/2016 09:27 PM, Aneesh Kumar K.V wrote:
> This use the existing enums instead of hardcoded index when looking at the

Small nit. 'use' --> 'uses'

> zonelist. This makes it more readable. No functionality change by this
> patch.

Came across this some time back, yeah it really makes sense to replace
those hard coded indices.

> 
> Signed-off-by: Aneesh Kumar K.V 

Reviewed-by: Anshuman Khandual 
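
The readability argument can be illustrated with a small standalone sketch. Everything in it is hypothetical: the kernel's actual enum and structure names do not appear in this thread, so the example_ names below are placeholders that only show the before/after pattern.

```c
/* Placeholder enum; the kernel's real names are not shown in this thread. */
enum example_zonelist_idx {
	EXAMPLE_ZONELIST_FALLBACK,   /* list that may fall back to other nodes */
	EXAMPLE_ZONELIST_NOFALLBACK, /* list restricted to the local node */
	EXAMPLE_MAX_ZONELISTS
};

/* Placeholder node type holding one entry per zonelist kind. */
struct example_node {
	const char *zonelists[EXAMPLE_MAX_ZONELISTS];
};

/* Before: the meaning of index 0 is invisible at the call site. */
static const char *pick_hardcoded(const struct example_node *n)
{
	return n->zonelists[0];
}

/* After: the enum name documents which list is being selected. */
static const char *pick_named(const struct example_node *n)
{
	return n->zonelists[EXAMPLE_ZONELIST_FALLBACK];
}
```

Both helpers return the same element, which matches the "no functionality change" statement in the patch description.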



[PATCH 2/3] ARM: dts: imx7-colibri: add basic supply regulators

2016-08-28 Thread Stefan Agner
Colibri modules need to be powered using the power pins 3V3 and
AVDD_AUDIO. Add fixed regulators which represent these power rails.
Potentially, those power rails could be switched on a carrier
board. A carrier board device tree could add its own regulator with
a GPIO, and reference that regulator in a vin-supply property of
those new module level system regulators. This also synchronizes
the name of the +3.3V regulator with the one used in the Colibri
VF50/VF61 device tree.

Signed-off-by: Stefan Agner 
---
 arch/arm/boot/dts/imx7-colibri.dtsi | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm/boot/dts/imx7-colibri.dtsi b/arch/arm/boot/dts/imx7-colibri.dtsi
index 044b83e..06fb567 100644
--- a/arch/arm/boot/dts/imx7-colibri.dtsi
+++ b/arch/arm/boot/dts/imx7-colibri.dtsi
@@ -46,12 +46,18 @@
pwms = < 0 500>;
};
 
-   reg_3p3v: regulator-3p3v {
+   reg_module_3v3: regulator-module-3v3 {
compatible = "regulator-fixed";
-   regulator-name = "3P3V";
+   regulator-name = "+V3.3";
+   regulator-min-microvolt = <3300000>;
+   regulator-max-microvolt = <3300000>;
+   };
+
+   reg_module_3v3_avdd: regulator-module-3v3-avdd {
+   compatible = "regulator-fixed";
+   regulator-name = "+V3.3_AVDD_AUDIO";
regulator-min-microvolt = <3300000>;
regulator-max-microvolt = <3300000>;
-   regulator-always-on;
};
 
reg_vref_1v8: regulator-vref-1v8 {
-- 
2.9.0



[PATCH 3/3] ARM: dts: imx7-colibri: add Audio support

2016-08-28 Thread Stefan Agner
Add audio support via the on-module I2S SGTL5000 codec.

Signed-off-by: Stefan Agner 
---
 arch/arm/boot/dts/imx7-colibri.dtsi | 41 -
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx7-colibri.dtsi b/arch/arm/boot/dts/imx7-colibri.dtsi
index 06fb567..a9cc657 100644
--- a/arch/arm/boot/dts/imx7-colibri.dtsi
+++ b/arch/arm/boot/dts/imx7-colibri.dtsi
@@ -66,6 +66,22 @@
regulator-min-microvolt = <1800000>;
regulator-max-microvolt = <1800000>;
};
+
+   sound {
+   compatible = "simple-audio-card";
+   simple-audio-card,name = "imx7-sgtl5000";
+   simple-audio-card,format = "i2s";
+   simple-audio-card,bitclock-master = <&dailink_master>;
+   simple-audio-card,frame-master = <&dailink_master>;
+   simple-audio-card,cpu {
+   sound-dai = <>;
+   };
+
+   dailink_master: simple-audio-card,codec {
+   sound-dai = <>;
+   clocks = < IMX7D_AUDIO_MCLK_ROOT_CLK>;
+   };
+   };
 };
 
  {
@@ -103,6 +119,18 @@
pinctrl-0 = <_i2c1 _i2c1_int>;
status = "okay";
 
+   codec: sgtl5000@0a {
+   compatible = "fsl,sgtl5000";
+   #sound-dai-cells = <0>;
+   reg = <0x0a>;
+   clocks = < IMX7D_AUDIO_MCLK_ROOT_CLK>;
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_sai1_mclk>;
+   VDDA-supply = <&reg_module_3v3_avdd>;
+   VDDIO-supply = <&reg_module_3v3>;
+   VDDD-supply = <_DCDC3>;
+   };
+
ad7879@2c {
compatible = "adi,ad7879-1";
reg = <0x2c>;
@@ -223,6 +251,12 @@
vin-supply = <_DCDC3>;
 };
 
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_sai1>;
+   status = "okay";
+};
+
 _pwrkey {
status = "disabled";
 };
@@ -542,13 +576,18 @@
 
pinctrl_sai1: sai1-grp {
fsl,pins = <
-   MX7D_PAD_SAI1_MCLK__SAI1_MCLK   0x1f
MX7D_PAD_ENET1_RX_CLK__SAI1_TX_BCLK 0x1f
MX7D_PAD_SAI1_TX_SYNC__SAI1_TX_SYNC 0x1f
MX7D_PAD_ENET1_COL__SAI1_TX_DATA0   0x30
MX7D_PAD_ENET1_TX_CLK__SAI1_RX_DATA00x1f
>;
};
+
+   pinctrl_sai1_mclk: sai1grp_mclk {
+   fsl,pins = <
+   MX7D_PAD_SAI1_MCLK__SAI1_MCLK   0x1f
+   >;
+   };
 };
 
 _lpsr {
-- 
2.9.0



[PATCH 1/3] ARM: dts: imx7-colibri: move SD-card to module level

2016-08-28 Thread Stefan Agner
Move SD-card definition to module level. While at it, also disable
write-protect since the Colibri standard does not define a pin for
SD-Card write-protection.

Signed-off-by: Stefan Agner 
---
 arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi | 4 
 arch/arm/boot/dts/imx7-colibri.dtsi | 8 
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi b/arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi
index 1545661..373ee19 100644
--- a/arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi
+++ b/arch/arm/boot/dts/imx7-colibri-eval-v3.dtsi
@@ -138,10 +138,6 @@
 };
 
  {
-   pinctrl-names = "default";
-   pinctrl-0 = <_usdhc1 _cd_usdhc1>;
-   no-1-8-v;
-   cd-gpios = < 0 GPIO_ACTIVE_LOW>;
keep-power-in-suspend;
wakeup-source;
status = "okay";
diff --git a/arch/arm/boot/dts/imx7-colibri.dtsi b/arch/arm/boot/dts/imx7-colibri.dtsi
index 0a9d3a8..044b83e 100644
--- a/arch/arm/boot/dts/imx7-colibri.dtsi
+++ b/arch/arm/boot/dts/imx7-colibri.dtsi
@@ -251,6 +251,14 @@
dr_mode = "host";
 };
 
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_usdhc1 _cd_usdhc1>;
+   no-1-8-v;
+   cd-gpios = < 0 GPIO_ACTIVE_LOW>;
+   disable-wp;
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_gpio1 _gpio2 _gpio3 _gpio4>;
-- 
2.9.0



[PATCH v5 6/6] mm/cma: remove per zone CMA stat

2016-08-28 Thread js1304
From: Joonsoo Kim 

Now that all reserved pages for the CMA region belong to ZONE_CMA,
we don't need to maintain the CMA stat in other zones. Remove it.

Acked-by: Vlastimil Babka 
Signed-off-by: Joonsoo Kim 
---
 fs/proc/meminfo.c  |  2 +-
 include/linux/cma.h|  6 ++
 include/linux/mmzone.h |  1 -
 mm/cma.c   | 15 +++
 mm/page_alloc.c|  7 +++
 mm/vmstat.c|  1 -
 6 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 8a42849..0ca6f38 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -151,7 +151,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 #ifdef CONFIG_CMA
show_val_kb(m, "CmaTotal:   ", totalcma_pages);
show_val_kb(m, "CmaFree:",
-   global_page_state(NR_FREE_CMA_PAGES));
+   cma_get_free());
 #endif
 
hugetlb_report_meminfo(m);
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 29f9e77..816290c 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -28,4 +28,10 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
+
+#ifdef CONFIG_CMA
+extern unsigned long cma_get_free(void);
+#else
+static inline unsigned long cma_get_free(void) { return 0; }
+#endif
 #endif
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 24e46ca..8bc2611 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -113,7 +113,6 @@ enum zone_stat_item {
NUMA_LOCAL, /* allocation from local node */
NUMA_OTHER, /* allocation from other node */
 #endif
-   NR_FREE_CMA_PAGES,
NR_VM_ZONE_STAT_ITEMS };
 
 enum node_stat_item {
diff --git a/mm/cma.c b/mm/cma.c
index c1bae7f..981633b 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -54,6 +54,21 @@ unsigned long cma_get_size(const struct cma *cma)
return cma->count << PAGE_SHIFT;
 }
 
+unsigned long cma_get_free(void)
+{
+   struct zone *zone;
+   unsigned long freecma = 0;
+
+   for_each_populated_zone(zone) {
+   if (!is_zone_cma(zone))
+   continue;
+
+   freecma += zone_page_state(zone, NR_FREE_PAGES);
+   }
+
+   return freecma;
+}
+
 static unsigned long cma_bitmap_aligned_mask(const struct cma *cma,
 int align_order)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ca17de9..587d542 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -4206,7 +4207,7 @@ void show_free_areas(unsigned int filter)
global_page_state(NR_BOUNCE),
global_page_state(NR_FREE_PAGES),
free_pcp,
-   global_page_state(NR_FREE_CMA_PAGES));
+   cma_get_free());
 
for_each_online_pgdat(pgdat) {
printk("Node %d"
@@ -4287,7 +4288,6 @@ void show_free_areas(unsigned int filter)
" bounce:%lukB"
" free_pcp:%lukB"
" local_pcp:%ukB"
-   " free_cma:%lukB"
"\n",
zone->name,
K(zone_page_state(zone, NR_FREE_PAGES)),
@@ -4309,8 +4309,7 @@ void show_free_areas(unsigned int filter)
K(zone_page_state(zone, NR_PAGETABLE)),
K(zone_page_state(zone, NR_BOUNCE)),
K(free_pcp),
-   K(this_cpu_read(zone->pageset->pcp.count)),
-   K(zone_page_state(zone, NR_FREE_CMA_PAGES)));
+   K(this_cpu_read(zone->pageset->pcp.count)));
printk("lowmem_reserve[]:");
for (i = 0; i < MAX_NR_ZONES; i++)
printk(" %ld", zone->lowmem_reserve[i]);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index ce5838b..93dfd9d 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -951,7 +951,6 @@ const char * const vmstat_text[] = {
"numa_local",
"numa_other",
 #endif
-   "nr_free_cma",
 
/* Node-based counters */
"nr_inactive_anon",
-- 
1.9.1
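
The new cma_get_free() helper in the patch above replaces a dedicated counter with a sum over zones. A minimal userspace model of that loop is sketched below; struct model_zone and model_cma_get_free() are stand-ins invented for this illustration and are not kernel types or functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct zone (an assumption). */
struct model_zone {
	bool is_cma;              /* plays the role of is_zone_cma(zone) */
	unsigned long free_pages; /* plays the role of NR_FREE_PAGES for the zone */
};

/* Sum free pages over CMA zones only, mirroring the loop in cma_get_free(). */
static unsigned long model_cma_get_free(const struct model_zone *zones, size_t n)
{
	unsigned long freecma = 0;

	for (size_t i = 0; i < n; i++) {
		if (!zones[i].is_cma)
			continue; /* non-CMA zones no longer carry any CMA stat */
		freecma += zones[i].free_pages;
	}
	return freecma;
}
```

With one normal zone holding 1000 free pages and two CMA zones holding 300 and 200, the model returns 500: only the CMA zones contribute.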



[PATCH v5 2/6] mm/cma: introduce new zone, ZONE_CMA

2016-08-28 Thread js1304
From: Joonsoo Kim 

Attached cover-letter:

This series tries to solve problems of the current CMA implementation.

CMA is introduced to provide physically contiguous pages at runtime
without exclusive reserved memory area. But, current implementation
works like as previous reserved memory approach, because freepages
on CMA region are used only if there is no movable freepage. In other
words, freepages on CMA region are only used as fallback. In that
situation where freepages on CMA region are used as fallback, kswapd
would be woken up easily since there is no unmovable and reclaimable
freepage, too. If kswapd starts to reclaim memory, fallback allocation
to MIGRATE_CMA doesn't occur any more since movable freepages are
already refilled by kswapd and then most of freepage on CMA are left
to be in free. This situation looks like exclusive reserved memory case.

In my experiment, I found that if system memory has 1024 MB memory and
512 MB is reserved for CMA, kswapd is mostly woken up when roughly 512 MB
free memory is left. Detailed reason is that for keeping enough free
memory for unmovable and reclaimable allocation, kswapd uses below
equation when calculating free memory and it easily go under the watermark.

Free memory for unmovable and reclaimable = Free total - Free CMA pages

This is derivated from the property of CMA freepage that CMA freepage
can't be used for unmovable and reclaimable allocation.

Anyway, in this case, kswapd are woken up when (FreeTotal - FreeCMA)
is lower than low watermark and tries to make free memory until
(FreeTotal - FreeCMA) is higher than high watermark. That results
in that FreeTotal is moving around 512MB boundary consistently. It
then means that we can't utilize full memory capacity.

To fix this problem, I submitted some patches [1] about 10 months ago,
but, found some more problems to be fixed before solving this problem.
It requires many hooks in allocator hotpath so some developers doesn't
like it. Instead, some of them suggest different approach [2] to fix
all the problems related to CMA, that is, introducing a new zone to deal
with free CMA pages. I agree that it is the best way to go so implement
here. Although properties of ZONE_MOVABLE and ZONE_CMA is similar, I
decide to add a new zone rather than piggyback on ZONE_MOVABLE since
they have some differences. First, reserved CMA pages should not be
offlined. If freepage for CMA is managed by ZONE_MOVABLE, we need to keep
MIGRATE_CMA migratetype and insert many hooks on memory hotplug code
to distiguish hotpluggable memory and reserved memory for CMA in the same
zone. It would make memory hotplug code which is already complicated
more complicated. Second, cma_alloc() can be called more frequently
than memory hotplug operation and possibly we need to control
allocation rate of ZONE_CMA to optimize latency in the future.
In this case, separate zone approach is easy to modify. Third, I'd
like to see statistics for CMA, separately. Sometimes, we need to debug
why cma_alloc() is failed and separate statistics would be more helpful
in this situtaion.

Anyway, this patchset solves four problems related to CMA implementation.

1) Utilization problem
As mentioned above, we can't utilize full memory capacity due to the
limitation of CMA freepage and fallback policy. This patchset implements
a new zone for CMA and uses it for GFP_HIGHUSER_MOVABLE request. This
typed allocation is used for page cache and anonymous pages which
occupies most of memory usage in normal case so we can utilize full
memory capacity. Below is the experiment result about this problem.

8 CPUs, 1024 MB, VIRTUAL MACHINE
make -j16


CMA reserve:0 MB512 MB
Elapsed-time:   92.4186.5
pswpin: 82  18647
pswpout:160 69839


CMA reserve:0 MB512 MB
Elapsed-time:   93.193.4
pswpin: 84  46
pswpout:183 92

FYI, there is another attempt [3] trying to solve this problem in lkml.
And, as far as I know, Qualcomm also has out-of-tree solution for this
problem.

2) Reclaim problem
Currently, there is no logic to distinguish CMA pages in reclaim path.
If reclaim is initiated for unmovable and reclaimable allocation,
reclaiming CMA pages doesn't help to satisfy the request and reclaiming
CMA page is just waste. By managing CMA pages in the new zone, we can
skip to reclaim ZONE_CMA completely if it is unnecessary.

3) Atomic allocation failure problem
Kswapd isn't started to reclaim pages when allocation request is movable
type and there is enough free page in the CMA region. After bunch of
consecutive movable allocation requests, free pages in ordinary region
(not CMA region) would be exhausted without waking up kswapd. At that time,
if atomic unmovable allocation comes, it can't be successful since there
is not enough page in ordinary region. This problem 

[PATCH v5 2/6] mm/cma: introduce new zone, ZONE_CMA

2016-08-28 Thread js1304
From: Joonsoo Kim 

Attached cover-letter:

This series tries to solve problems with the current CMA implementation.

CMA was introduced to provide physically contiguous pages at runtime
without an exclusively reserved memory area. However, the current
implementation works much like the previous reserved-memory approach,
because freepages in the CMA region are used only if there is no movable
freepage. In other words, freepages in the CMA region are used only as a
fallback. In that situation, kswapd is woken up easily, since there are
no unmovable or reclaimable freepages left either. Once kswapd starts to
reclaim memory, fallback allocation to MIGRATE_CMA no longer occurs,
because kswapd has already refilled the movable freepages, and most of
the freepages in the CMA region are left free. This situation looks like
the exclusively reserved memory case.

In my experiment, I found that on a system with 1024 MB of memory, of
which 512 MB is reserved for CMA, kswapd is mostly woken up when roughly
512 MB of free memory is left. The detailed reason is that, to keep enough
free memory for unmovable and reclaimable allocations, kswapd uses the
equation below when calculating free memory, and the result easily goes
under the watermark.

Free memory for unmovable and reclaimable = Free total - Free CMA pages

This is derived from the property that CMA freepages can't be used for
unmovable and reclaimable allocations.

Anyway, in this case, kswapd is woken up when (FreeTotal - FreeCMA)
is lower than the low watermark and tries to free memory until
(FreeTotal - FreeCMA) is higher than the high watermark. As a result,
FreeTotal keeps hovering around the 512 MB boundary, which means that
we can't utilize the full memory capacity.
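
The arithmetic above can be sketched in userspace C as follows. This is
only a model under stated assumptions: usable_free_mb() and kswapd_wakes()
are hypothetical names, amounts are in MB rather than pages, and the
16 MB low watermark used in the note below is an invented value.

```c
#include <stdbool.h>

/* Free memory usable by unmovable and reclaimable allocations (MB). */
unsigned long usable_free_mb(unsigned long free_total_mb,
			     unsigned long free_cma_mb)
{
	/* Free CMA pages cannot serve these allocation types. */
	return free_total_mb > free_cma_mb ? free_total_mb - free_cma_mb : 0;
}

/* Model: kswapd wakes once the usable amount is below the low watermark. */
bool kswapd_wakes(unsigned long free_total_mb, unsigned long free_cma_mb,
		  unsigned long low_wmark_mb)
{
	return usable_free_mb(free_total_mb, free_cma_mb) < low_wmark_mb;
}
```

With 512 MB of free CMA and the assumed 16 MB low watermark,
kswapd_wakes(520, 512, 16) is true even though 520 MB is free in total,
while kswapd_wakes(520, 0, 16) is false on a system without a CMA reserve.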

To fix this problem, I submitted some patches [1] about 10 months ago,
but found some more problems that had to be fixed before solving this one.
That approach requires many hooks in the allocator hotpath, so some
developers don't like it. Instead, some of them suggested a different
approach [2] to fix all the problems related to CMA: introducing a new
zone to deal with free CMA pages. I agree that it is the best way to go,
so I implement it here. Although the properties of ZONE_MOVABLE and
ZONE_CMA are similar, I decided to add a new zone rather than piggyback on
ZONE_MOVABLE, since they have some differences. First, reserved CMA pages
should not be offlined. If freepages for CMA were managed by ZONE_MOVABLE,
we would need to keep the MIGRATE_CMA migratetype and insert many hooks
into the memory hotplug code to distinguish hotpluggable memory from
memory reserved for CMA in the same zone. That would make the memory
hotplug code, which is already complicated, even more complicated. Second,
cma_alloc() can be called more frequently than memory hotplug operations,
and we may need to control the allocation rate of ZONE_CMA to optimize
latency in the future. In that case, a separate zone is easier to modify.
Third, I'd like to see statistics for CMA separately. Sometimes we need to
debug why cma_alloc() failed, and separate statistics would be more helpful
in that situation.

Anyway, this patchset solves four problems related to the CMA
implementation.

1) Utilization problem
As mentioned above, we can't utilize the full memory capacity due to the
limitation of CMA freepages and the fallback policy. This patchset
implements a new zone for CMA and uses it for GFP_HIGHUSER_MOVABLE
requests. This type of allocation is used for page cache and anonymous
pages, which account for most memory usage in the normal case, so we can
utilize the full memory capacity. Below are the experiment results for
this problem.

8 CPUs, 1024 MB, VIRTUAL MACHINE
make -j16

Before this patchset:
CMA reserve:    0 MB    512 MB
Elapsed-time:   92.4    186.5
pswpin:         82      18647
pswpout:        160     69839

After this patchset:
CMA reserve:    0 MB    512 MB
Elapsed-time:   93.1    93.4
pswpin:         84      46
pswpout:        183     92

FYI, there is another attempt [3] on lkml to solve this problem.
And, as far as I know, Qualcomm also has an out-of-tree solution for this
problem.
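
To make the utilization argument in 1) concrete, here is a small userspace
model of which requests may draw on ZONE_CMA. It is a sketch only: struct
alloc_request_model and both functions are hypothetical, and the kernel
expresses this through gfp flags and zone selection rather than through
these booleans.

```c
#include <stdbool.h>

/* Hypothetical request descriptor; the kernel encodes this in gfp flags. */
struct alloc_request_model {
	bool highuser; /* user-space page that may be placed in highmem */
	bool movable;  /* page can be migrated after allocation */
};

/* Only requests that are both highuser and movable may use ZONE_CMA. */
bool may_use_zone_cma(struct alloc_request_model req)
{
	return req.highuser && req.movable;
}

/* Free memory (MB) a request can draw from, given both regions. */
unsigned long allocatable_free_mb(struct alloc_request_model req,
				  unsigned long free_ordinary_mb,
				  unsigned long free_cma_mb)
{
	if (may_use_zone_cma(req))
		return free_ordinary_mb + free_cma_mb;

	return free_ordinary_mb;
}
```

In this model, page cache and anonymous page requests see the CMA reserve
as ordinary allocatable memory, while kernel allocations never touch it.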

2) Reclaim problem
Currently, there is no logic to distinguish CMA pages in the reclaim path.
If reclaim is initiated for an unmovable or reclaimable allocation,
reclaiming CMA pages doesn't help to satisfy the request and is just
wasted work. By managing CMA pages in the new zone, we can skip
reclaiming ZONE_CMA completely when it is unnecessary.
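
The zone-level skip described in 2) might look like the following
userspace model. The names are hypothetical and the real reclaim path is
far more involved; the point is only that a single per-zone check replaces
per-page CMA tests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-zone summary used only for this illustration. */
struct reclaim_zone_model {
	bool is_cma;                     /* zone is ZONE_CMA */
	unsigned long reclaimable_pages; /* pages reclaim could free here */
};

/* Count the pages worth reclaiming on behalf of a given request. */
unsigned long pages_worth_reclaiming(const struct reclaim_zone_model *zones,
				     size_t nr, bool request_can_use_cma)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		/* Freeing CMA pages cannot satisfy this request: skip. */
		if (zones[i].is_cma && !request_can_use_cma)
			continue;

		total += zones[i].reclaimable_pages;
	}

	return total;
}
```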

3) Atomic allocation failure problem
Kswapd isn't started to reclaim pages when the allocation request is of
movable type and there are enough free pages in the CMA region. After a
bunch of consecutive movable allocation requests, free pages in the
ordinary region (not the CMA region) would be exhausted without waking up
kswapd. At that time, if an atomic unmovable allocation comes, it can't
succeed, since there are not enough pages in the ordinary region. This
problem is reported
by Aneesh 

[PATCH v5 5/6] mm/cma: remove MIGRATE_CMA

2016-08-28 Thread js1304
From: Joonsoo Kim 

Now, all reserved pages for the CMA region belong to ZONE_CMA,
and the zone contains no other type of pages. Therefore, we don't need
MIGRATE_CMA to distinguish CMA pages from ordinary pages and handle them
differently. Remove MIGRATE_CMA.

Unfortunately, this patch makes the free CMA counter incorrect, because
we count it when pages are on MIGRATE_CMA. It will be fixed by the next
patch. I could squash the next patch into this one, but that would make
the changes complicated and hard to review, so I keep them separate.

Acked-by: Vlastimil Babka 
Signed-off-by: Joonsoo Kim 
---
 include/linux/gfp.h|  3 +-
 include/linux/mmzone.h | 24 
 include/linux/page-isolation.h |  5 +--
 include/linux/vmstat.h |  8 
 mm/cma.c   |  2 +-
 mm/compaction.c| 10 +
 mm/hugetlb.c   |  2 +-
 mm/memory_hotplug.c|  7 ++--
 mm/page_alloc.c| 89 --
 mm/page_isolation.c| 15 +++
 mm/page_owner.c|  6 +--
 mm/usercopy.c  |  4 +-
 12 files changed, 43 insertions(+), 132 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index b86e0c2..815d756 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -553,8 +553,7 @@ static inline bool pm_suspended_storage(void)
 
 #if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || defined(CONFIG_CMA)
 /* The below functions must be run on a range from a single zone. */
-extern int alloc_contig_range(unsigned long start, unsigned long end,
- unsigned migratetype);
+extern int alloc_contig_range(unsigned long start, unsigned long end);
 extern void free_contig_range(unsigned long pfn, unsigned nr_pages);
 #endif
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 87b344e..24e46ca 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -41,22 +41,6 @@ enum {
MIGRATE_RECLAIMABLE,
MIGRATE_PCPTYPES,   /* the number of types on the pcp lists */
MIGRATE_HIGHATOMIC = MIGRATE_PCPTYPES,
-#ifdef CONFIG_CMA
-   /*
-* MIGRATE_CMA migration type is designed to mimic the way
-* ZONE_MOVABLE works.  Only movable pages can be allocated
-* from MIGRATE_CMA pageblocks and page allocator never
-* implicitly change migration type of MIGRATE_CMA pageblock.
-*
-* The way to use it is to change migratetype of a range of
-* pageblocks to MIGRATE_CMA which can be done by
-* __free_pageblock_cma() function.  What is important though
-* is that a range of pageblocks must be aligned to
-* MAX_ORDER_NR_PAGES should biggest page be bigger then
-* a single pageblock.
-*/
-   MIGRATE_CMA,
-#endif
 #ifdef CONFIG_MEMORY_ISOLATION
MIGRATE_ISOLATE,/* can't allocate from here */
 #endif
@@ -66,14 +50,6 @@ enum {
 /* In mm/page_alloc.c; keep in sync also with show_migration_types() there */
 extern char * const migratetype_names[MIGRATE_TYPES];
 
-#ifdef CONFIG_CMA
-#  define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
-#  define is_migrate_cma_page(_page) (get_pageblock_migratetype(_page) == MIGRATE_CMA)
-#else
-#  define is_migrate_cma(migratetype) false
-#  define is_migrate_cma_page(_page) false
-#endif
-
 #define for_each_migratetype_order(order, type) \
for (order = 0; order < MAX_ORDER; order++) \
for (type = 0; type < MIGRATE_TYPES; type++)
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 047d647..1db9759 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -49,15 +49,14 @@ int move_freepages(struct zone *zone,
  */
 int
 start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-unsigned migratetype, bool skip_hwpoisoned_pages);
+   bool skip_hwpoisoned_pages);
 
 /*
  * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
  * target range is [start_pfn, end_pfn)
  */
 int
-undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-   unsigned migratetype);
+undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn);
 
 /*
  * Test all pages in [start_pfn, end_pfn) are isolated or not.
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 6137719..ac6db88 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -341,14 +341,6 @@ static inline void drain_zonestat(struct zone *zone,
struct per_cpu_pageset *pset) { }
 #endif /* CONFIG_SMP */
 
-static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pages,
-int migratetype)
-{
-   __mod_zone_page_state(zone, NR_FREE_PAGES, nr_pages);
-   if (is_migrate_cma(migratetype))
-   __mod_zone_page_state(zone, 

[PATCH v5 0/6] Introduce ZONE_CMA

2016-08-28 Thread js1304
From: Joonsoo Kim 

Hello,

Changes from v4
o Rebase on next-20160825
o Add general fix patch for lowmem reserve
o Fix lowmem reserve ratio
o Fix zone span optimization per Vlastimil
o Fix pageset initialization
o Change invocation timing on cma_init_reserved_areas()

Changes from v3
o Rebase on next-20160805
o Split first patch per Vlastimil
o Remove useless function parameter per Vlastimil
o Add code comment per Vlastimil
o Add the following description to the cover letter

This is the 5th version of the ZONE_CMA patchset. Most of the changes are
due to rebasing, plus some minor fixes.

CMA has many problems, which I describe at the bottom of this cover
letter. These problems come from the limitation of CMA memory: it must
always be migratable for device usage. I think that introducing
a new zone is the best approach to solve them. Here are the reasons.

Zones were introduced to solve some issues caused by H/W addressing
limitations. The MM subsystem is implemented to work efficiently with
these zones, and the allocation/reclaim logic in MM takes this limitation
into account extensively. What I did in this patchset is introduce a new
zone and extend the zone concept slightly. The new concept is that a zone
can have not only a H/W addressing limitation but also a S/W limitation
that guarantees page migration. This concept originated with ZONE_MOVABLE
and has worked well for a long time. So, ZONE_CMA should not be special
at this point.

There is a major concern from Mel that ZONE_MOVABLE, which has a
S/W limitation, causes a highmem/lowmem problem. The highmem/lowmem
problem is that some memory cannot be used for kernel allocations due to
the limitation of the zone. It breaks LRU ordering and makes it hard to
find kernel-usable memory under memory pressure.

However, the important point is that this problem doesn't come from the
implementation detail (ZONE_MOVABLE/MIGRATETYPE). Even if we implement it
by MIGRATETYPE instead of by ZONE_MOVABLE, we cannot use that type of
memory for kernel allocations, because kernel allocations aren't
migratable. So, it will break LRU ordering, too. We cannot avoid the
problem in any case. Therefore, we should focus on which solution is
better for maintenance and less intrusive for the MM subsystem.

From this viewpoint, I think that the zone approach is better. As mentioned
earlier, the MM subsystem already has a lot of infrastructure to deal with
a zone's H/W addressing limitation. Adding a S/W limitation to the zone
concept and adding a new zone doesn't change anything; it will work by
itself. My patchset can remove many hooks related to CMA area management
in MM while solving the problems. More hooks would be required to solve
the problems if we chose the MIGRATETYPE approach.

Although Mel withdrew the review, Vlastimil expressed agreement with this
new zone approach [6].

 "I realize I differ here from much more experienced mm guys, and will
 probably deservingly regret it later on, but I think that the ZONE_CMA
 approach could work indeed better than current MIGRATE_CMA pageblocks."

If anyone has a different opinion, please let me know.

Thanks.


Changes from v2
o Rebase on next-20160525
o No other changes except following description

There was a discussion with Mel [5] after LSF/MM 2016. I could summarise
it to help the merge decision, but it's better to read it yourself, since
any summary I write would be biased toward my view. But if anyone wants
a summary, I will write one. :)

Anyway, Mel's position on this patchset seems to be neutral. He said:
"I'm not going to outright NAK your series but I won't ACK it either"

We can fix the problems with either approach, but I hope to go with the new
zone approach because it is less error-prone. It reduces some corner-case
handling now and removes the need for potential corner-case handling to
fix problems.

Note that our company is already using ZONE_CMA and has seen no problems.

If anyone has a different opinion, please let me know and let's discuss
it together.

Andrew, if there is something I need to do for the merge, please let me
know.

Changes from v1
o Separate some patches which deserve to be submitted independently
o Modify description to reflect current kernel state
(e.g. high-order watermark problem disappeared by Mel's work)
o Don't increase SECTION_SIZE_BITS to make room in page flags
(detailed reason is on the patch that adds ZONE_CMA)
o Adjust ZONE_CMA population code

This series tries to solve problems with the current CMA implementation.

CMA was introduced to provide physically contiguous pages at runtime
without an exclusively reserved memory area. However, the current
implementation works much like the previous reserved-memory approach,
because freepages in the CMA region are used only if there is no movable
freepage. In other words, freepages in the CMA region are used only as a
fallback. In that situation, kswapd is woken up easily, since there are
no unmovable or reclaimable freepages left either. Once kswapd starts to
reclaim memory, fallback allocation
to MIGRATE_CMA doesn't occur any more since movable freepages are
already refilled by 

[PATCH v5 1/6] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request

2016-08-28 Thread js1304
From: Joonsoo Kim 

Free pages in ZONE_HIGHMEM can't be used for kernel memory, so reserving
them is not that important. When ZONE_MOVABLE is used, the reservation
would theoretically decrease the memory usable by GFP_HIGHUSER_MOVABLE
allocation requests, which are mainly used for page cache and anonymous
page allocation. So, fix it.

Also, defining the sysctl_lowmem_reserve_ratio array with MAX_NR_ZONES - 1
entries makes the code complex. For example, on a highmem system, the
following reserve ratio applies to *ZONE_NORMAL*, which easily misleads
people.

 #ifdef CONFIG_HIGHMEM
 32
 #endif

This patch also fixes this by defining the sysctl_lowmem_reserve_ratio
array with MAX_NR_ZONES entries and placing the "#ifdef" in the right place.

Signed-off-by: Joonsoo Kim 
---
 include/linux/mmzone.h | 2 +-
 mm/page_alloc.c        | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)
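
For readers less familiar with how these ratios are consumed, here is a
minimal userspace sketch, assuming a highmem system that has only
ZONE_NORMAL, ZONE_HIGHMEM and ZONE_MOVABLE. It models the idea that a lower
zone's reserve is the number of pages in the zones above it divided by that
lower zone's ratio. The zone sizes and the names model_zone, managed,
ratio_after and compute_reserve are made up for illustration; this is not
the kernel code itself.

```c
#include <assert.h>
#include <limits.h>

/* Simplified zone list for a highmem system (assumption: no DMA zones). */
enum model_zone { Z_NORMAL, Z_HIGHMEM, Z_MOVABLE, Z_COUNT };

/* Hypothetical zone sizes in pages, chosen only for illustration. */
static const unsigned long managed[Z_COUNT] = { 200000, 100000, 50000 };

/* One ratio per zone, following the array layout after this patch. */
static const int ratio_after[Z_COUNT] = { 32, INT_MAX, INT_MAX };

/*
 * reserve[i][j] is the number of pages zone i holds back from a request
 * whose highest usable zone is j. For each lower zone i, it is the sum of
 * the pages in zones i+1..j divided by zone i's ratio.
 */
static void compute_reserve(const int ratio[Z_COUNT],
                            unsigned long reserve[Z_COUNT][Z_COUNT])
{
    for (int j = 0; j < Z_COUNT; j++) {
        unsigned long pages = managed[j];

        reserve[j][j] = 0;
        for (int i = j - 1; i >= 0; i--) {
            assert(ratio[i] >= 1);
            reserve[i][j] = pages / (unsigned long)ratio[i];
            pages += managed[i];
        }
    }
}
```

In this model, a ratio of INT_MAX means ZONE_HIGHMEM holds back nothing
from ZONE_MOVABLE requests, while ZONE_NORMAL still keeps its 1/32 share.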

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d572b78..e3f39af 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -877,7 +877,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
 int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
-extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
+extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
 int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
 int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4f7d5d7..a8310de 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -198,17 +198,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
  * TBD: should special case ZONE_DMA32 machines here - in those we normally
  * don't need any ZONE_NORMAL reservation
  */
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
 #ifdef CONFIG_ZONE_DMA
 256,
 #endif
 #ifdef CONFIG_ZONE_DMA32
 256,
 #endif
-#ifdef CONFIG_HIGHMEM
 32,
+#ifdef CONFIG_HIGHMEM
+INT_MAX,
 #endif
-32,
+INT_MAX,
 };
 
 EXPORT_SYMBOL(totalram_pages);
-- 
1.9.1



[PATCH v5 3/6] mm/cma: populate ZONE_CMA

2016-08-28 Thread js1304
From: Joonsoo Kim 

Until now, reserved pages for CMA have been managed in the ordinary zones
that their pfns belong to. This approach has numerous problems, and
fixing them isn't easy. (They are described in the previous patch.)
To fix this, ZONE_CMA was introduced in the previous patch but was not
yet populated. This patch implements the population of ZONE_CMA by
stealing reserved pages from the ordinary zones.

In the previous implementation, kernel allocation requests with
__GFP_MOVABLE could be serviced from the CMA region. In the new approach,
only allocation requests with GFP_HIGHUSER_MOVABLE can be serviced from
the CMA region. This is an unavoidable consequence of using a zone,
because ZONE_CMA could contain highmem. Due to this decision, ZONE_CMA
will work like ZONE_HIGHMEM or ZONE_MOVABLE.

I don't think this will be a problem, because most file cache pages
and anonymous pages are requested with GFP_HIGHUSER_MOVABLE. This is
supported by the fact that there are many systems with ZONE_HIGHMEM and
they work fine. A notable disadvantage is that we cannot use these pages
for blockdev file cache pages, because those usually have __GFP_MOVABLE
but not __GFP_HIGHMEM and __GFP_USER. But in this case there are pros and
cons. In my experience, blockdev file cache pages are one of the top
reasons that cma_alloc() fails temporarily. So, we get a better
guarantee of cma_alloc() success by excluding that case.

The implementation itself is easy to understand: steal the pages when a
CMA area is initialized and recalculate the various per-zone stats and
thresholds.

Signed-off-by: Joonsoo Kim 
---
 include/linux/memory_hotplug.h |  3 ---
 include/linux/mm.h |  1 +
 mm/cma.c   | 56 ++
 mm/internal.h  |  3 +++
 mm/page_alloc.c| 29 +++---
 5 files changed, 80 insertions(+), 12 deletions(-)
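
To make the eligibility rule from the description concrete, here is a
small userspace sketch. It assumes that a request qualifies for ZONE_CMA
the same way it qualifies for ZONE_MOVABLE, that is, only when it carries
both the highmem and the movable zone modifiers. The MODEL_* flag values
and model_can_use_zone_cma() are hypothetical and exist only for this
illustration; the kernel's real flag values differ.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model; the kernel's real values differ. */
#define MODEL_GFP_HIGHMEM 0x1u
#define MODEL_GFP_MOVABLE 0x2u
#define MODEL_GFP_USER    0x4u  /* stands in for the rest of GFP_USER */

/* Userspace pages such as anonymous memory and most file cache. */
#define MODEL_GFP_HIGHUSER_MOVABLE \
    (MODEL_GFP_USER | MODEL_GFP_HIGHMEM | MODEL_GFP_MOVABLE)

/* Blockdev page cache as described above: movable, but not highmem/user. */
#define MODEL_GFP_BLOCKDEV_CACHE MODEL_GFP_MOVABLE

/*
 * A request may be placed in ZONE_CMA only if it carries both the
 * highmem and the movable zone modifiers.
 */
static bool model_can_use_zone_cma(unsigned int gfp)
{
    const unsigned int need = MODEL_GFP_HIGHMEM | MODEL_GFP_MOVABLE;

    return (gfp & need) == need;
}
```

Under this model, a GFP_HIGHUSER_MOVABLE request is eligible, while a
blockdev page cache request with only the movable bit is not.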

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 01033fa..ea5af47 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -198,9 +198,6 @@ void put_online_mems(void);
 void mem_hotplug_begin(void);
 void mem_hotplug_done(void);
 
-extern void set_zone_contiguous(struct zone *zone);
-extern void clear_zone_contiguous(struct zone *zone);
-
 #else /* ! CONFIG_MEMORY_HOTPLUG */
 /*
  * Stub functions for when hotplug is off
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9d85402..f45e0e4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1933,6 +1933,7 @@ extern void setup_per_cpu_pageset(void);
 
 extern void zone_pcp_update(struct zone *zone);
 extern void zone_pcp_reset(struct zone *zone);
+extern void setup_zone_pageset(struct zone *zone);
 
 /* page_alloc.c */
 extern int min_free_kbytes;
diff --git a/mm/cma.c b/mm/cma.c
index 384c2cb..d69bdf7 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -38,6 +38,7 @@
 #include 
 
 #include "cma.h"
+#include "internal.h"
 
 struct cma cma_areas[MAX_CMA_AREAS];
 unsigned cma_area_count;
@@ -116,10 +117,9 @@ static int __init cma_activate_area(struct cma *cma)
for (j = pageblock_nr_pages; j; --j, pfn++) {
WARN_ON_ONCE(!pfn_valid(pfn));
/*
-* alloc_contig_range requires the pfn range
-* specified to be in the same zone. Make this
-* simple by forcing the entire CMA resv range
-* to be in the same zone.
+* In init_cma_reserved_pageblock(), present_pages is
+* adjusted with assumption that all pages come from
+* a single zone. It could be fixed but not yet done.
 */
if (page_zone(pfn_to_page(pfn)) != zone)
goto err;
@@ -145,6 +145,28 @@ err:
 static int __init cma_init_reserved_areas(void)
 {
int i;
+   struct zone *zone;
+   unsigned long start_pfn = UINT_MAX, end_pfn = 0;
+
+   if (!cma_area_count)
+   return 0;
+
+   for (i = 0; i < cma_area_count; i++) {
+   if (start_pfn > cma_areas[i].base_pfn)
+   start_pfn = cma_areas[i].base_pfn;
+   if (end_pfn < cma_areas[i].base_pfn + cma_areas[i].count)
+   end_pfn = cma_areas[i].base_pfn + cma_areas[i].count;
+   }
+
+   for_each_zone(zone) {
+   if (!is_zone_cma(zone))
+   continue;
+
+   /* ZONE_CMA doesn't need to exceed CMA region */
+   zone->zone_start_pfn = max(zone->zone_start_pfn, start_pfn);
+   zone->spanned_pages = min(zone_end_pfn(zone), end_pfn) -
+   zone->zone_start_pfn;
+   }
 
for (i = 0; i < cma_area_count; i++) {
int ret = cma_activate_area(&cma_areas[i]);
@@ 

[PATCH v5 4/6] mm/cma: remove ALLOC_CMA

2016-08-28 Thread js1304
From: Joonsoo Kim 

Now all reserved pages for the CMA region belong to ZONE_CMA, and it
serves only GFP_HIGHUSER_MOVABLE requests. Therefore, we don't need to
consider ALLOC_CMA at all.

Acked-by: Vlastimil Babka 
Signed-off-by: Joonsoo Kim 
---
 mm/compaction.c |  4 +---
 mm/internal.h   |  1 -
 mm/page_alloc.c | 28 +++-------------------------
 3 files changed, 4 insertions(+), 29 deletions(-)
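
As a rough illustration of what the removal means for the order-0
watermark check, here is a userspace sketch of the comparison before and
after. The struct model_zone and both function names are hypothetical, and
the counters are plain numbers rather than real zone statistics.

```c
#include <stdbool.h>

#define MODEL_ALLOC_CMA 0x80u  /* the flag this patch removes */

/* Hypothetical snapshot of one zone's counters, in pages. */
struct model_zone {
    long free_pages;      /* all free pages, including free CMA pages */
    long free_cma_pages;  /* free pages in CMA pageblocks */
    long lowmem_reserve;
};

/* Old order-0 check: hide free CMA pages from callers without ALLOC_CMA. */
static bool model_watermark_old(const struct model_zone *z, long mark,
                                unsigned int alloc_flags)
{
    long usable = z->free_pages;

    if (!(alloc_flags & MODEL_ALLOC_CMA))
        usable -= z->free_cma_pages;
    return usable > mark + z->lowmem_reserve;
}

/* New order-0 check: ordinary zones hold no CMA pages, so nothing to hide. */
static bool model_watermark_new(const struct model_zone *z, long mark)
{
    return z->free_pages > mark + z->lowmem_reserve;
}
```

When a zone has no free CMA pages, which is the case for ordinary zones
once the reserved pages live in ZONE_CMA, both checks give the same answer.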

diff --git a/mm/compaction.c b/mm/compaction.c
index 29f6c49..4532905 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1401,14 +1401,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 * if compaction succeeds.
 * For costly orders, we require low watermark instead of min for
 * compaction to proceed to increase its chances.
-* ALLOC_CMA is used, as pages in CMA pageblocks are considered
-* suitable migration targets
 */
watermark = (order > PAGE_ALLOC_COSTLY_ORDER) ?
low_wmark_pages(zone) : min_wmark_pages(zone);
watermark += compact_gap(order);
if (!__zone_watermark_ok(zone, 0, watermark, classzone_idx,
-   ALLOC_CMA, wmark_target))
+   0, wmark_target))
return COMPACT_SKIPPED;
 
/*
diff --git a/mm/internal.h b/mm/internal.h
index 3d3f052..01d06bb 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -466,7 +466,6 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 #define ALLOC_HARDER   0x10 /* try to alloc harder */
 #define ALLOC_HIGH 0x20 /* __GFP_HIGH set */
 #define ALLOC_CPUSET   0x40 /* check for correct cpuset */
-#define ALLOC_CMA  0x80 /* allow allocations from CMA areas */
 
 enum ttu_flags;
 struct tlbflush_unmap_batch;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 91fb172..16ba1fe 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2565,7 +2565,7 @@ int __isolate_free_page(struct page *page, unsigned int order)
 * exists.
 */
watermark = min_wmark_pages(zone) + (1UL << order);
-   if (!zone_watermark_ok(zone, 0, watermark, 0, ALLOC_CMA))
+   if (!zone_watermark_ok(zone, 0, watermark, 0, 0))
return 0;
 
__mod_zone_freepage_state(zone, -(1UL << order), mt);
@@ -2808,12 +2808,6 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
else
min -= min / 4;
 
-#ifdef CONFIG_CMA
-   /* If allocation can't use CMA areas don't use free CMA pages */
-   if (!(alloc_flags & ALLOC_CMA))
-   free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
-#endif
-
/*
 * Check watermarks for an order-0 allocation request. If these
 * are not met, then a high-order request also cannot go ahead
@@ -2843,10 +2837,8 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
}
 
 #ifdef CONFIG_CMA
-   if ((alloc_flags & ALLOC_CMA) &&
-   !list_empty(&area->free_list[MIGRATE_CMA])) {
+   if (!list_empty(&area->free_list[MIGRATE_CMA]))
return true;
-   }
 #endif
}
return false;
@@ -2863,13 +2855,6 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
unsigned long mark, int classzone_idx, unsigned int alloc_flags)
 {
long free_pages = zone_page_state(z, NR_FREE_PAGES);
-   long cma_pages = 0;
-
-#ifdef CONFIG_CMA
-   /* If allocation can't use CMA areas don't use free CMA pages */
-   if (!(alloc_flags & ALLOC_CMA))
-   cma_pages = zone_page_state(z, NR_FREE_CMA_PAGES);
-#endif
 
/*
 * Fast check for order-0 only. If this fails then the reserves
@@ -2878,7 +2863,7 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
 * the caller is !atomic then it'll uselessly search the free
 * list. That corner case is then slower but it is harmless.
 */
-   if (!order && (free_pages - cma_pages) > mark + z->lowmem_reserve[classzone_idx])
+   if (!order && free_pages > mark + z->lowmem_reserve[classzone_idx])
return true;
 
return __zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
@@ -3355,10 +3340,6 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
} else if (unlikely(rt_task(current)) && !in_interrupt())
alloc_flags |= ALLOC_HARDER;
 
-#ifdef CONFIG_CMA
-   if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
-   alloc_flags |= ALLOC_CMA;
-#endif
return alloc_flags;
 }
 
@@ -3727,9 +3708,6 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
if (unlikely(!zonelist->_zonerefs->zone))
return NULL;
 
-   if (IS_ENABLED(CONFIG_CMA) && ac.migratetype 

Re: [PATCH][v8] PM / hibernate: Verify the consistent of e820 memory map by md5 value

2016-08-28 Thread Borislav Petkov
On Mon, Aug 29, 2016 at 12:35:40AM +0800, Chen Yu wrote:
> On some platforms, there is occasional panic triggered when trying to
> resume from hibernation, a typical panic looks like:
> 
> "BUG: unable to handle kernel paging request at 880085894000
> IP: [] load_image_lzo+0x8c2/0xe70"
> 
> This is because e820 map has been changed by BIOS across
> hibernation, and one of the page frames from first kernel
> is right located in second kernel's unmapped region, so panic
> comes out when accessing unmapped kernel address.
> 
> In order to expose this issue earlier, the md5 hash of e820 map
> is passed from suspend kernel to resume kernel, and the system will
> trigger panic once it finds the md5 value of previous kernel is not
> the same as current resume kernel.

... so basically now even the cases where it managed to resume would
panic because the digests differ, even if the original panic condition
doesn't trigger the bug, i.e. your Note 1 below.

The more important question IMHO would be, can we resume our system
successfully *even* if BIOS fiddled with the e820 map?

We'd still warn the hell out of it and even make the md5 digest
comparison a default-enabled thing without even having a config option
to disable it but can we try harder not to panic and deal with this next
BIOS f*ckup more intelligently than throwing our hands in the air and
giving up?
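
A rough userspace sketch of the kind of policy being asked about, as
opposed to an unconditional panic on a digest mismatch. Everything here is
hypothetical: the enum, the function name and the tolerate_mismatch flag
are made up for illustration and do not come from the patch, and whether a
mismatch can ever be tolerated safely is exactly the open question.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_DIGEST_SIZE 16  /* an MD5 digest is 16 bytes */

/* Hypothetical outcomes; none of these names come from the patch. */
enum model_resume_action {
    MODEL_RESUME_OK,      /* digests match, resume as usual */
    MODEL_RESUME_WARN,    /* map changed: warn loudly, still try to resume */
    MODEL_RESUME_REFUSE   /* map changed: give up on the image */
};

/*
 * Compare the digest saved in the hibernation image with the digest of
 * the memory map seen by the resuming kernel. 'tolerate_mismatch' stands
 * in for whatever check would decide that resuming is still safe.
 */
static enum model_resume_action
model_check_e820_digest(const unsigned char saved[MODEL_DIGEST_SIZE],
                        const unsigned char now[MODEL_DIGEST_SIZE],
                        bool tolerate_mismatch)
{
    if (memcmp(saved, now, MODEL_DIGEST_SIZE) == 0)
        return MODEL_RESUME_OK;
    return tolerate_mismatch ? MODEL_RESUME_WARN : MODEL_RESUME_REFUSE;
}
```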

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PART2 PATCH v7 00/12] iommu/AMD: Introduce IOMMU AVIC support

2016-08-28 Thread Suravee Suthikulpanit

Hi Joerg, Radim

Any other concerns?

Thanks,
Suravee

On 8/24/16 01:52, Suravee Suthikulpanit wrote:

From: Suravee Suthikulpanit 

CHANGES FROM V6
===

Per Radim:
* No longer expose struct amd_ir_data to SVM.
* Introduce struct amd_svm_iommu_ir (amd_ir_data wrapper).
* Fix logic to manage ir_list where we need to remove
  the posted interrupt from the previous ir_list before
  mapping it to a new vcpu. Tested running smp VM with:
  -  Using irqbalance
  -  No irqbalance (manually set /proc/irq/smp_affinity)

Misc:
* 08/12: Only set ga_root_ptr in amd_ir_set_vcpu_affinity().
* 10/12: Fix bug in #define AVIC_GATAG_TO_VCPUID.

GITHUB
==
Latest git tree can be found at:
http://github.com/ssuthiku/linux.git avic_part2_v7

OVERVIEW

This patch set is the second part of the two-part patch series to introduce
the new AMD Advanced Virtual Interrupt Controller (AVIC) support.

In addition to the SVM AVIC, AMD IOMMU also extends the AVIC capability
to allow I/O interrupt injection directly into the virtualized guest
local APIC without the need for hypervisor intervention.

This patch series introduces a new hardware interrupt remapping (IR) mode
in AMD IOMMU driver, the Guest Virtual APIC (GA) mode. This is in contrast
to the existing "legacy" mode. The IR mode can be specified with a new
kernel parameter:

amd_iommu_guest_ir=[vapic (default) | legacy]

When GA mode is enabled, the AMD IOMMU driver will configure device interrupt
remapping in GA mode when possible (i.e. SVM AVIC must be enabled, and
the interrupt types must be supported). Otherwise, the driver will fall back
to using the legacy IR mode.
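
The selection and fallback rule above can be sketched in userspace C as
follows. This is only an illustration under stated assumptions: the enum
and both function names are hypothetical, and treating any value other
than "legacy" as the vapic default is a simplification of this sketch, not
a statement about how the driver parses the parameter.

```c
#include <stdbool.h>
#include <string.h>

enum model_ir_mode { MODEL_IR_LEGACY, MODEL_IR_VAPIC };

/* Parse the parameter value; vapic is the default when it is absent. */
static enum model_ir_mode model_parse_guest_ir(const char *arg)
{
    if (arg && strcmp(arg, "legacy") == 0)
        return MODEL_IR_LEGACY;
    return MODEL_IR_VAPIC;
}

/* GA mode is used only when requested and when both conditions hold. */
static enum model_ir_mode model_effective_mode(enum model_ir_mode requested,
                                               bool svm_avic_enabled,
                                               bool irq_type_supported)
{
    if (requested == MODEL_IR_VAPIC && svm_avic_enabled && irq_type_supported)
        return MODEL_IR_VAPIC;
    return MODEL_IR_LEGACY;
}
```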

This patch series also introduces new interfaces between SVM and IOMMU
to allow:
  * SVM driver to communicate to IOMMU with updated vcpu scheduling
    information.
  * IOMMU driver to notify SVM driver to schedule a vcpu onto a physical
    core when handling an IOMMU GALog entry.

DOCUMENTATION
==
More information about SVM AVIC can be found in the
AMD64 Architecture Programmer’s Manual Volume 2 - System Programming.

http://support.amd.com/TechDocs/24593.pdf

More information about IOMMU AVIC can be found in the
AMD I/O Virtualization Technology (IOMMU) Specification - Rev 2.62.

http://support.amd.com/TechDocs/48882_IOMMU.pdf

Any feedback and comments are very much appreciated.

Thank you,
Suravee

Suravee Suthikulpanit (12):
  iommu/amd: Detect and enable guest vAPIC support
  iommu/amd: Move and introduce new IRTE-related unions and structures
  iommu/amd: Introduce interrupt remapping ops structure
  iommu/amd: Add support for multiple IRTE formats
  iommu/amd: Detect and initialize guest vAPIC log
  iommu/amd: Adding GALOG interrupt handler
  iommu/amd: Introduce amd_iommu_update_ga()
  iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode
for pass-through devices
  iommu/amd: Enable vAPIC interrupt remapping mode by default
  svm: Introduces AVIC per-VM ID
  svm: Introduce AMD IOMMU avic_ga_log_notifier
  svm: Implements update_pi_irte hook to setup posted interrupt

 Documentation/kernel-parameters.txt |   9 +
 arch/x86/include/asm/kvm_host.h |   2 +
 arch/x86/kvm/svm.c  | 406 --
 drivers/iommu/amd_iommu.c   | 484 +++-
 drivers/iommu/amd_iommu_init.c  | 181 +-
 drivers/iommu/amd_iommu_proto.h |   1 +
 drivers/iommu/amd_iommu_types.h | 149 +++
 include/linux/amd-iommu.h   |  43 +++-
 8 files changed, 1188 insertions(+), 87 deletions(-)



[GIT PULL] Please pull powerpc/linux.git powerpc-4.8-4 tag

2016-08-28 Thread Benjamin Herrenschmidt
Hi Linus !

So my apologies for being a lousy replacement maintainer while Michael
is on vacation ... this was meant to be sent early last week, but I
had a change pending on one of the fixes and other things made me forget
all about it. Ugh.

This is my first signed-tag and use of 2fa so I hope I got it all right...
I tried to use the same format Michael uses for the tag etc...

We have some misc fixes for powerpc 4.8. Some trivial bits and some
regressions, and a trivial cleanup or two that I saw no point in letting
rot in patchwork.

Cheers,
Ben.

The following changes since commit fa8410b355251fd30341662a40ac6b22d3e38468:

  Linux 4.8-rc3 (2016-08-21 16:14:10 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-4.8-4

for you to fetch changes up to 78a3e8889b4b6b99775ed954696ff3e017f5d19b:

  powerpc: signals: Discard transaction state from signal frames (2016-08-29 12:48:40 +1000)



Andrew Donnellan (1):
  cxl: use pcibios_free_controller_deferred() when removing vPHBs

Andrzej Hajda (1):
  powerpc/powernv/pci: fix iterator signedness

Boqun Feng (1):
  powerpc, hotplug: Avoid to touch non-existent cpumasks.

Christophe Leroy (1):
  powerpc: sysdev: cpm: fix gpio save_regs functions

Cyril Bur (1):
  powerpc: signals: Discard transaction state from signal frames

Guenter Roeck (1):
  powerpc: cputhreads: Add missing include file

Markus Elfring (3):
  drivers/macintosh: Delete owner assignment
  powerpc/512x: Delete unnecessary assignment for the field "owner"
  powerpc: mpc8349emitx: Delete unnecessary assignment for the field "owner"

Mauricio Faria de Oliveira (1):
  powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)

Michael Ellerman (1):
  powerpc/prom: Fix sub-processor option passed to ibm,client-architecture-support

Mukesh Ojha (1):
  powerpc/powernv : Drop reference added by kset_find_obj()

Nicholas Piggin (3):
  powerpc/pseries: PACA save area fix for general exception vs MCE
  powerpc/pseries: PACA save area fix for MCE vs MCE
  powerpc/tm: do not use r13 for tabort_syscall

Paolo Bonzini (1):
  powerpc: move hmi.c to arch/powerpc/kvm/

Paul Gortmaker (1):
  powerpc: migrate exception table users off module.h and onto extable.h

 Documentation/powerpc/transactional_memory.txt |  2 ++
 arch/powerpc/include/asm/cputhreads.h  |  1 +
 arch/powerpc/include/asm/hmi.h |  2 +-
 arch/powerpc/include/asm/paca.h| 12 +---
 arch/powerpc/include/asm/pci-bridge.h  |  1 +
 arch/powerpc/kernel/Makefile   |  2 +-
 arch/powerpc/kernel/entry_64.S | 12 
 arch/powerpc/kernel/exceptions-64s.S   | 29 ++---
 arch/powerpc/kernel/kprobes.c  |  2 +-
 arch/powerpc/kernel/pci-common.c   | 36 ++
 arch/powerpc/kernel/prom_init.c|  9 --
 arch/powerpc/kernel/signal_32.c| 14 +
 arch/powerpc/kernel/signal_64.c| 14 +
 arch/powerpc/kernel/smp.c  |  2 +-
 arch/powerpc/kernel/traps.c|  3 +-
 arch/powerpc/kvm/Makefile  |  1 +
 arch/powerpc/{kernel/hmi.c => kvm/book3s_hv_hmi.c} |  0
 arch/powerpc/mm/fault.c|  2 +-
 arch/powerpc/platforms/512x/mpc512x_lpbfifo.c  |  1 -
 arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c |  1 -
 arch/powerpc/platforms/embedded6xx/holly.c |  2 +-
 arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c  |  2 +-
 arch/powerpc/platforms/powernv/opal-dump.c |  7 -
 arch/powerpc/platforms/powernv/opal-elog.c |  7 -
 arch/powerpc/platforms/powernv/pci-ioda.c  |  2 +-
 arch/powerpc/platforms/pseries/pci.c   |  4 +++
 arch/powerpc/platforms/pseries/pci_dlpar.c |  7 +++--
 arch/powerpc/sysdev/cpm1.c |  6 ++--
 arch/powerpc/sysdev/cpm_common.c   |  3 +-
 arch/powerpc/sysdev/fsl_rio.c  |  2 +-
 drivers/macintosh/ams/ams-i2c.c|  1 -
 drivers/macintosh/windfarm_pm112.c |  1 -
 drivers/macintosh/windfarm_pm72.c  |  1 -
 drivers/macintosh/windfarm_rm31.c  |  1 -
 drivers/misc/cxl/vphb.c| 10 +-
 drivers/pci/host-bridge.c  |  1 +
 36 files changed, 160 insertions(+), 43 deletions(-)
 rename arch/powerpc/{kernel/hmi.c => kvm/book3s_hv_hmi.c} (100%)


Build error in timer-atmel-pit.c

2016-08-28 Thread Brent Taylor
Daniel,
   After updating to linux-4.8-rc4, I got the following build error:

linux-x.yy/drivers/clocksource/timer-atmel-pit.c: In function
'at91sam926x_pit_dt_init':
linux-x.yy/drivers/clocksource/timer-atmel-pit.c:264:2: error: 'ret'
undeclared (first use in this function)
  ret = clk_prepare_enable(data->mck);
  ^~~
linux-x.yy/drivers/clocksource/timer-atmel-pit.c:264:2: note: each
undeclared identifier is reported only once for each function it
appears in

This was introduced in commit: 699e36e5b8e9f77b2be4c23f0b309e53be4b2880

Regards,
Brent Taylor
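
For illustration only, here is a minimal user-space sketch of the error class
reported above. The struct definitions and the clk_prepare_enable() stub are
stand-ins, not the kernel's, and it assumes the only thing missing is the
declaration of 'ret':

```c
/* Stand-ins for the kernel types and helper; hypothetical, for illustration only. */
struct clk { int fail; };
struct pit_data { struct clk *mck; };

/* Stub with the same name as the kernel helper; returns 0 on success. */
static int clk_prepare_enable(struct clk *clk)
{
	return clk->fail ? -1 : 0;
}

static int pit_init_sketch(struct pit_data *data)
{
	int ret; /* without this line the compiler reports 'ret' undeclared */

	ret = clk_prepare_enable(data->mck);
	if (ret)
		return ret; /* propagate the failure to the caller */

	return 0;
}
```

Whether the real function can return an error code at that point is an
assumption of this sketch.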


Re: imx-drm: Possible regression after update to atomic

2016-08-28 Thread Ying Liu
Hi Thorsten,

On Sun, Aug 28, 2016 at 6:17 PM, Thorsten Leemhuis
 wrote:
> Lo! Dave, below report made it to the list of regression for 4.8, but
> afaics nothing happened after the initial report. Was it discussed (and
> maybe even fixed?) elsewhere? Or is there some reason why it shouldn't
> be on the list of regressions at all?

We've got a patch set[1] to fix this.

[1] http://www.spinics.net/lists/dri-devel/msg116491.html

Regards,
Liu Ying

>
> Ciao, Thorsten
>
> On 13.08.2016 14:37, Peter Senna Tschudin wrote:
>>
>> d7868cb7ac58640e9c0383205ba31bd6a985cc6f is the last commit that works for 
>> me. I'm experiencing black screen after Weston starts in two different i.MX 
>> based devices:
>>
>>  - i.MX6 -> arch/arm/boot/dts/imx6q-b850v3.dts
>>  - i.MX53 based device
>>
>> Weston starts, but nothing is shown on screen. fb works fine. Disabling fb 
>> on Kconfig or simply commenting out drm_fbdev_cma_init() solves the black 
>> screen issue.
>>
>> The tests that are causing the black screen:
>>
>> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c 
>> b/drivers/gpu/drm/imx/ipuv3-plane.c
>> index 4ad67d0..52dc1b7 100644
>> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
>> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
>> @@ -325,7 +325,7 @@ static int ipu_plane_atomic_check(struct drm_plane 
>> *plane,
>> if (old_fb && (state->src_w != old_state->src_w ||
>>   state->src_h != old_state->src_h ||
>>   fb->pixel_format != old_fb->pixel_format))
>> -   return -EINVAL;
>>
>> eba = drm_plane_state_to_eba(state);
>>
>> @@ -336,7 +336,7 @@ static int ipu_plane_atomic_check(struct drm_plane 
>> *plane,
>> return -EINVAL;
>>
>> if (old_fb && fb->pitches[0] != old_fb->pitches[0])
>> -   return -EINVAL;
>>
>> switch (fb->pixel_format) {
>> case DRM_FORMAT_YUV420:
>> @@ -372,7 +372,7 @@ static int ipu_plane_atomic_check(struct drm_plane 
>> *plane,
>> return -EINVAL;
>>
>> if (old_fb && old_fb->pitches[1] != fb->pitches[1])
>> -   return -EINVAL;
>> }
>>
>> I tried to replace the return -EINVAL by crtc_state->mode_changed = true 
>> with no positive results.
>>
>> I'm trying to understand what is the difference with and without fb, but I 
>> have no conclusions yet.
>>
>> Hints on what could be the cause here?
>>
>> Thank you,
>>
>> Peter
>>
>> P.S. This is what I get after replacing the return -EINVAL(the mode is 
>> correct): https://goo.gl/photos/1eRdcco9GpszgvzM8


Re: [PATCH 5/5] net/xgene: fix error handling during reset

2016-08-28 Thread David Miller
From: Arnd Bergmann 
Date: Fri, 26 Aug 2016 17:25:46 +0200

> The newly added reset logic uses helper functions for the MMIO that
> may fail. However, when the read operation fails, we end up writing
> back uninitialized data to the register, as gcc warns:
> 
> drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c: In function 
> 'xgene_enet_link_state':
> drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c:213:2: error: 'data' may be 
> used uninitialized in this function [-Werror=maybe-uninitialized]
> drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c:209:6: note: 'data' was 
> declared here
>   u32 data;
> 
> We already print a warning to the console log if that happens,
> the best alternative that I can see is skip the rest of the reset
> sequence if the register value cannot be read: Most likely the
> write would fail as well, and if it succeeded, worse things could
> happen.
> 
> Signed-off-by: Arnd Bergmann 
> Fixes: 3eb7cb9dc946 ("drivers: net: xgene: XFI PCS reset when link is down")

Applied.
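
The pattern described in the quoted commit message, bailing out of a
read-modify-write sequence when the read helper fails, can be sketched in
plain C. The device model and all function names below are hypothetical and
are not the xgene driver's code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register file; 'broken' makes every read fail. */
struct fake_dev {
	uint32_t reg;
	bool broken;
	int writes;
};

/* Read helper that can fail; *val is written only on success. */
static bool fake_read(struct fake_dev *dev, uint32_t *val)
{
	if (dev->broken)
		return false;
	*val = dev->reg;
	return true;
}

static void fake_write(struct fake_dev *dev, uint32_t val)
{
	dev->reg = val;
	dev->writes++;
}

/* Pulse a reset bit, but never write back data that was not read. */
static bool reset_sketch(struct fake_dev *dev, uint32_t reset_bit)
{
	uint32_t data;

	if (!fake_read(dev, &data))
		return false; /* skip the rest of the sequence */

	fake_write(dev, data | reset_bit);
	fake_write(dev, data & ~reset_bit);
	return true;
}
```

Because every use of 'data' sits behind the successful read, the compiler has
no path on which it could be uninitialized.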


Re: [PATCH 4/5] net_sched: fix use of uninitialized ethertype variable in cls_flower

2016-08-28 Thread David Miller
From: Arnd Bergmann 
Date: Fri, 26 Aug 2016 17:25:45 +0200

> The addition of VLAN support caused a possible use of uninitialized
> data if we encounter a zero TCA_FLOWER_KEY_ETH_TYPE key, as pointed
> out by "gcc -Wmaybe-uninitialized":
> 
> net/sched/cls_flower.c: In function 'fl_change':
> net/sched/cls_flower.c:366:22: error: 'ethertype' may be used uninitialized 
> in this function [-Werror=maybe-uninitialized]
> 
> This changes the code to only set the ethertype field if it
> was nonzero, as before the patch.
> 
> Signed-off-by: Arnd Bergmann 
> Fixes: 9399ae9a6cb2 ("net_sched: flower: Add vlan support")

Applied.


Re: [PATCH v3 1/2] input: misc: Add generic input driver to read encoded GPIO lines

2016-08-28 Thread Vignesh R


On Thursday 25 August 2016 10:26 PM, Dmitry Torokhov wrote:
> On Wed, Aug 24, 2016 at 01:28:58PM +0530, Vignesh R wrote:
>> Add a driver to read group of GPIO lines and provide its status as a
>> numerical value as input event to the system. This will help in
>> interfacing devices, that can be connected over GPIOs, that provide
>> input to the system by driving GPIO lines connected to them like a
>> rotary dial or a switch.
>>
>> For example, a rotary switch can be connected to four GPIO lines. The
>> status of the GPIO lines reflect the actual position of the rotary
>> switch dial. For example, if dial points to 9, then the four GPIO lines
>> connected to the switch will read HLLH(0b'1001 = 9). This value
>> can be reported as an ABS_* event to the input subsystem.
>>
>> Signed-off-by: Vignesh R 
>> Acked-by: Rob Herring 
>> ---
>>
>> v3: Fix comments by Andrew and Dmitry
>> Link to v2: https://lkml.org/lkml/2016/8/23/79
>>
>>  .../devicetree/bindings/input/gpio-decoder.txt |  23 
>>  drivers/input/misc/Kconfig |  12 ++
>>  drivers/input/misc/Makefile|   1 +
>>  drivers/input/misc/gpio_decoder.c  | 134 
>> +
>>  4 files changed, 170 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt
>>  create mode 100644 drivers/input/misc/gpio_decoder.c
>>
>> diff --git a/Documentation/devicetree/bindings/input/gpio-decoder.txt 
>> b/Documentation/devicetree/bindings/input/gpio-decoder.txt
>> new file mode 100644
>> index ..14a77fb96cf0
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/input/gpio-decoder.txt
>> @@ -0,0 +1,23 @@
>> +* GPIO Decoder DT bindings
>> +
>> +Required Properties:
>> +- compatible: should be "gpio-decoder"
>> +- gpios: a spec of gpios (at least two) to be decoded to a number with
>> +  first entry representing the MSB.
>> +
>> +Optional Properties:
>> +- decoder-max-value: Maximum possible value that can be reported by
>> +  the gpios.
>> +- linux,axis: the input subsystem axis to map to (ABS_X/ABS_Y).
>> +  Defaults to 0 (ABS_X).
>> +
>> +Example:
>> +gpio-decoder0 {
>> +compatible = "gpio-decoder";
>> +gpios = < 3 GPIO_ACTIVE_HIGH>,
>> +< 2 GPIO_ACTIVE_HIGH>,
>> +< 1 GPIO_ACTIVE_HIGH>,
>> +< 0 GPIO_ACTIVE_HIGH>;
>> +linux,axis = <0>; /* ABS_X */
>> +decoder-max-value = <9>;
>> +};
>> diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig
>> index efb0ca871327..7cdb89397d18 100644
>> --- a/drivers/input/misc/Kconfig
>> +++ b/drivers/input/misc/Kconfig
>> @@ -292,6 +292,18 @@ config INPUT_GPIO_TILT_POLLED
>>To compile this driver as a module, choose M here: the
>>module will be called gpio_tilt_polled.
>>  
>> +config INPUT_GPIO_DECODER
>> +tristate "Polled GPIO Decoder Input driver"
>> +depends on GPIOLIB || COMPILE_TEST
>> +select INPUT_POLLDEV
>> +help
>> + Say Y here if you want driver to read status of multiple GPIO
>> + lines and report the encoded value as an absolute integer to
>> + input subsystem.
>> +
>> + To compile this driver as a module, choose M here: the module
>> + will be called gpio_decoder.
>> +
>>  config INPUT_IXP4XX_BEEPER
>>  tristate "IXP4XX Beeper support"
>>  depends on ARCH_IXP4XX
>> diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile
>> index 6a1e5e20fc1c..0b6d025f0487 100644
>> --- a/drivers/input/misc/Makefile
>> +++ b/drivers/input/misc/Makefile
>> @@ -35,6 +35,7 @@ obj-$(CONFIG_INPUT_DRV2667_HAPTICS)+= drv2667.o
>>  obj-$(CONFIG_INPUT_GP2A)+= gp2ap002a00f.o
>>  obj-$(CONFIG_INPUT_GPIO_BEEPER) += gpio-beeper.o
>>  obj-$(CONFIG_INPUT_GPIO_TILT_POLLED)+= gpio_tilt_polled.o
>> +obj-$(CONFIG_INPUT_GPIO_DECODER)+= gpio_decoder.o
>>  obj-$(CONFIG_INPUT_HISI_POWERKEY)   += hisi_powerkey.o
>>  obj-$(CONFIG_HP_SDC_RTC)+= hp_sdc_rtc.o
>>  obj-$(CONFIG_INPUT_IMS_PCU) += ims-pcu.o
>> diff --git a/drivers/input/misc/gpio_decoder.c 
>> b/drivers/input/misc/gpio_decoder.c
>> new file mode 100644
>> index ..1c2191d4b143
>> --- /dev/null
>> +++ b/drivers/input/misc/gpio_decoder.c
>> @@ -0,0 +1,134 @@
>> +/*
>> + * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License as
>> + * published by the Free Software Foundation version 2.
>> + *
>> + * This program is distributed "as is" WITHOUT ANY WARRANTY of any
>> + * kind, whether express or implied; without even the implied warranty
>> + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * A generic driver to read multiple gpio lines and translate the
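
The decoding described in the quoted commit message and binding (first gpio
entry is the MSB, so H L L H reads as 0b1001 = 9) can be sketched as follows.
This is a stand-alone illustration with a hypothetical helper name, not the
driver code from the patch:

```c
#include <stddef.h>

/*
 * Fold GPIO line states into one number, with the first entry as the MSB.
 * 'levels' holds 0 (low) or nonzero (high) per line; hypothetical helper.
 */
static unsigned int decode_lines(const int *levels, size_t count)
{
	unsigned int value = 0;
	size_t i;

	for (i = 0; i < count; i++)
		value = (value << 1) | (levels[i] ? 1u : 0u);

	return value;
}
```

Passing the four levels { 1, 0, 0, 1 } gives 9, the dial position used in the
example above.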

Re: [PATCH 3/3] scsi/ncr5380: Improve interrupt latency during PIO tranfers

2016-08-28 Thread Finn Thain

On Sun, 28 Aug 2016, Geert Uytterhoeven wrote:

> Hi Finn,
> 
> On Sat, Aug 27, 2016 at 4:30 AM, Finn Thain  
> wrote:
> > Large PIO transfers are broken up into chunks to try to avoid 
> > disabling local IRQs for long periods. But IRQs are still disabled for 
> > too long and this causes SCC FIFO overruns during serial port 
> > transfers. This patch fixes the problem by halving the PIO chunk size.
> >
> > Testing with mac_scsi shows that the extra NCR5380_main() loop 
> > iterations have negligible performance impact on SCSI transfers (about 
> > 1% slower). On a faster system (using the dmx3191d module) transfers 
> > showed no measurable change.
> >
> > Signed-off-by: Finn Thain 
> >
> > ---
> >  drivers/scsi/NCR5380.c |6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > Index: linux/drivers/scsi/NCR5380.c
> > ===
> > --- linux.orig/drivers/scsi/NCR5380.c   2016-08-27 12:29:57.0 +1000
> > +++ linux/drivers/scsi/NCR5380.c2016-08-27 12:29:58.0 +1000
> > @@ -1847,11 +1847,11 @@ static void NCR5380_information_transfer
> > /* XXX - need to source or 
> > sink data here, as appropriate */
> > }
> > } else {
> > -   /* Break up transfer into 3 ms 
> > chunks,
> > -* presuming 6 accesses per 
> > handshake.
> > +   /* Transfer a small chunk so that 
> > the
> > +* irq mode lock is not held too 
> > long.
> >  */
> > transfersize = min((unsigned 
> > long)cmd->SCp.this_residual,
> > -  
> > hostdata->accesses_per_ms / 2);
> > +  
> > hostdata->accesses_per_ms >> 2);
> 
> I think it's easier to read if you use "/ 4".

I think the factor, "1/4 byte milliseconds per access" is not very 
meaningful. The PIO transfersize can be understood as,

pio_bytes_until_scc_fifo_overflow = accesses_per_ms /
 (accesses_per_pio_byte / ms_until_fifo_overflow)

This loop seemed like a good place to avoid a DIV instruction (though I 
didn't try to confirm that) and so I used a bit shift to indicate that 
intention.

The shift amount was an empirical result that happened to work for the 
hardware I tested it on, at the baud rate I was using. Admittedly, if we 
want to avoid further tweaks to this then I'll have to do more testing and 
find a better approximation.
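
To make the arithmetic concrete, here is a small stand-alone sketch of the
chunk-size expression from the hunk. The helper name is hypothetical; only
the min() of the residual and accesses_per_ms >> 2 is taken from the patch:

```c
/* Hypothetical helper mirroring the expression in the quoted hunk. */
static unsigned long pio_chunk(unsigned long residual,
			       unsigned long accesses_per_ms)
{
	/* For unsigned values, >> 2 gives the same result as / 4. */
	unsigned long limit = accesses_per_ms >> 2;

	return residual < limit ? residual : limit;
}
```

For example, with accesses_per_ms = 1000 the chunk is capped at 250 bytes,
where the previous "/ 2" would have capped it at 500.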

-- 

> 
> Gr{oetje,eeting}s,
> 
> Geert
> 


Re: [RFC 1/1] drivers: i2c: omap: Add slave support

2016-08-28 Thread Matthijs van Duin
On 28 August 2016 at 07:35, Wolfram Sang  wrote:
> Well, I2C is simple, what could go wrong? :/

Actually I2C is elegant and *seems* simple, but in all its
asynchronicity there are actually a surprising number of fine details
you can trip over.  Maybe that's why so many i2c controllers suck: since
i2c looks simple enough manufacturers are easily tempted to roll their
own instead of licensing a good implementation.

Having said that, most of the inconsistency and obnoxiousness of the TI
I2C controller is not even excusable by that argument.  For example its
irq registers *look* like the usual set { rawstatus, status, en, dis }
that's their current standard ("Highlander") for peripherals. They do
not, however, *behave* like the standard set:
  1. status isn't always (rawstatus & enabled)
  2. status != 0 does not always imply the irq output is asserted
  3. some enable-bits also change the behaviour of rawstatus
All of these misbehaviours are unprecedented afaik.

Normally you'd also expect each irq (raw)status bit to either
  a. be an event, set by hw and can be cleared by software any time, or
  b. be a level status, unaffected by software attempts to set/clear.
Again the i2c controller decided this is far too little diversity.
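
The expected behaviour of such a register set can be sketched as a small software model. This is only an illustration of the generic { rawstatus, status, en, dis } convention described above, not of the TI register layout; all names are hypothetical, and the write-1-to-clear semantics of the status register are an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a well-behaved irq register set. */
struct irq_regs {
    uint32_t rawstatus; /* latched events, independent of the enable mask */
    uint32_t enabled;   /* enable mask, changed only through en/dis writes */
};

/* Hardware latches an event. */
static void irq_hw_event(struct irq_regs *r, uint32_t bits)
{
    assert(r != NULL);
    r->rawstatus |= bits;
}

/* Writing 1s to "en" sets enable bits, writing 1s to "dis" clears them. */
static void irq_write_en(struct irq_regs *r, uint32_t bits)
{
    r->enabled |= bits;
}

static void irq_write_dis(struct irq_regs *r, uint32_t bits)
{
    r->enabled &= ~bits;
}

/* Property 1: status is always (rawstatus & enabled). */
static uint32_t irq_read_status(const struct irq_regs *r)
{
    return r->rawstatus & r->enabled;
}

/* Property 2: the irq output is asserted exactly when status != 0. */
static bool irq_line_asserted(const struct irq_regs *r)
{
    return irq_read_status(r) != 0;
}

/* Assumed write-1-to-clear: software acknowledges event bits. */
static void irq_write_status(struct irq_regs *r, uint32_t bits)
{
    r->rawstatus &= ~bits;
}
```

In this model the enable mask never influences rawstatus, which is the third property the controller is said to violate.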

> So, it is possible to make a proper I2C slave with OMAP, but you need
> to know those 100 gory details?

Mostly.  There are some limitations such as:

* No ability to selectively ACK/NACK when addressed as slave. If you're
unable to respond for some time then you'd end up blocking the bus with
clock stretching.  You could temporarily deconfigure your slave address
but the TRM states changing slave address is forbidden while bus busy.

* According to my notes it always ACKs a General Call and this cannot
even be stalled using the SBLOCK register.  Since I don't care about GC
there's no more details in my notes, but if this is true then on any bus
where GC is used, irq handling will have real-time deadlines to avoid
losing track of transaction boundaries and misinterpreting data.

Finally, as my first link pointed out, various protocol errors can lock
up the peripheral's internal state machine.  When operating as slave
this is basically undetectable: all registers look normal and the
bus-busy bit will continue to track start/stop, but the peripheral will
not ACK any slave address anymore until you reset it.

You could argue "well, but that requires bus protocol errors" but it is
nevertheless a direct violation of the I2C standard:

I2C-bus compatible devices must reset their bus logic on receipt
of a START or repeated START condition such that they all
anticipate the sending of a slave address, even if these START
conditions are not positioned according to the proper format.

Also, my testing showed pulsing SDA low on an idle bus sufficed to
trigger this state.  It needs to pass the glitch filter of course, but
this filter is implemented by sampling the bus requiring two consecutive
samples to agree.  Two small glitches with just the right timing would
therefore suffice.  Rather unlikely for random noise, but having lots of
signals on your pcb that ultimately derive from the same clock source
probably makes the odds a lot more favorable.
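
A filter of the kind described above can be sketched as follows. This is a minimal model that assumes only what is stated here (the output follows the input once two consecutive samples agree); struct glitch_filter and glitch_filter_step() are hypothetical names, and the real hardware may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-sample agreement filter for one bus line. */
struct glitch_filter {
    bool prev_sample; /* raw level seen at the previous sampling point */
    bool output;      /* filtered level presented to the bus logic */
};

/* Feed one raw sample; returns the filtered level after this sample. */
static bool glitch_filter_step(struct glitch_filter *f, bool sample)
{
    assert(f != NULL);
    if (sample == f->prev_sample)
        f->output = sample; /* two consecutive samples agree: accept */
    f->prev_sample = sample;
    return f->output;
}
```

A single low sample on an idle (high) line is rejected, but two short glitches that happen to coincide with two consecutive sampling points look the same to the filter as a genuine low level.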

Matthijs


Re: [PATCH] cxgb4/cxgb4vf: fix spelling mistake "provissioned" -> "provisioned"

2016-08-28 Thread David Miller
From: Colin King 
Date: Sun, 28 Aug 2016 12:07:02 +0100

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in dev_warn message.
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCH] wan/fsl_ucc_hdlc: fix spelling mistake "prameter" -> "parameter"

2016-08-28 Thread David Miller
From: Colin King 
Date: Sun, 28 Aug 2016 11:40:41 +0100

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in dev_err message.
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCH] net: ucc_geth: fix spelling mistake "propperty" -> "property"

2016-08-28 Thread David Miller
From: Colin King 
Date: Sun, 28 Aug 2016 12:03:27 +0100

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in dev_warn message.
> 
> Signed-off-by: Colin Ian King 

Applied.


[PATCH] mount: dont execute propagate_umount() many times for same mounts

2016-08-28 Thread Andrei Vagin
In the worst case the current complexity of umount_tree() is O(n^3).
* Enumerate all mounts in a target tree (propagate_umount)
* Enumerate mounts to find where these changes have to
  be propagated (mark_umount_candidates)
* Enumerate mounts to find a required mount by parent and dentry
  (__lookup_mnt_last)

The worst case is when all mounts from the tree live in the same shared
group. In this case we have to enumerate all mounts on each step.

Here we can optimize the second step. We don't need to make it for
mounts which we already met when we do this step for previous mounts.
It reduces the complexity of umount_tree() to O(n^2).
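
The effect of skipping mounts that were already handled can be illustrated with a toy model. This is not kernel code: it assumes the worst case described above (all n mounts in one shared group, so propagating from any mount visits all n peers) and only counts visits; count_propagation_visits() and MAX_MOUNTS are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MOUNTS 1024

/*
 * Toy model of the second step: for each mount, walk every peer in the
 * shared group. With skip_marked set, a mount that an earlier walk has
 * already marked is not walked again.
 */
static size_t count_propagation_visits(size_t n, bool skip_marked)
{
    bool marked[MAX_MOUNTS] = { false };
    size_t visits = 0;

    assert(n <= MAX_MOUNTS);
    for (size_t i = 0; i < n; i++) {
        if (skip_marked && marked[i])
            continue;
        for (size_t j = 0; j < n; j++) {
            visits++;
            marked[j] = true;
        }
    }
    return visits;
}
```

Without the mark check the model performs n * n visits; with it, the first walk marks every peer and the remaining mounts are skipped, which removes one factor of n from this step.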

Here is a script to generate such a mount tree:
$ cat run.sh
mount -t tmpfs xxx /mnt
mount --make-shared /mnt
for i in `seq $1`; do
mount --bind /mnt `mktemp -d /mnt/test.XXXXXX`
done
time umount -l /mnt
$ for i in `seq 10 16`; do echo $i; unshare -Urm bash ./run.sh $i; done

Here are performance measurements with and without this patch:

mounts | after  | before (sec)
------------------------------
1024   | 0.024  | 0.084
2048   | 0.041  | 0.39
4096   | 0.059  | 3.198
8192   | 0.227  | 50.794
16384  | 1.015  | 810

This patch is a first step to fix CVE-2016-6213. The next step will be
to add ucount (user namespace limit) for mounts.

Signed-off-by: Andrei Vagin 
---
 fs/mount.h |  2 ++
 fs/namespace.c | 19 ---
 fs/pnode.c | 23 +--
 3 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/fs/mount.h b/fs/mount.h
index 14db05d..b5631bd 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -87,6 +87,8 @@ static inline int is_mounted(struct vfsmount *mnt)
 
 extern struct mount *__lookup_mnt(struct vfsmount *, struct dentry *);
 extern struct mount *__lookup_mnt_last(struct vfsmount *, struct dentry *);
+extern struct mount *__lookup_mnt_cont(struct mount *,
+   struct vfsmount *, struct dentry *);
 
 extern int __legitimize_mnt(struct vfsmount *, unsigned);
 extern bool legitimize_mnt(struct vfsmount *, unsigned);
diff --git a/fs/namespace.c b/fs/namespace.c
index 7bb2cda..924cea7 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -649,9 +649,7 @@ struct mount *__lookup_mnt_last(struct vfsmount *mnt, 
struct dentry *dentry)
goto out;
if (!(p->mnt.mnt_flags & MNT_UMOUNT))
res = p;
-   hlist_for_each_entry_continue(p, mnt_hash) {
-   if (&p->mnt_parent->mnt != mnt || p->mnt_mountpoint != dentry)
-   break;
+   for (; p != NULL; p = __lookup_mnt_cont(p, mnt, dentry)) {
if (!(p->mnt.mnt_flags & MNT_UMOUNT))
res = p;
}
@@ -659,6 +657,21 @@ out:
return res;
 }
 
+struct mount *__lookup_mnt_cont(struct mount *p,
+   struct vfsmount *mnt, struct dentry *dentry)
+{
+   struct hlist_node *node = p->mnt_hash.next;
+
+   if (!node)
+   return NULL;
+
+   p = hlist_entry(node, struct mount, mnt_hash);
+   if (&p->mnt_parent->mnt != mnt || p->mnt_mountpoint != dentry)
+   return NULL;
+
+   return p;
+}
+
 /*
  * lookup_mnt - Return the first child mount mounted at path
  *
diff --git a/fs/pnode.c b/fs/pnode.c
index 9989970..2242aad 100644
--- a/fs/pnode.c
+++ b/fs/pnode.c
@@ -399,10 +399,24 @@ static void mark_umount_candidates(struct mount *mnt)
 
BUG_ON(parent == mnt);
 
+   if (IS_MNT_MARKED(mnt))
+   return;
+
for (m = propagation_next(parent, parent); m;
m = propagation_next(m, parent)) {
-   struct mount *child = __lookup_mnt_last(&m->mnt,
+   struct mount *child = __lookup_mnt(&m->mnt,
mnt->mnt_mountpoint);
+
+   while (child && child->mnt.mnt_flags & MNT_UMOUNT) {
+   /*
+* Mark umounted mounts to not call
+* __propagate_umount for them again.
+*/
+   SET_MNT_MARK(child);
+   child = __lookup_mnt_cont(child, &m->mnt,
+   mnt->mnt_mountpoint);
+   }
+
if (child && (!IS_MNT_LOCKED(child) || IS_MNT_MARKED(m))) {
SET_MNT_MARK(child);
}
@@ -420,6 +434,9 @@ static void __propagate_umount(struct mount *mnt)
 
BUG_ON(parent == mnt);
 
+   if (IS_MNT_MARKED(mnt))
+   return;
+
for (m = propagation_next(parent, parent); m;
m = propagation_next(m, parent)) {
 
@@ -431,6 +448,8 @@ static void __propagate_umount(struct mount *mnt)
 */
if (!child || !IS_MNT_MARKED(child))
continue;
+   if (child->mnt.mnt_flags & MNT_UMOUNT)
+   continue;
CLEAR_MNT_MARK(child);
 

[PATCH 2/2] f2fs: add roll-forward recovery process for encrypted dentry

2016-08-28 Thread Shuoran Liu
Add roll-forward recovery process for encrypted dentry, so the first fsync
issued to an encrypted file does not need to write a checkpoint.

This improves the performance of the following test on thousands of small
files: open -> write -> fsync -> close
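
For illustration, the open -> write -> fsync -> close sequence can be sketched as a small userspace loop. This is an assumed reconstruction of such a test, not the one used for the measurement; write_small_files(), the file naming and the payload are hypothetical, and the files are removed again so the sketch leaves nothing behind.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Create `count` small files in `dir`, each with open -> write ->
 * fsync -> close. Returns how many files were written and synced.
 */
static int write_small_files(const char *dir, int count)
{
    static const char payload[] = "small file payload\n";
    char path[4096];
    int done = 0;

    assert(dir != NULL && count >= 0);
    for (int i = 0; i < count; i++) {
        snprintf(path, sizeof(path), "%s/fsync-test-%ld-%d",
                 dir, (long)getpid(), i);

        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd < 0)
            break;

        ssize_t n = write(fd, payload, sizeof(payload) - 1);
        int synced = n == (ssize_t)(sizeof(payload) - 1) && fsync(fd) == 0;

        close(fd);
        unlink(path); /* clean up; only the write path is of interest */
        if (!synced)
            break;
        done++;
    }
    return done;
}
```

Run against a directory on an encrypted f2fs mount with a count in the thousands and timed externally, a loop like this exercises the fsync path that the patch targets.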

Signed-off-by: Shuoran Liu 
---
 fs/f2fs/dir.c  | 75 ++
 fs/f2fs/f2fs.h |  4 +++
 fs/f2fs/file.c |  2 --
 fs/f2fs/recovery.c | 16 +---
 4 files changed, 58 insertions(+), 39 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index 9054aea..8eca6dd 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -212,31 +212,17 @@ static struct f2fs_dir_entry *find_in_level(struct inode 
*dir,
return de;
 }
 
-/*
- * Find an entry in the specified directory with the wanted name.
- * It returns the page where the entry was found (as a parameter - res_page),
- * and the entry itself. Page is returned mapped and unlocked.
- * Entry is guaranteed to be valid.
- */
-struct f2fs_dir_entry *f2fs_find_entry(struct inode *dir,
-   const struct qstr *child, struct page **res_page)
+struct f2fs_dir_entry *__f2fs_find_entry(struct inode *dir,
+   struct fscrypt_name *fname, struct page **res_page)
 {
unsigned long npages = dir_blocks(dir);
struct f2fs_dir_entry *de = NULL;
unsigned int max_depth;
unsigned int level;
-   struct fscrypt_name fname;
-   int err;
-
-   err = fscrypt_setup_filename(dir, child, 1, &fname);
-   if (err) {
-   *res_page = ERR_PTR(err);
-   return NULL;
-   }
 
if (f2fs_has_inline_dentry(dir)) {
*res_page = NULL;
-   de = find_in_inline_dir(dir, &fname, res_page);
+   de = find_in_inline_dir(dir, fname, res_page);
goto out;
}
 
@@ -256,11 +242,35 @@ struct f2fs_dir_entry *f2fs_find_entry(struct inode *dir,
 
for (level = 0; level < max_depth; level++) {
*res_page = NULL;
-   de = find_in_level(dir, level, &fname, res_page);
+   de = find_in_level(dir, level, fname, res_page);
if (de || IS_ERR(*res_page))
break;
}
 out:
+   return de;
+}
+
+/*
+ * Find an entry in the specified directory with the wanted name.
+ * It returns the page where the entry was found (as a parameter - res_page),
+ * and the entry itself. Page is returned mapped and unlocked.
+ * Entry is guaranteed to be valid.
+ */
+struct f2fs_dir_entry *f2fs_find_entry(struct inode *dir,
+   const struct qstr *child, struct page **res_page)
+{
+   struct f2fs_dir_entry *de = NULL;
+   struct fscrypt_name fname;
+   int err;
+
+   err = fscrypt_setup_filename(dir, child, 1, &fname);
+   if (err) {
+   *res_page = ERR_PTR(err);
+   return NULL;
+   }
+
+   de = __f2fs_find_entry(dir, &fname, res_page);
+
fscrypt_free_filename(&fname);
return de;
 }
@@ -599,6 +609,24 @@ fail:
return err;
 }
 
+int __f2fs_do_add_link(struct inode *dir, struct fscrypt_name *fname,
+   struct inode *inode, nid_t ino, umode_t mode)
+{
+   struct qstr new_name;
+   int err = -EAGAIN;
+
+   new_name.name = fname_name(fname);
+   new_name.len = fname_len(fname);
+
+   if (f2fs_has_inline_dentry(dir))
+   err = f2fs_add_inline_entry(dir, &new_name, inode, ino, mode);
+   if (err == -EAGAIN)
+   err = f2fs_add_regular_entry(dir, &new_name, inode, ino, mode);
+
+   f2fs_update_time(F2FS_I_SB(dir), REQ_TIME);
+   return err;
+}
+
 /*
  * Caller should grab and release a rwsem by calling f2fs_lock_op() and
  * f2fs_unlock_op().
@@ -607,24 +635,15 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
struct inode *inode, nid_t ino, umode_t mode)
 {
struct fscrypt_name fname;
-   struct qstr new_name;
int err;
 
err = fscrypt_setup_filename(dir, name, 0, &fname);
if (err)
return err;
 
-   new_name.name = fname_name(&fname);
-   new_name.len = fname_len(&fname);
-
-   err = -EAGAIN;
-   if (f2fs_has_inline_dentry(dir))
-   err = f2fs_add_inline_entry(dir, &new_name, inode, ino, mode);
-   if (err == -EAGAIN)
-   err = f2fs_add_regular_entry(dir, &new_name, inode, ino, mode);
+   err = __f2fs_do_add_link(dir, &fname, inode, ino, mode);
 
fscrypt_free_filename(&fname);
-   f2fs_update_time(F2FS_I_SB(dir), REQ_TIME);
return err;
 }
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 14f5fe2..78d7641 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1914,6 +1914,8 @@ struct page *init_inode_metadata(struct inode *, struct 
inode *,
 void update_parent_metadata(struct inode *, struct inode *, unsigned int);
 int room_for_filename(const void *, int, int);
 void f2fs_drop_nlink(struct inode *, struct inode *);

[PATCH 1/2] f2fs: set encryption name flag in add inline entry path

2016-08-28 Thread Shuoran Liu
This patch sets encryption name flag in the add inline entry path
if filename is encrypted.

Signed-off-by: Shuoran Liu 
---
 fs/f2fs/inline.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index ccea873..f9ce04a7 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -524,6 +524,8 @@ int f2fs_add_inline_entry(struct inode *dir, const struct 
qstr *name,
err = PTR_ERR(page);
goto fail;
}
+   if (f2fs_encrypted_inode(dir))
+   file_set_enc_name(inode);
}
 
f2fs_wait_on_page_writeback(ipage, NODE, true);
-- 
1.9.1



Re: [PATCH 2/2] arm64: dts: rockchip: add eMMC's power domain support for rk3399

2016-08-28 Thread Shawn Lin

On 2016/8/29 10:50, Elaine Zhang wrote:



On 08/27/2016 11:05 PM, Shawn Lin wrote:

On 2016/8/27 21:41, Ziyuan Xu wrote:

Control power domain for eMMC via genpd to reduce power consumption.

Signed-off-by: Elaine Zhang 
Signed-off-by: Ziyuan Xu 



It looks nice to me. But this should be merged after applying [0],
as your patch will break the bind/unbind test for sdhci-of-arasan on rk3399
without it. Moreover, Elaine should make sure that the upstreamed
rockchip power domain code does not power off the pd for emmc; *otherwise*, I
should update my patch to make sure we update clkmul every time we
go through suspend-to-resume.



Forgot to say:
If a pd is used, then although there is no explicit call to power off pd_emmc,
it will be powered off when the system goes through suspend-to-resume,
because the system calls
__device_suspend_noirq->pm_genpd_suspend_noirq->rockchip_pd_power_off.


Thanks for explaining this. I checked the code a bit and actually I
don't need to update clkmul since it was recorded, although it is still
reset to 0x10 when read from syscon. So we can now pick this patch
up without waiting for my sdhci-of-arasan update.

Reviewed-by: Shawn Lin 





And it's important to note:
If the pd has been powered off, some grf regs (those that belong to this pd)
will return to their default values.
So if the pd supports power off, these grf regs need to be saved and restored,
or reinitialized.
For example:
pd_emmc
aclk_emmc_grf

If the pd should stay always on and it has a wakeup function,
the device needs to call device_init_wakeup() to keep the pd on
while the system goes through suspend-to-resume.



[0]: https://patchwork.kernel.org/patch/9300971/


---

 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 32aebc8..71733d4 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -239,6 +239,7 @@
 #clock-cells = <0>;
 phys = <&emmc_phy>;
 phy-names = "phy_arasan";
+power-domains = <&power RK3399_PD_EMMC>;
 status = "disabled";
 };

@@ -611,6 +612,11 @@
 status = "disabled";
 };

+qos_emmc: qos@ffa58000 {
+compatible = "syscon";
+reg = <0x0 0xffa58000 0x0 0x20>;
+};
+
 qos_hdcp: qos@ffa90000 {
 compatible = "syscon";
 reg = <0x0 0xffa90000 0x0 0x20>;
@@ -739,6 +745,11 @@
 };

 /* These power domains are grouped by VD_LOGIC */
+pd_emmc@RK3399_PD_EMMC {
+reg = <RK3399_PD_EMMC>;
+clocks = <&cru ACLK_EMMC>;
+pm_qos = <&qos_emmc>;
+};
 pd_vio@RK3399_PD_VIO {
 reg = <RK3399_PD_VIO>;
 #address-cells = <1>;











--
Best Regards
Shawn Lin



Re: [PATCH 2/2] arm64: dts: rockchip: add eMMC's power domain support for rk3399

2016-08-28 Thread Shawn Lin

On 2016/8/29 10:50, Elaine Zhang wrote:



On 08/27/2016 11:05 PM, Shawn Lin wrote:

On 2016/8/27 21:41, Ziyuan Xu wrote:

Control power domain for eMMC via genpd to reduce power consumption.

Signed-off-by: Elaine Zhang 
Signed-off-by: Ziyuan Xu 



It looks nice to me. But this should be merged after applying that[0]
as your patch will break bind/unbind test for sdhci-of-arasan on rk3399
without it[0]. Moreover, Elaine should make sure that upstreamed
rockchip power domain stuff would not off pd for emmc, *otherwise*, I
should update my patch to make sure we update clkmul every time when
doing suspend 2 resume..



Forgot to say:
If use pd, Although there is no call to power odd the pd_emmc,
it will be power off when the system doing suspend 2 resume.
(Because the system call
__device_suspend_noirq->pm_genpd_suspend_noirq->rockchip_pd_power_off)


Thanks for explaining this. I checked the code a bit and actually I
don't need to updata clkmul since it was recorded, although it is still
reset to 0x10 reading from syscon. So for that, we can now pick it
up without waiting for my sdhci-of-arasan's update.

Reviewed-by: Shawn Lin 





And it's important to note:
If the pd has been power off, some grf regs will be back to the default
value.(which grf regs in this pd)
So if the pd support power off , this grf regs need to save and restore
or reinit.
For example:
pd_emmc
aclk_emmc_grf

If the pd is always on,and this pd have wakeup func.
The device need to add device_init_wakeup() to make the pd always on
when the system doing suspend 2 resume.



[0]: https://patchwork.kernel.org/patch/9300971/


---

 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 32aebc8..71733d4 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -239,6 +239,7 @@
 #clock-cells = <0>;
 phys = <_phy>;
 phy-names = "phy_arasan";
+power-domains = < RK3399_PD_EMMC>;
 status = "disabled";
 };

@@ -611,6 +612,11 @@
 status = "disabled";
 };

+qos_emmc: qos@ffa58000 {
+compatible = "syscon";
+reg = <0x0 0xffa58000 0x0 0x20>;
+};
+
 qos_hdcp: qos@ffa9 {
 compatible = "syscon";
 reg = <0x0 0xffa9 0x0 0x20>;
@@ -739,6 +745,11 @@
 };

 /* These power domains are grouped by VD_LOGIC */
+pd_emmc@RK3399_PD_EMMC {
+reg = <RK3399_PD_EMMC>;
+clocks = < ACLK_EMMC>;
+pm_qos = <&qos_emmc>;
+};
 pd_vio@RK3399_PD_VIO {
 reg = <RK3399_PD_VIO>;
 #address-cells = <1>;











--
Best Regards
Shawn Lin



RE: [PATCH] omapdrm: dss: drop unneeded of_node_put() on ref passed to of_get_next_parent()

2016-08-28 Thread Peter Chen
 
>Sent: Saturday, August 27, 2016 8:07 PM
>To: Tomi Valkeinen ; Tony Lindgren ;
>Sean Paul ; Peter Chen ;
>Andrey Utkin 
>Cc: David Airlie ; Peter Ujfalusi ; 
>Dave
>Airlie ; Rob Clark ; Dr. H. Nikolaus
>Schaller ; Andrew Bradford ;
>ker...@pyra-handheld.com; Discussions about the Letux Kernel ker...@openphoenux.org>; dri-de...@lists.freedesktop.org; lkml ker...@vger.kernel.org>; linux-o...@vger.kernel.org
>Subject: Re: [PATCH] omapdrm: dss: drop unneeded of_node_put() on ref passed to
>of_get_next_parent()
>
>> [8.842806] OF: ERROR: Bad of_node_put() on /encoder/ports/port@1/endpoint
>> [8.843014] [] (omapdss_of_find_source_for_first_ep [omapdss])
>
>I can confirm that reverting 2ab9f5879162 fixes this regression, tested on 
>omap5-
>uevm.
>

It was careless of me to introduce this regression. The revert patch is
already in linux-next. Sorry for the inconvenience.

https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=5a78ff7bf7e25191144b550961001bbf6c734da4


Peter


Re: [PATCH v7 11/14] arm64/numa: support HAVE_MEMORYLESS_NODES

2016-08-28 Thread Leizhen (ThunderTown)


On 2016/8/27 19:05, Leizhen (ThunderTown) wrote:
> 
> 
> On 2016/8/26 23:43, Will Deacon wrote:
>> On Wed, Aug 24, 2016 at 03:44:50PM +0800, Zhen Lei wrote:
>>> Some NUMA nodes may have no memory. For example:
>>> 1. cpu0 is on node0
>>> 2. cpu1 is on node1
>>> 3. device0 takes the same time to access memory on node0 and on node1.
>>>
>>> So we cannot simply assign device0 to node0 or node1, but we can
>>> define a node2 whose distances to node0 and node1 are the same.
>>>
>>> Signed-off-by: Zhen Lei 
>>> ---
>>>  arch/arm64/Kconfig  |  4 
>>>  arch/arm64/kernel/smp.c |  1 +
>>>  arch/arm64/mm/numa.c| 43 +--
>>>  3 files changed, 46 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index 2815af6..3a2b6ed 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -611,6 +611,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>>> def_bool y
>>> depends on NUMA
>>>
>>> +config HAVE_MEMORYLESS_NODES
>>> +   def_bool y
>>> +   depends on NUMA
>>> +
>>>  source kernel/Kconfig.preempt
>>>  source kernel/Kconfig.hz
>>>
>>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>>> index d93d433..4879085 100644
>>> --- a/arch/arm64/kernel/smp.c
>>> +++ b/arch/arm64/kernel/smp.c
>>> @@ -619,6 +619,7 @@ static void __init of_parse_and_init_cpus(void)
>>> }
>>>
>>> bootcpu_valid = true;
>>> +   early_map_cpu_to_node(0, of_node_to_nid(dn));
>>
>> This seems unrelated?
> I am about to leave work. Maybe I need to put it into patch 12.
> 
>>
>>> /*
>>>  * cpu_logical_map has already been
>>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>>> index 6853db7..114180f 100644
>>> --- a/arch/arm64/mm/numa.c
>>> +++ b/arch/arm64/mm/numa.c
>>> @@ -129,6 +129,14 @@ void __init early_map_cpu_to_node(unsigned int cpu, 
>>> int nid)
>>> nid = 0;
>>>
>>> cpu_to_node_map[cpu] = nid;
>>> +
>>> +   /*
>>> +* We should set the numa node of cpu0 as soon as possible, because it
>>> +* has already been set up online before. cpu_to_node(0) will soon be
>>> +* called.
>>> +*/
>>> +   if (!cpu)
>>> +   set_cpu_numa_node(cpu, nid);
>>
>> Likewise.
>>
>>>  }
>>>
>>>  #ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA
>>> @@ -211,6 +219,35 @@ int __init numa_add_memblk(int nid, u64 start, u64 end)
>>> return ret;
>>>  }
>>>
>>> +static u64 __init alloc_node_data_from_nearest_node(int nid, const size_t 
>>> size)
>>> +{
>>> +   int i, best_nid, distance;
>>> +   u64 pa;
>>> +   DECLARE_BITMAP(nodes_map, MAX_NUMNODES);
>>> +
>>> +   bitmap_zero(nodes_map, MAX_NUMNODES);
>>> +   bitmap_set(nodes_map, nid, 1);
>>> +
>>> +find_nearest_node:
>>> +   best_nid = NUMA_NO_NODE;
>>> +   distance = INT_MAX;
>>> +
>>> +   for_each_clear_bit(i, nodes_map, MAX_NUMNODES)
>>> +   if (numa_distance[nid][i] < distance) {
>>> +   best_nid = i;
>>> +   distance = numa_distance[nid][i];
>>> +   }
>>> +
>>> +   pa = memblock_alloc_nid(size, SMP_CACHE_BYTES, best_nid);
>>> +   if (!pa) {
>>> +   BUG_ON(best_nid == NUMA_NO_NODE);
>>> +   bitmap_set(nodes_map, best_nid, 1);
>>> +   goto find_nearest_node;
>>> +   }
>>> +
>>> +   return pa;
>>> +}
>>> +
>>>  /**
>>>   * Initialize NODE_DATA for a node on the local memory
>>>   */
>>> @@ -224,7 +261,9 @@ static void __init setup_node_data(int nid, u64 
>>> start_pfn, u64 end_pfn)
>>> pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n",
>>> nid, start_pfn << PAGE_SHIFT, (end_pfn << PAGE_SHIFT) - 1);
>>>
>>> -   nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
>>> +   nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
>>> +   if (!nd_pa)
>>> +   nd_pa = alloc_node_data_from_nearest_node(nid, nd_size);
>>
>> Why not add memblock_alloc_near_nid to the core code, and make it do
>> what you need there?
> I will think about it next week. But some architectures, such as x86 and
> IA64, have their own implementation.

Do you mean calling only alloc_node_data_from_nearest_node directly? OK,
that's fine. Thanks.

> 
>>
>> Will
>>
>> .
>>
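The nearest-node fallback discussed above can be sketched outside the kernel roughly as follows. This is only a userspace model under stated assumptions: the distance table, the per-node free memory and the names alloc_on_node() and alloc_from_nearest_node() are hypothetical stand-ins for numa_distance[][] and memblock_alloc_nid(), and the sketch returns NO_NODE where the patch uses BUG_ON().

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 4
#define NO_NODE   (-1)

/* Hypothetical distances: node 2 is memoryless and equidistant from 0 and 1. */
static const int node_distance[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 15, 40 },
	{ 20, 10, 15, 40 },
	{ 15, 15, 10, 40 },
	{ 40, 40, 40, 10 },
};

/* Hypothetical free memory per node in bytes; stands in for memblock. */
static size_t node_free_bytes[MAX_NODES] = { 0, 4096, 0, 8192 };

/* Stand-in allocator: succeeds only if the node still has enough memory. */
static bool alloc_on_node(int nid, size_t size)
{
	if (node_free_bytes[nid] < size)
		return false;
	node_free_bytes[nid] -= size;
	return true;
}

/*
 * Try the other nodes in order of increasing distance from nid, skipping
 * nid itself and every node that has already failed. Returns the node that
 * satisfied the request, or NO_NODE if no node could.
 */
static int alloc_from_nearest_node(int nid, size_t size)
{
	bool tried[MAX_NODES] = { false };

	tried[nid] = true;

	for (;;) {
		int best = NO_NODE;
		int best_distance = INT_MAX;

		for (int i = 0; i < MAX_NODES; i++) {
			if (!tried[i] && node_distance[nid][i] < best_distance) {
				best = i;
				best_distance = node_distance[nid][i];
			}
		}

		if (best == NO_NODE)
			return NO_NODE;	/* every candidate has been exhausted */
		if (alloc_on_node(best, size))
			return best;
		tried[best] = true;	/* exclude it, then look for the next nearest */
	}
}
```

With this table, a request from memoryless node 2 first tries node 0 (which has no free memory in the model) and then falls through to node 1. The same loop could live in a core helper, as Will suggests, rather than in arch code.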



Re: [PATCH v6 0/8] power: add power sequence library

2016-08-28 Thread Peter Chen
On Wed, Aug 24, 2016 at 04:53:35PM +0800, Peter Chen wrote:
> On Tue, Aug 23, 2016 at 04:02:48PM +0530, Vaibhav Hiremath wrote:
> > 
> > 
> > On Monday 15 August 2016 02:43 PM, Peter Chen wrote:
> > >Hi all,
> > >
> > >This is a follow-up for my last power sequence framework patch set [1].
> > >According to Rob Herring and Ulf Hansson's comments[2], I use a generic
> > >power sequence library for parsing the power sequence elements on DT,
> > >and implement generic power sequence on library. The host driver
> > >can allocate power sequence instance, and calls pwrseq APIs accordingly.
> > >
> > >In future, if there are special power sequence requirements, the special
> > >power sequence library can be created.
> > >
> > >This patch set is tested on i.mx6 sabresx evk using a dts change, I use
> > >two hot-plug devices to simulate this use case, the related binding
> > >change is updated at patch [1/6], The udoo board changes were tested
> > >using my last power sequence patch set.[3]
> > >
> > >Except for hard-wired MMC and USB devices, I find the USB ULPI PHY also
> > >need to power on itself before it can be found by ULPI bus.
> > >
> > >[1] http://www.spinics.net/lists/linux-usb/msg142755.html
> > >[2] http://www.spinics.net/lists/linux-usb/msg143106.html
> > >[3] http://www.spinics.net/lists/linux-usb/msg142815.html
> > (Please ignore my response on V2)
> > 
> > Sorry being so late in the discussion...
> > 
> > > If I am not missing anything, then I am afraid that the generic library
> > > implementation in this patch series is not going to meet many of the
> > > custom requirements for power on, off, etc.
> > > I know you mentioned adding another library when we come across such
> > > platforms, but should we not keep a provision (or easy hooks/path) to
> > > enable that?
> > 
> > Let me bring in the use case I am dealing with,
> > 
> > 
> > >              Host
> > >               |
> > >               V
> > >            USB port
> > >               |
> > >               V
> > >     USB HUB device (May need custom on/off seq)
> > >               |
> > >               V
> > >      ===================
> > >      |                 |
> > >      V                 V
> > >   Device-1          Device-2
> > >  (Needs special power   (Needs special power
> > >   on/off sequence.       on/off sequence.
> > >   Also may need custom   Also, may need custom
> > >   sequence for           sequence for
> > >   suspend/resume)        suspend/resume)
> > 
> > 
> > Note: Both Devices are connected to HUB via HSIC and may differ
> >   in terms of functionality, features they support.
> > 
> > In the above case, both Device-1 and Device-2, need separate
> > power on/off sequence. So generic library currently we have in this
> > patch series is not going to satisfy the need here.
> > 
> > I looked at all 6 revisions of this patch-series, went through the
> > review comments, and looked at MMC power sequence code;
> > what I can say here is, we need something similar to
> > MMC power sequence here, where every device can have its own
> > power sequence (if needed).
> > 
> > > I know Rob is not in favor of creating a platform device for
> > > this, and I understand his comment.
> > > Even without a platform device, we at least need a mechanism to
> > > connect each device back to its of_node and its respective
> > > driver/library functions. For example, the devices may support different
> > > boot modes, and the platform driver needs to make sure that
> > > the right sequence is followed for booting.
> > 
> > Peter, My apologies for taking you back again on this series.
> > I am OK, if you wish to address this in incremental addition,
> > but my point is, we know that the current generic way is not
> > enough for us, so I think we should try to fix it in initial phase only.
> > 
> 
> Rob, it seems the generic power sequence can't cover all cases.
> Without information from DT, we can't know which power sequence
> applies to which device.
> 

Vaibhav, do you agree that I create a list of pwrseq libraries, with each
library registered through postcore_initcall, and choose the pwrseq library
by compatible string first? If no library matches the compatible string,
just use the generic pwrseq library.
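To make the proposal concrete, the lookup could look roughly like the sketch below. It is a standalone model, not code from the patch set: struct pwrseq_lib, the table contents, the compatible strings and pwrseq_find() are all hypothetical, and a real implementation would build the list at postcore_initcall time instead of using a static table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one power sequence library. */
struct pwrseq_lib {
	const char *compatible;	/* NULL marks the generic fallback */
	const char *name;
};

/* Hypothetical registry; each library would add itself during early init. */
static const struct pwrseq_lib pwrseq_libs[] = {
	{ "vendor,hsic-device-1", "pwrseq_device1" },
	{ "vendor,hsic-device-2", "pwrseq_device2" },
	{ NULL,                   "pwrseq_generic" },
};

/*
 * Return the library whose compatible string matches exactly. If none
 * matches (or the device has no compatible string), return the generic one.
 */
static const struct pwrseq_lib *pwrseq_find(const char *compatible)
{
	const struct pwrseq_lib *generic = NULL;
	size_t n = sizeof(pwrseq_libs) / sizeof(pwrseq_libs[0]);

	for (size_t i = 0; i < n; i++) {
		const struct pwrseq_lib *lib = &pwrseq_libs[i];

		if (!lib->compatible) {
			generic = lib;	/* remember the fallback, keep searching */
			continue;
		}
		if (compatible && strcmp(lib->compatible, compatible) == 0)
			return lib;
	}
	return generic;
}
```

In this model Device-1 and Device-2 from the diagram above would each get their own library, while any device without a dedicated entry still gets the generic sequence.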

-- 

Best Regards,
Peter Chen


[PATCH] ftrace: Access ret_stack->subtime only in the function profiler

2016-08-28 Thread Namhyung Kim
The subtime is used only for function profiler with function graph
tracer enabled.  Move the definition of subtime under
CONFIG_FUNCTION_PROFILER to reduce the memory usage.  Also move the
initialization of subtime into the graph entry callback.

Cc: Josh Poimboeuf 
Signed-off-by: Namhyung Kim 
---
 include/linux/ftrace.h   | 2 ++
 kernel/trace/ftrace.c| 6 ++
 kernel/trace/trace_functions_graph.c | 1 -
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 6f93ac46e7f0..b3d34d3e0e7e 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -794,7 +794,9 @@ struct ftrace_ret_stack {
unsigned long ret;
unsigned long func;
unsigned long long calltime;
+#ifdef CONFIG_FUNCTION_PROFILER
unsigned long long subtime;
+#endif
 #ifdef HAVE_FUNCTION_GRAPH_FP_TEST
unsigned long fp;
 #endif
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 84752c8e28b5..2050a7652a86 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -872,7 +872,13 @@ function_profile_call(unsigned long ip, unsigned long 
parent_ip,
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 static int profile_graph_entry(struct ftrace_graph_ent *trace)
 {
+   int index = trace->depth;
+
function_profile_call(trace->func, 0, NULL, NULL);
+
+   if (index >= 0 && index < FTRACE_RETFUNC_DEPTH)
+   current->ret_stack[index].subtime = 0;
+
return 1;
 }
 
diff --git a/kernel/trace/trace_functions_graph.c 
b/kernel/trace/trace_functions_graph.c
index 0cbe38a844fa..9c7ffa4df5a8 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -170,7 +170,6 @@ ftrace_push_return_trace(unsigned long ret, unsigned long 
func, int *depth,
current->ret_stack[index].ret = ret;
current->ret_stack[index].func = func;
current->ret_stack[index].calltime = calltime;
-   current->ret_stack[index].subtime = 0;
 #ifdef HAVE_FUNCTION_GRAPH_FP_TEST
current->ret_stack[index].fp = frame_pointer;
 #endif
-- 
2.9.3
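For readers unfamiliar with the field, here is a simplified userspace model of how a per-frame subtime lets a profiler report a function's own time. It is a sketch under assumptions, not the kernel code: struct frame, on_entry() and on_return() are hypothetical names, and timestamps are passed in by hand. The reset in on_entry() corresponds to the initialization this patch moves into the profiler's entry callback.

```c
#include <stdint.h>

#define STACK_DEPTH 8

/* Hypothetical per-call record, loosely modelled on a return-stack entry. */
struct frame {
	uint64_t calltime;	/* timestamp at function entry */
	uint64_t subtime;	/* total time spent in callees so far */
};

static struct frame stack[STACK_DEPTH];
static int depth = -1;

/* Entry hook: push a frame and reset its subtime. Returns -1 when full. */
static int on_entry(uint64_t now)
{
	if (depth + 1 >= STACK_DEPTH)
		return -1;
	depth++;
	stack[depth].calltime = now;
	stack[depth].subtime = 0;
	return 0;
}

/*
 * Return hook: compute the total time of the current call, charge it to the
 * parent's subtime, and return the call's own time (total minus callees).
 * Must only be called after a matching successful on_entry().
 */
static uint64_t on_return(uint64_t now)
{
	uint64_t total = now - stack[depth].calltime;
	uint64_t self = 0;

	if (stack[depth].subtime < total)
		self = total - stack[depth].subtime;
	if (depth > 0)
		stack[depth - 1].subtime += total;
	depth--;
	return self;
}
```

If an outer function runs from t=0 to t=100 and calls an inner function from t=10 to t=40, the model reports 30 for the inner call and 70 for the outer one. Only the profiler needs this bookkeeping, which is the motivation for guarding the field with CONFIG_FUNCTION_PROFILER.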



Re: kcm: use-after-free in fput of kcm socket

2016-08-28 Thread Cong Wang
On Sun, Aug 28, 2016 at 3:10 AM, Dmitry Vyukov  wrote:
> Hello,
>
> The following program triggers use-after-free:
>
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include <sys/syscall.h>
> #include <unistd.h>
>
> int main()
> {
>   int fd = syscall(SYS_socket, 0x29ul, 0x5ul, 0x0ul, 0, 0, 0);
>   syscall(SYS_ioctl, fd, 0x89e2ul, 0x20a98000ul, 0, 0, 0);
>   return 0;
> }
>
>
> [  367.240184] 
> ==
> [  367.240784] BUG: KASAN: use-after-free in __fput+0x65a/0x780 at
> addr 880069bc4b30
> [  367.241034] Read of size 2 by task a.out/4045
> [  367.241034] CPU: 3 PID: 4045 Comm: a.out Not tainted 4.8.0-rc3+ #34
> [  367.241034] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS Bochs 01/01/2011
> [  367.241034]  884b8280 880038fb7bc0 82d1b1d9
> 00622e00
> [  367.241034]  fbfff1097050 88003e198900 880069bc4b00
> 880069bc4ec0
> [  367.241034]  880069bc4b30 859e90a0 880038fb7be8
> 817da1fc
> [  367.241034] Call Trace:
> [  367.241034]  [] dump_stack+0x12e/0x185
> [  367.241034]  [] ? sock_release+0x1d0/0x1d0
> [  367.241034]  [] kasan_object_err+0x1c/0x70
> [  367.241034]  [] kasan_report_error+0x1ae/0x490
> [  367.241034]  [] ? sock_release+0x1d0/0x1d0
> [  367.241034]  [] __asan_report_load2_noabort+0x3e/0x40
> [  367.241034]  [] ? __fput+0x65a/0x780
> [  367.241034]  [] __fput+0x65a/0x780
> [  367.241034]  [] fput+0x15/0x20
> [  367.241034]  [] task_work_run+0xf3/0x170
> [  367.241034]  [] do_exit+0x868/0x2c10
> [  367.241034]  [] ? sock_ioctl+0x1db/0x3d0
> [  367.241034]  [] ? sock_do_ioctl+0xb0/0xb0
> [  367.241034]  [] ? do_vfs_ioctl+0x430/0x1080
> [  367.241034]  [] ? mm_update_next_owner+0x640/0x640
> [  367.241034]  [] ? ioctl_preallocate+0x210/0x210
> [  367.241034]  [] ? bad_area+0x69/0x80
> [  367.241034]  [] ? exit_to_usermode_loop+0x3e/0x210
> [  367.241034]  [] ? entry_SYSCALL_64_fastpath+0x5/0xc1
> [  367.241034]  [] do_group_exit+0x108/0x330
> [  367.241034]  [] SyS_exit_group+0x1d/0x20
> [  367.241034]  [] entry_SYSCALL_64_fastpath+0x23/0xc1


Hmm, we have a double free here. I have a patch to fix it, will send it out
very soon.

Thanks!


> [  367.241034] Object at 880069bc4b00, in cache sock_inode_cache size: 960
> [  367.241034] Allocated:
> [  367.241034] PID = 4045
> [  367.241034]  [] save_stack_trace+0x26/0x50
> [  367.241034]  [] save_stack+0x46/0xd0
> [  367.241034]  [] kasan_kmalloc+0xad/0xe0
> [  367.241034]  [] kasan_slab_alloc+0x12/0x20
> [  367.241034]  [] kmem_cache_alloc+0x12b/0x710
> [  367.241034]  [] sock_alloc_inode+0x1d/0x250
> [  367.241034]  [] alloc_inode+0x61/0x180
> [  367.241034]  [] new_inode_pseudo+0x17/0xe0
> [  367.241034]  [] sock_alloc+0x41/0x280
> [  367.241034]  [] kcm_ioctl+0x9b3/0x13e0
> [  367.241034]  [] sock_do_ioctl+0x65/0xb0
> [  367.241034]  [] sock_ioctl+0x2d2/0x3d0
> [  367.241034]  [] do_vfs_ioctl+0x18c/0x1080
> [  367.241034]  [] SyS_ioctl+0x8f/0xc0
> [  367.241034]  [] entry_SYSCALL_64_fastpath+0x23/0xc1
> [  367.241034] Freed:
> [  367.241034] PID = 4045
> [  367.241034]  [] save_stack_trace+0x26/0x50
> [  367.241034]  [] save_stack+0x46/0xd0
> [  367.241034]  [] kasan_slab_free+0x72/0xc0
> [  367.241034]  [] kmem_cache_free+0x76/0x300
> [  367.241034]  [] sock_destroy_inode+0x56/0x70
> [  367.241034]  [] destroy_inode+0xc7/0x130
> [  367.241034]  [] evict+0x329/0x500
> [  367.241034]  [] iput+0x495/0x930
> [  367.241034]  [] sock_release+0x164/0x1d0
> [  367.241034]  [] sock_close+0x16/0x20
> [  367.241034]  [] __fput+0x236/0x780
> [  367.241034]  [] fput+0x15/0x20
> [  367.241034]  [] task_work_run+0xf3/0x170
> [  367.241034]  [] do_exit+0x868/0x2c10
> [  367.241034]  [] do_group_exit+0x108/0x330
> [  367.241034]  [] SyS_exit_group+0x1d/0x20
> [  367.241034]  [] entry_SYSCALL_64_fastpath+0x23/0xc1
> [  367.241034] Memory state around the buggy address:
> [  367.241034]  880069bc4a00: fc fc fc fc fc fc fc fc fc fc fc fc
> fc fc fc fc
> [  367.241034]  880069bc4a80: fc fc fc fc fc fc fc fc fc fc fc fc
> fc fc fc fc
> [  367.241034] >880069bc4b00: fb fb fb fb fb fb fb fb fb fb fb fb
> fb fb fb fb
> [  367.241034]  ^
> [  367.241034]  880069bc4b80: fb fb fb fb fb fb fb fb fb fb fb fb
> fb fb fb fb
> [  367.241034]  880069bc4c00: fb fb fb fb fb fb fb fb fb fb fb fb
> fb fb fb fb
> [  367.241034] 
> ==
>
>
> It is then followed by a bunch of other bugs, full log is here:
> https://gist.githubusercontent.com/dvyukov/b9884388bee40b792ae7900928358484/raw/ace2fa242468d584fa61bf753a5891faa71b0932/gistfile1.txt
>
>
> On commit 61c04572de404e52a655a36752e696bbcb483cf5 (Aug 25).


Re: [PATCH 2/2] arm64: dts: rockchip: add eMMC's power domain support for rk3399

2016-08-28 Thread Elaine Zhang



On 08/27/2016 11:05 PM, Shawn Lin wrote:

On 2016/8/27 21:41, Ziyuan Xu wrote:

Control power domain for eMMC via genpd to reduce power consumption.

Signed-off-by: Elaine Zhang 
Signed-off-by: Ziyuan Xu 



It looks nice to me. But this should be merged after applying that patch [0],
as your patch will break the bind/unbind test for sdhci-of-arasan on rk3399
without it. Moreover, Elaine should make sure that the upstreamed
rockchip power domain code does not power off the pd for emmc; *otherwise*, I
should update my patch to make sure we update clkmul every time we
go through suspend-to-resume.



Forgot to say:
If the pd is used, then although nothing explicitly calls to power off pd_emmc,
it will be powered off when the system goes through suspend-to-resume
(because the system calls
__device_suspend_noirq->pm_genpd_suspend_noirq->rockchip_pd_power_off).


It's also important to note:
If the pd has been powered off, some GRF regs (those located in this pd)
will be back to their default values.
So if the pd supports power-off, these GRF regs need to be saved and restored,
or reinitialized.

For example:
pd_emmc
aclk_emmc_grf

If the pd is always on and this pd has a wakeup function,
the device needs to call device_init_wakeup() to keep the pd on
when the system goes through suspend-to-resume.
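
The save-and-restore requirement described above can be illustrated with a
small user-space simulation. This is only a sketch under stated assumptions:
struct fake_pd, pd_power_off(), pd_suspend(), pd_resume(), the register count
and the reset value are hypothetical names and numbers made up for
illustration, not the rockchip power domain driver API.

```c
#include <stdint.h>
#include <string.h>

#define NUM_GRF_REGS    4       /* hypothetical number of GRF regs in the pd */
#define GRF_RESET_VALUE 0x0u    /* hypothetical hardware default value */

/* Hypothetical model of a power domain that contains some GRF registers. */
struct fake_pd {
	uint32_t grf[NUM_GRF_REGS];    /* live register contents */
	uint32_t saved[NUM_GRF_REGS];  /* software copy kept across power-off */
	int powered;
};

/* Powering the domain off loses the register contents (back to defaults). */
void pd_power_off(struct fake_pd *pd)
{
	for (int i = 0; i < NUM_GRF_REGS; i++)
		pd->grf[i] = GRF_RESET_VALUE;
	pd->powered = 0;
}

void pd_power_on(struct fake_pd *pd)
{
	pd->powered = 1;
}

/* Suspend path: save the registers first, then let the domain go down. */
void pd_suspend(struct fake_pd *pd)
{
	memcpy(pd->saved, pd->grf, sizeof(pd->saved));
	pd_power_off(pd);
}

/* Resume path: power up, then write the saved values back. */
void pd_resume(struct fake_pd *pd)
{
	pd_power_on(pd);
	memcpy(pd->grf, pd->saved, sizeof(pd->grf));
}
```

Without the memcpy() in pd_suspend(), the values seen after pd_resume() would
be the reset defaults, which is the failure mode the note warns about.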




[0]: https://patchwork.kernel.org/patch/9300971/


---

 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 32aebc8..71733d4 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -239,6 +239,7 @@
 #clock-cells = <0>;
 phys = <_phy>;
 phy-names = "phy_arasan";
+power-domains = < RK3399_PD_EMMC>;
 status = "disabled";
 };

@@ -611,6 +612,11 @@
 status = "disabled";
 };

+qos_emmc: qos@ffa58000 {
+compatible = "syscon";
+reg = <0x0 0xffa58000 0x0 0x20>;
+};
+
 qos_hdcp: qos@ffa9 {
 compatible = "syscon";
 reg = <0x0 0xffa9 0x0 0x20>;
@@ -739,6 +745,11 @@
 };

 /* These power domains are grouped by VD_LOGIC */
+pd_emmc@RK3399_PD_EMMC {
+reg = ;
+clocks = < ACLK_EMMC>;
+pm_qos = <_emmc>;
+};
 pd_vio@RK3399_PD_VIO {
 reg = ;
 #address-cells = <1>;








Re: [PATCH] drm/rockchip: vop: make vop register setting take effect

2016-08-28 Thread Mark yao

On 2016-08-27 11:39, Chris Zhong wrote:

The vop register settings need a reg_done write to take effect.
In vop_enable the vop returns to work by restoring registers, but the
registers do not take effect immediately; a vop_cfg_done is needed
after it. The same thing is needed by windows_disabled in
vop_crtc_disable.

Signed-off-by: Chris Zhong 
---
  drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 4 
  1 file changed, 4 insertions(+)

Thanks for your fix.

applied to my drm-fixes.


diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index efbc41a..a0bfcff 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -464,6 +464,8 @@ static int vop_enable(struct drm_crtc *crtc)
 	}
 
 	memcpy(vop->regs, vop->regsbak, vop->len);
+	vop_cfg_done(vop);
+
 	/*
 	 * At here, vop clock & iommu is enable, R/W vop regs would be safe.
 	 */
@@ -513,6 +515,8 @@ static void vop_crtc_disable(struct drm_crtc *crtc)
 		spin_unlock(>reg_lock);
 	}
 
+	vop_cfg_done(vop);
+
 	drm_crtc_vblank_off(crtc);
 
 	/*


--
Mark Yao
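
The point of the commit message, that restored register values sit idle until
a "config done" write latches them, can be modelled in a few lines of
user-space C. This is a sketch only: struct fake_vop, fake_vop_write(),
fake_vop_restore() and fake_vop_cfg_done() are hypothetical stand-ins, not the
functions in rockchip_drm_vop.c, and the model latches immediately, which real
hardware need not do.

```c
#include <stdint.h>
#include <string.h>

#define FAKE_VOP_NUM_REGS 8  /* hypothetical register count */

/* Hypothetical model: writes land in `pending`; hardware only uses `active`. */
struct fake_vop {
	uint32_t pending[FAKE_VOP_NUM_REGS];
	uint32_t active[FAKE_VOP_NUM_REGS];
};

/* A register write only changes the pending (shadow) copy. */
void fake_vop_write(struct fake_vop *vop, unsigned int idx, uint32_t val)
{
	if (idx < FAKE_VOP_NUM_REGS)
		vop->pending[idx] = val;
}

/* Restore a whole register backup, similar in spirit to the memcpy() above. */
void fake_vop_restore(struct fake_vop *vop, const uint32_t *backup)
{
	memcpy(vop->pending, backup, sizeof(vop->pending));
}

/* The "config done" step: latch every pending value into the active set. */
void fake_vop_cfg_done(struct fake_vop *vop)
{
	memcpy(vop->active, vop->pending, sizeof(vop->active));
}
```

In this model, calling fake_vop_restore() alone leaves `active` untouched; only
the following fake_vop_cfg_done() makes the restored values visible.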




[PATCH] iio: fix pressure data output unit in hid-sensor-attributes

2016-08-28 Thread Kweh, Hock Leong
From: "Kweh, Hock Leong" 

According to the IIO ABI definition, the IIO_PRESSURE data output unit is
kilopascal:
http://lxr.free-electrons.com/source/Documentation/ABI/testing/sysfs-bus-iio

This patch fixes the output unit of the HID pressure sensor IIO driver from
pascal to kilopascal to follow the IIO ABI definition.

Signed-off-by: Kweh, Hock Leong 
---
 .../iio/common/hid-sensors/hid-sensor-attributes.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iio/common/hid-sensors/hid-sensor-attributes.c 
b/drivers/iio/common/hid-sensors/hid-sensor-attributes.c
index e81f434..dc33c1d 100644
--- a/drivers/iio/common/hid-sensors/hid-sensor-attributes.c
+++ b/drivers/iio/common/hid-sensors/hid-sensor-attributes.c
@@ -56,8 +56,8 @@ static struct {
{HID_USAGE_SENSOR_ALS, 0, 1, 0},
{HID_USAGE_SENSOR_ALS, HID_USAGE_SENSOR_UNITS_LUX, 1, 0},
 
-   {HID_USAGE_SENSOR_PRESSURE, 0, 10, 0},
-   {HID_USAGE_SENSOR_PRESSURE, HID_USAGE_SENSOR_UNITS_PASCAL, 1, 0},
+   {HID_USAGE_SENSOR_PRESSURE, 0, 100, 0},
+   {HID_USAGE_SENSOR_PRESSURE, HID_USAGE_SENSOR_UNITS_PASCAL, 0, 1000},
 };
 
 static int pow_10(unsigned power)
-- 
1.7.9.5
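
As a quick sanity check of the arithmetic behind the new table entries, here is
a small stand-alone sketch. It assumes the last two fields of each entry are a
whole-number scale and a fractional scale in millionths; that reading, like the
helper name scale_to_kpa_micro(), is an assumption for illustration and is not
taken from the driver. Under that assumption, "0, 1000" multiplies by 0.001
(pascal to kilopascal), and "100, 0" multiplies by 100, which would match
1 bar = 100 kPa if the default unit is bar.

```c
#include <stdint.h>

/*
 * Apply a scale given as whole part + fraction in millionths to a raw value.
 * The result is returned in micro-kilopascal (1e-6 kPa) to stay in integers.
 * Assumption: this mirrors how the two scale fields are meant to be read.
 */
int64_t scale_to_kpa_micro(int64_t raw, int32_t scale_whole, int32_t scale_micro)
{
	return raw * ((int64_t)scale_whole * 1000000 + scale_micro);
}
```

For instance, a reading of 101325 Pa scaled by "0, 1000" gives 101325000
micro-kPa, i.e. 101.325 kPa.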



Re: [PATCH] thermal: hisilicon: fix COMPILE_TEST dependencies

2016-08-28 Thread Leo Yan
On Mon, Aug 29, 2016 at 10:00:52AM +0800, Zhang Rui wrote:
> On Fri, 2016-08-26 at 17:44 +0200, Arnd Bergmann wrote:
> > As we now 'select STUB_CLK_HI6220', all dependencies for that driver
> > have
> > to be present in order to enable HISI_THERMAL, as pointed out by
> > Kconfig:
> > 
> > warning: (HISI_THERMAL) selects STUB_CLK_HI6220 which has unmet
> > direct dependencies (COMMON_CLK && COMMON_CLK_HI6220 && MAILBOX)
> > 
> > This rearranges the dependencies for this symbol, so all the
> > dependencies
> > aside from ARCH_HISI are always met when building it for compile
> > testing.
> > This mainly helps for randconfig testing, as an "allmodconfig" kernel
> > will
> > enable them anyway.
> > 
> > Signed-off-by: Arnd Bergmann 
> > Fixes: 5f63581ce68e ("thermal: hisilicon: Add dependency on the clock
> > driver to allow frequency scaling")
> 
> As commit 5f63581ce68e has not been shipped in upstream yet, please
> fold this patch into the original one. I'd prefer one good patch
> instead of a broken patch + a fix.

Amit and I have discussed this, and we have a cleaner way to enable the
Hisilicon thermal driver [1]: we plan to enable
CONFIG_CPU_THERMAL in defconfig, and to make the stub clock driver and the
thermal driver depend on ARCH_HISI, which should resolve all the dependency
issues.

I will prepare the related patches and send them out for review ASAP; sorry for
the delay.

[1] https://lkml.org/lkml/2016/8/8/879

Thanks,
Leo Yan

> > ---
> >  drivers/thermal/Kconfig | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> > index 5cba072c3a62..3c8607c07352 100644
> > --- a/drivers/thermal/Kconfig
> > +++ b/drivers/thermal/Kconfig
> > @@ -177,7 +177,8 @@ config THERMAL_EMULATION
> >  
> >  config HISI_THERMAL
> >     tristate "Hisilicon thermal driver"
> > -   depends on (ARCH_HISI && CPU_THERMAL && OF) || COMPILE_TEST
> > +   depends on ARCH_HISI || COMPILE_TEST
> > +   depends on CPU_THERMAL && OF && COMMON_CLK_HI6220 && MAILBOX
> >     depends on HAS_IOMEM
> >     select STUB_CLK_HI6220
> >     help
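
The effect of the rearranged "depends on" lines quoted above can be checked by
writing both conditions as plain boolean expressions. The sketch below is only
a model in C: Kconfig tristate values are reduced to true/false, and struct cfg
and the function names are made up for illustration. It also assumes that
COMMON_CLK_HI6220 cannot be enabled without COMMON_CLK.

```c
#include <stdbool.h>

/* Hypothetical flat view of the relevant config symbols (y = true, n = false). */
struct cfg {
	bool arch_hisi, compile_test, cpu_thermal, of;
	bool common_clk, common_clk_hi6220, mailbox, has_iomem;
};

/* Old rule: (ARCH_HISI && CPU_THERMAL && OF) || COMPILE_TEST, plus HAS_IOMEM. */
bool hisi_thermal_allowed_old(const struct cfg *c)
{
	return ((c->arch_hisi && c->cpu_thermal && c->of) || c->compile_test) &&
	       c->has_iomem;
}

/* New rule from the patch: the extra dependencies apply in both cases. */
bool hisi_thermal_allowed_new(const struct cfg *c)
{
	return (c->arch_hisi || c->compile_test) &&
	       c->cpu_thermal && c->of && c->common_clk_hi6220 && c->mailbox &&
	       c->has_iomem;
}

/* Dependencies of the selected symbol, as listed in the Kconfig warning. */
bool stub_clk_deps_met(const struct cfg *c)
{
	return c->common_clk && c->common_clk_hi6220 && c->mailbox;
}
```

With only COMPILE_TEST and HAS_IOMEM set, the old rule allows HISI_THERMAL even
though the dependencies of the selected STUB_CLK_HI6220 are unmet, which is the
situation the warning describes; the new rule rejects that configuration.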


Re: chipidea: udc: kernel panic in isr_setup_status_phase

2016-08-28 Thread Peter Chen
On Sun, Aug 28, 2016 at 08:15:02PM +0200, Clemens Gruber wrote:
> On Sat, Aug 27, 2016 at 01:21:52AM +0800, Peter Chen wrote:
> > The gadget triggers UI interrupt due to host sends packet.
> > 
> > I really can't understand that, why host does not send bus reset
> > before sending packet (eg, GET_DESCRIPTOR)? It violates USB spec.
> > 
> > Are you sure the first interrupt is UI when the vbus from off to on?
> 
> Yes, if the error is present, the first interrupt is intr=0x1 (USBi_UI)
> and then the NULL pointer dereference would occur.
> (Also: Checking for ci->status == NULL and avoiding the dereference does
> not make the gadget work. It just avoids the kernel panic.)
> 
> But I also observed a situation where the first interrupt is intr=0x100
> (USBi_SLI) followed by 0x40 (USBi_URI), 0x4 (USBi_PCI) and three times
> 0x1 (USBi_UI).
> After this "g_ether gadget: suspend" appears and the sequence repeats,
> starting again with intr=0x100, followed by 0x40, ... until three times
> 0x1 and the g_ether gadget: suspend message.
> On the host, every 500ms a new message with incrementing device number
> appears:
> usb 1-4: new high-speed USB device number 41 using xhci_hcd
> usb 1-4: new high-speed USB device number 42 using xhci_hcd
> ...
> 
> In the case where everything works, it looks like this:
> intr=0x100 (USBi_SLI)
> intr=0x40 (USBi_URI)
> intr=0x4 (USBi_PCI)
> intr=0x1 (USBi_UI)
> intr=0x1 (USBi_UI)
> ci_hdrc ci_hdrc.0: freeing queued_request
> intr=0x41 (USBi_URI + USBi_UI)
> intr=0x4 (USBi_PCI)
> intr=0x1 (USBi_UI) <-- appears 17 times
> g_ether gadget: high-speed config #1: CDC Ethernet (EEM)
> intr=0x1 (USBi_UI) <-- appears 5 times
> IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready
> 
> --
> 
> Do you think this could be a hardware problem? We used the same method
> as in the MCIMX6Q-SDB schematics (SPF-27516_C5.pdf) to avoid any current
> flow through OTG VBUS to the inside when the board is powered off but a
> host PC is still connected via OTG.
> So we not just pass the VBUS signal through, there are two MOSFETs,
> which prevent that (if the internal 3.3V is low).
> Mostly the same logic as in said document on page 11 (top-left area).
> 
> Another possibility, I am investigating now, is a ground loop and a
> main-supply voltage-dependency, although the whole USB OTG part is
> on a completely different supply rail, the GNDs are shared.
> 
> I am investigating in all directions at the moment ;-)
> 

Would you please measure the vbus voltage within 1s under the following two
conditions:

- Just connect cable
- Just disconnect cable

> 
> I also switched to CDC/EEM to make sure it has nothing to do with RNDIS,
> and the problem is still present. So the error must be on a lower level.
> 
> --
> 
> You could try to reproduce it with a MCIMX6Q-SDB and varying the main
> supply voltage between minimum and maximum allowed voltage levels. For
> example: Plug OTG in once at the minimum and once at the maximum level,
> see if it behaves differently.
> But this is just one of my desperate theories at the moment..
> 

Sorry, I have no equipment that can change the main supply voltage
right now.

-- 

Best Regards,
Peter Chen
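
For anyone following the interrupt traces quoted above, the intr values can be
decoded with a small helper. This is a stand-alone sketch: the bit values are
the ones shown in the log (0x1, 0x4, 0x40, 0x100), while decode_intr() is a
hypothetical helper written for illustration and is not part of the chipidea
driver.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit values as they appear in the quoted log. */
#define USBi_UI   0x001u
#define USBi_PCI  0x004u
#define USBi_URI  0x040u
#define USBi_SLI  0x100u

/* Write an "A + B" style list of the known bits set in intr into buf. */
void decode_intr(uint32_t intr, char *buf, size_t len)
{
	static const struct { uint32_t bit; const char *name; } names[] = {
		{ USBi_SLI, "USBi_SLI" },
		{ USBi_URI, "USBi_URI" },
		{ USBi_PCI, "USBi_PCI" },
		{ USBi_UI,  "USBi_UI"  },
	};
	size_t used = 0;

	if (len == 0)
		return;
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (!(intr & names[i].bit))
			continue;
		int n = snprintf(buf + used, len - used, "%s%s",
				 used ? " + " : "", names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;  /* output would be truncated; stop here */
		used += (size_t)n;
	}
}
```

Decoding 0x41 this way yields the same "USBi_URI + USBi_UI" annotation that
appears in the working-case trace.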


Re: checkkpatch (in)sanity ?

2016-08-28 Thread Levin, Alexander
On Sun, Aug 28, 2016 at 07:20:52PM -0400, Joe Perches wrote:
> On Sun, 2016-08-28 at 18:37 -0400, Levin, Alexander wrote:
> > On Sun, Aug 28, 2016 at 01:15:57PM -0400, Joe Perches wrote:
> > > On Sat, 2016-08-27 at 22:47 -0400, Levin, Alexander wrote:
> > > > Would you agree that by default we shouldn't show anything that's
> > > > not an error/defect?
> > > Not particularly, no.
> > I think that we need to figure out this disagreement first then. My
> > claim is that checkpatch's output isn't useful.
> []
> > It'll be interesting to hear from these people about their view of
> > checkpatch, but IMO when on average there are more issues than commits
> > I can suggest two possible causes:
> > 
> >  1. People are used to ignore checkpatch warnings.
> >  2. People aren't using checkpatch.
> > 
> > Can you really make the claim that this is how checkpatch is supposed
> > to be working?
> 
> .  I make no particular claims about checkpatch.
> 
> I think checkpatch isn't particularly useful for those
> thoroughly inculcated in what style the kernel uses and
> is more useful for infrequent or new submitters.
> 
> The long time submitters and key maintainers are already
> pretty consistent about coding style.

I did the same test for authors of 5-9 commits (just an arbitrary choice of 
numbers for "infrequent"), the results there are much worse: 3981 commits, 7175 
issues.

The only big subsystem that seems to be forcing checkpatch "correctness" is 
mm/, where akpm is fixing up checkpatch issues himself. Otherwise, it looks 
like maintainers are not running checkpatch nor are making sure that the 
commits they merge in don't have checkpatch issues.

> It would be good to examine the specific messages though.

What for? The point is that with that amount of issues it's evident that people 
don't actually use checkpatch to begin with. We can discuss whether the output 
it produces makes sense all we want, but the fact is that people just don't use 
it - and I've tried to give my opinion of why I think it happens.

-- 

Thanks,
Sasha

Re: [PATCH v6 1/2] clk: uniphier: add core support code for UniPhier clock driver

2016-08-28 Thread Masahiro Yamada
Hi Stephen,


2016-08-20 4:16 GMT+09:00 Stephen Boyd :
>>
>> >> +
>> >> + parent = of_get_parent(dev->of_node); /* parent should be syscon node */
>> >> + regmap = syscon_node_to_regmap(parent);
>> >> + of_node_put(parent);
>> >
>> > devm_get_regmap(dev->parent) should work then? Why do we need to
>> > use OF APIs?
>>
>> "git grep devm_get_regmap" did not hit anything.
>>
>> Where is it defined?
>>
>
> Sorry I meant dev_get_regmap().
>

I tried this, but it did not work.

To make dev_get_regmap() work,
the parent device needs to call devm_regmap_init_mmio() beforehand.


Since commit bdb0066df96e74a4002125467ebe459feff1ebef
(mfd: syscon: Decouple syscon interface from platform devices),
syscon_probe() is not called for platform devices,
so that never happens.



-- 
Best Regards
Masahiro Yamada


Re: [PATCH 2/2] arm64: dts: rockchip: add eMMC's power domain support for rk3399

2016-08-28 Thread Elaine Zhang



On 08/27/2016 11:05 PM, Shawn Lin wrote:

On 2016/8/27 21:41, Ziyuan Xu wrote:

Control power domain for eMMC via genpd to reduce power consumption.

Signed-off-by: Elaine Zhang 
Signed-off-by: Ziyuan Xu 



It looks nice to me. But this should be merged after applying that patch [0],
as your patch will break the bind/unbind test for sdhci-of-arasan on rk3399
without it. Moreover, Elaine should make sure that the upstreamed
rockchip power domain code does not power off the pd for emmc; *otherwise*, I
should update my patch to make sure we update clkmul every time we
go through suspend-to-resume.


It looks nice to me. I was going to submit it together with the other PDs.




[0]: https://patchwork.kernel.org/patch/9300971/


---

 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 32aebc8..71733d4 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -239,6 +239,7 @@
 #clock-cells = <0>;
 phys = <_phy>;
 phy-names = "phy_arasan";
+power-domains = < RK3399_PD_EMMC>;
 status = "disabled";
 };

@@ -611,6 +612,11 @@
 status = "disabled";
 };

+qos_emmc: qos@ffa58000 {
+compatible = "syscon";
+reg = <0x0 0xffa58000 0x0 0x20>;
+};
+
 qos_hdcp: qos@ffa9 {
 compatible = "syscon";
 reg = <0x0 0xffa9 0x0 0x20>;
@@ -739,6 +745,11 @@
 };

 /* These power domains are grouped by VD_LOGIC */
+pd_emmc@RK3399_PD_EMMC {
+reg = ;
+clocks = < ACLK_EMMC>;
+pm_qos = <_emmc>;
+};
 pd_vio@RK3399_PD_VIO {
 reg = ;
 #address-cells = <1>;








Re: [PATCH] thermal: hisilicon: fix COMPILE_TEST dependencies

2016-08-28 Thread Zhang Rui
On Fri, 2016-08-26 at 17:44 +0200, Arnd Bergmann wrote:
> As we now 'select STUB_CLK_HI6220', all dependencies for that driver
> have
> to be present in order to enable HISI_THERMAL, as pointed out by
> Kconfig:
> 
> warning: (HISI_THERMAL) selects STUB_CLK_HI6220 which has unmet
> direct dependencies (COMMON_CLK && COMMON_CLK_HI6220 && MAILBOX)
> 
> This rearranges the dependencies for this symbol, so all the
> dependencies
> aside from ARCH_HISI are always met when building it for compile
> testing.
> This mainly helps for randconfig testing, as an "allmodconfig" kernel
> will
> enable them anyway.
> 
> Signed-off-by: Arnd Bergmann 
> Fixes: 5f63581ce68e ("thermal: hisilicon: Add dependency on the clock
> driver to allow frequency scaling")

As commit 5f63581ce68e has not been shipped in upstream yet, please
fold this patch into the original one. I'd prefer one good patch
instead of a broken patch + a fix.

thanks,
rui
> ---
>  drivers/thermal/Kconfig | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 5cba072c3a62..3c8607c07352 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -177,7 +177,8 @@ config THERMAL_EMULATION
>  
>  config HISI_THERMAL
>   tristate "Hisilicon thermal driver"
> - depends on (ARCH_HISI && CPU_THERMAL && OF) || COMPILE_TEST
> + depends on ARCH_HISI || COMPILE_TEST
> + depends on CPU_THERMAL && OF && COMMON_CLK_HI6220 && MAILBOX
>   depends on HAS_IOMEM
>   select STUB_CLK_HI6220
>   help
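
[Editor's illustration, not part of the original mail.] The effect of the
rearranged "depends on" lines can be modeled as plain boolean logic. The
sketch below is a userspace C model under stated assumptions: the struct
kconfig_state type and the three function names are hypothetical, and the
model treats every symbol as an independent boolean, which real Kconfig
does not (COMMON_CLK_HI6220 presumably already implies COMMON_CLK).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one flag per Kconfig symbol named in the patch. */
struct kconfig_state {
	bool arch_hisi, compile_test, cpu_thermal, of;
	bool common_clk, common_clk_hi6220, mailbox, has_iomem;
};

/* Old rule: (ARCH_HISI && CPU_THERMAL && OF) || COMPILE_TEST, and HAS_IOMEM. */
static bool hisi_thermal_old(const struct kconfig_state *s)
{
	assert(s != NULL);
	return ((s->arch_hisi && s->cpu_thermal && s->of) || s->compile_test) &&
	       s->has_iomem;
}

/* New rule: only ARCH_HISI may be replaced by COMPILE_TEST. */
static bool hisi_thermal_new(const struct kconfig_state *s)
{
	assert(s != NULL);
	return (s->arch_hisi || s->compile_test) &&
	       s->cpu_thermal && s->of && s->common_clk_hi6220 && s->mailbox &&
	       s->has_iomem;
}

/* Direct dependencies of STUB_CLK_HI6220 as listed in the Kconfig warning. */
static bool stub_clk_deps_met(const struct kconfig_state *s)
{
	assert(s != NULL);
	return s->common_clk && s->common_clk_hi6220 && s->mailbox;
}
```

With only COMPILE_TEST and HAS_IOMEM set, hisi_thermal_old() returns true
while stub_clk_deps_met() returns false, which is the situation the warning
describes; hisi_thermal_new() returns false for the same state.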


Re: [PATCH][v6] PM / hibernate: Print the possible panic reason when resuming with inconsistent e820 map

2016-08-28 Thread Andreas Mohr
Hi,

[no proper binding reference via In-Reply-To: available, thus manually
re-creating the quote, sorry]

> > So we can print warning in hibernation_die_notifier without
> > introducing a global variable?
> > 
> 
> Actually, I'd kill the machine right away.
> 
> if (memcmp(result, buf, MD5_DIGEST_SIZE)) {
>   pr_err("PM: e820 map conflict detected!\n");
>   panic("BIOS is playing funny tricks with us.\n");
> } 
> Best regards,
>   Pavel

+1.

I would tend to think that
it's rather preferable to
kill an affected environment scope
(in this case: whole system), hard
(except for perhaps some emergency epilogue state saving),
in case of its state having become
suspicious/unpredictable/dangerous,
rather than having things carry on in a merry-go-round manner and thus
making improper state progress
which translates into
*continued activity* of *corrupting things*
(possibly even leading to *persistent* i.e. storage-recorded corruption!)
willy-nilly.

Plus, killing a machine hard in such a questionable case
would increase our influx of valuable reports
due to users being
hard-pressed to report this rather than
silently ignoring / not even knowing this issue
(in those cases where we already know that
no further development fixes will be determinable,
carrying on with a dire warning probably still is preferable to
making the machine completely unusable, eternally,
though).

Thus, think robustness.

Andreas Mohr
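
[Editor's illustration, not part of the original mail.] The check Pavel
quotes boils down to a byte-wise digest comparison. A minimal userspace
sketch follows, assuming two 16-byte MD5 digests are already available; the
function name e820_digest_mismatch and its parameter names are hypothetical.
In the kernel the caller would react with pr_err() and panic() as quoted
above; this sketch only reports the mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MD5_DIGEST_SIZE 16 /* size of an MD5 digest in bytes */

/*
 * Return true when the digest stored in the hibernation image differs
 * from the digest computed over the e820 map seen at resume time.
 */
static bool e820_digest_mismatch(const unsigned char *saved,
				 const unsigned char *current)
{
	assert(saved != NULL && current != NULL);
	return memcmp(saved, current, MD5_DIGEST_SIZE) != 0;
}
```

A single differing byte anywhere in the 16-byte digest is enough to make the
function return true, so the decision is all-or-nothing, which matches the
"kill it hard" policy argued for above.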


Re: [PATCH 1/1] ceph: do not modify fi->frag in need_reset_readdir()

2016-08-28 Thread Yan, Zheng

> On Aug 29, 2016, at 00:47, Nicolas Iooss  wrote:
> 
> Commit f3c4ebe65ea1 ("ceph: using hash value to compose dentry offset")
> modified "if (fpos_frag(new_pos) != fi->frag)" to "if (fi->frag |=
> fpos_frag(new_pos))" in need_reset_readdir(), thus replacing a
> comparison operator with an assignment one.
> 
> This looks like a typo which is reported by clang when building the
> kernel with some warning flags:
> 
>fs/ceph/dir.c:600:22: error: using the result of an assignment as a
>condition without parentheses [-Werror,-Wparentheses]
>} else if (fi->frag |= fpos_frag(new_pos)) {
>   ~^
>fs/ceph/dir.c:600:22: note: place parentheses around the assignment
>to silence this warning
>} else if (fi->frag |= fpos_frag(new_pos)) {
>^
>   ( )
>fs/ceph/dir.c:600:22: note: use '!=' to turn this compound
>assignment into an inequality comparison
>} else if (fi->frag |= fpos_frag(new_pos)) {
>^~
>!=
> 
> Fixes: f3c4ebe65ea1 ("ceph: using hash value to compose dentry offset")
> Cc: sta...@vger.kernel.org # 4.7.x
> Signed-off-by: Nicolas Iooss 
> ---
> fs/ceph/dir.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index c64a0b794d49..df4b3e6fa563 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -597,7 +597,7 @@ static bool need_reset_readdir(struct ceph_file_info *fi, loff_t new_pos)
>   if (is_hash_order(new_pos)) {
>   /* no need to reset last_name for a forward seek when
>* dentries are sotred in hash order */
> - } else if (fi->frag |= fpos_frag(new_pos)) {
> + } else if (fi->frag != fpos_frag(new_pos)) {
>   return true;
>   }
>   rinfo = fi->last_readdir ? &fi->last_readdir->r_reply_info : NULL;


Applied, thanks

Yan, Zheng



> -- 
> 2.9.3
> 
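
[Editor's illustration, not part of the original mail.] The difference
between the two operators can be seen in isolation. The sketch below is
plain userspace C, not ceph code: struct file_info and both function names
are hypothetical stand-ins, and fragment values are plain unsigned integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of struct ceph_file_info. */
struct file_info {
	unsigned int frag;
};

/*
 * Buggy form: behaves like "if (fi->frag |= new_frag)". It ORs new_frag
 * into fi->frag as a side effect and is true whenever the result is
 * non-zero, even if both fragments are identical.
 */
static bool frag_changed_buggy(struct file_info *fi, unsigned int new_frag)
{
	assert(fi != NULL);
	fi->frag |= new_frag;
	return fi->frag != 0;
}

/* Fixed form: a pure comparison that leaves fi->frag untouched. */
static bool frag_changed_fixed(const struct file_info *fi, unsigned int new_frag)
{
	assert(fi != NULL);
	return fi->frag != new_frag;
}
```

For fi->frag == 4 and new_frag == 4, the buggy form reports a change while
the fixed form does not; for new_frag == 1, the buggy form also rewrites
fi->frag to 5.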


