Re: [PATCH 00/13] Virtually mapped stacks with guard pages (x86, core)

2016-06-19 Thread Heiko Carstens
On Fri, Jun 17, 2016 at 10:38:24AM -0700, Andy Lutomirski wrote:
> > A disassembly looks like this (r15 is the stackpointer):
> >
> > 0670 :
> >  670:   eb 6f f0 48 00 24   stmg%r6,%r15,72(%r15)
> >  676:   c0 d0 00 00 00 00   larl%r13,676 
> >  67c:   a7 f1 3f 80 tmll%r15,16256  <--- test if 
> > enough space left
> >  680:   b9 04 00 ef lgr %r14,%r15
> >  684:   a7 84 00 01 je  686  <--- 
> > branch to illegal op
> >  688:   e3 f0 ff 90 ff 71   lay %r15,-112(%r15)
> >
> > The branch actually jumps into the middle of the branch instruction itself,
> > since the 0001 part of the "je" instruction is an illegal instruction.
> >
> > This catches at least wild stack overflows caused by too many functions
> > being called.
> >
> > Of course it doesn't catch wild accesses outside the stack because e.g. the
> > index into an array on the stack is wrong.
> >
> > The runtime overhead is within noise ratio, therefore we have this always
> > enabled.
> >
> 
> Neat!  What exactly does tmll do?  I assume this works by checking the
> low bits of the stack pointer.
> 
> x86_64 would have to do:
> 
> movl %esp, %r11d
> shll %r11d, $18
> cmpl %r11d, 
> jg error
> 
> Or similar.  I think the cmpl could be eliminated if the threshold
> were a power of two by simply testing the low bits of the stack
> pointer.

The tmll instruction tests whether any of the higher bits within the 16k
stackframe address are set. In this specific case that would be bits 7-13
(mask 0x3f80). If none of these bits is set, we know that at most 128 bytes
are left on the stack, and thus trigger an exception.

This check does of course only work if a 16k stack is also 16k aligned,
which is always the case.
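The check can be mimicked in a small userspace sketch (illustrative only; the kernel does this with a single tmll instruction on the real stack pointer, and the function name here is made up):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the s390 stack check: with a 16 KiB stack that is 16 KiB
 * aligned, (sp & 0x3f80) == 0 means the stack pointer sits within the
 * lowest 128 bytes of the stack, i.e. almost no space is left. */
#define STACK_GUARD_MASK 0x3f80u	/* 16256, the tmll mask above */

static int stack_nearly_exhausted(uintptr_t sp)
{
	return (sp & STACK_GUARD_MASK) == 0;
}
```

For a stack based at 0x10000, an sp of 0x10040 (64 bytes above the base) trips the check, while an sp of 0x12000 does not.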




Re: [v2 PATCH 2/4] phy: Add USB Type-C PHY driver for rk3399

2016-06-19 Thread Chris Zhong

Hi Guenter

On 06/18/2016 11:45 PM, Guenter Roeck wrote:

Hi Chris,

On Mon, Jun 13, 2016 at 2:39 AM, Chris Zhong  wrote:

Add a PHY provider driver for the rk3399 SoC Type-c PHY. The USB
Type-C PHY is designed to support the USB3 and DP applications. The
PHY basically has two main components: USB3 and DisplayPort. USB3
operates in SuperSpeed mode and the DP can operate at RBR, HBR and
HBR2 data rates.


I started integrating the driver with our code.
Doing so, I realized a problem in the way you are using extcon.

[ ... ]


+
+static int tcphy_pd_event(struct notifier_block *nb,
+ unsigned long event, void *priv)
+{
+   struct rockchip_typec_phy *tcphy;
+   struct extcon_dev *edev = priv;
+   int value = edev->state;
+   int mode;
+   u8 is_plugged, dfp;
+
+   tcphy = container_of(nb, struct rockchip_typec_phy, event_nb);
+
+   is_plugged = GET_PLUGGED(value);
+   tcphy->flip = GET_FLIP(value);
+   dfp = GET_DFP(value);
+   tcphy->map = GET_PIN_MAP(value);
+

I don't think it is a good idea to use the extcon 'state' field like
this. I don't even think it is possible.

The state is supposed to be a bit mask, each bit indicating if a
specific connector (or functionality) on the cable is attached. The
extcon notifier code walks through this bit mask and determines based
on changed bits if the notifier should be called. So the notifier in
this case would only be called if bit 1 (EXTCON_USB) of 'state' has
changed, but not if one of the other bits has changed. One would have
to define 32 "virtual" cables, one for each bit, for this to work, and
then you would have to register a notifier for each of the bits. That
would not really make sense.

Of course, that makes using the extcon notifier quite useless for our
purpose, since we need the callback not only if a cable has been
attached or detached, but also if some secondary state changes. I
don't really know myself how to solve the problem; I'll need to think
about it some more. Maybe we can add a callback into the type-c
infrastructure code and somehow tie into that code, but I don't know
yet if that is feasible either.

Guenter



Yes, currently we can get the notification only when bit 0 changes,
so the PHY driver only knows about plug/unplug events.
If we need more triggers, how about using bit 0 as a changed flag:

state = extcon_get_cable_state

state = ~state | is_plugged | flip | dfp | map

extcon_set_state(state)
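For illustration, a bit layout that would support the scheme sketched above could look like this (the field positions are made up here; the GET_* accessors in the actual driver may use a different encoding):

```c
#include <assert.h>

/* Hypothetical packing of Type-C status into the extcon 'state' word.
 * Bit 0 doubles as the plugged/"changed" flag discussed above. */
#define TC_PLUGGED		(1u << 0)
#define TC_FLIP			(1u << 1)
#define TC_DFP			(1u << 2)
#define TC_PIN_MAP_SHIFT	3
#define TC_PIN_MAP_MASK		(0xfu << TC_PIN_MAP_SHIFT)

#define GET_PLUGGED(v)	(!!((v) & TC_PLUGGED))
#define GET_FLIP(v)	(!!((v) & TC_FLIP))
#define GET_DFP(v)	(!!((v) & TC_DFP))
#define GET_PIN_MAP(v)	(((v) & TC_PIN_MAP_MASK) >> TC_PIN_MAP_SHIFT)
```

With such a layout, a single unsigned value carries all four fields, which is exactly why a per-bit "virtual cable" notifier does not map onto it cleanly.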








[PATCH] ddbridge: Replace vmalloc with vzalloc

2016-06-19 Thread Amitoj Kaur Chawla
vzalloc combines vmalloc and a zeroing memset.

The Coccinelle semantic patch used to make this change is as follows:
@@
type T;
T *d;
expression e;
statement S;
@@

d =
-vmalloc
+vzalloc
 (...);
if (!d) S
-   memset(d, 0, sizeof(T));
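The transformation can be illustrated with a userspace analogue, calloc standing in for vzalloc and malloc+memset for vmalloc (the kernel allocators themselves are not available outside the kernel; the struct and function names are invented for the example):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct example { int regs[16]; };

/* before: allocate, check for failure, then zero explicitly */
static struct example *alloc_then_zero(void)
{
	struct example *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	memset(d, 0, sizeof(*d));
	return d;
}

/* after: a single call that returns zeroed memory */
static struct example *alloc_zeroed(void)
{
	return calloc(1, sizeof(struct example));
}
```

Both paths yield identically zeroed objects; the combined call is shorter and cannot forget the memset.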

Signed-off-by: Amitoj Kaur Chawla 
---
 drivers/media/pci/ddbridge/ddbridge-core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/media/pci/ddbridge/ddbridge-core.c 
b/drivers/media/pci/ddbridge/ddbridge-core.c
index 6e995ef..47def73 100644
--- a/drivers/media/pci/ddbridge/ddbridge-core.c
+++ b/drivers/media/pci/ddbridge/ddbridge-core.c
@@ -1569,10 +1569,9 @@ static int ddb_probe(struct pci_dev *pdev, const struct 
pci_device_id *id)
if (pci_enable_device(pdev) < 0)
return -ENODEV;
 
-   dev = vmalloc(sizeof(struct ddb));
+   dev = vzalloc(sizeof(struct ddb));
if (dev == NULL)
return -ENOMEM;
-   memset(dev, 0, sizeof(struct ddb));
 
dev->pdev = pdev;
pci_set_drvdata(pdev, dev);
-- 
1.9.1





[PATCH v2 1/2] drm/dsi: Implement dcs set/get display brightness

2016-06-19 Thread Vinay Simha BN
Provide small convenience wrappers that set and get the
display brightness value.

Cc: John Stultz 
Cc: Sumit Semwal 
Cc: Archit Taneja 
Cc: Rob Clark 
Cc: Jani Nikula 
Cc: Thierry Reding 
Signed-off-by: Vinay Simha BN 
---
v1:
 *tested in nexus7 2nd gen.

v2:
 * implemented jani review comments
   -functions name mapped accordingly
   -bl value increased from 0xff to 0x
   -backlight interface will be handled in panel driver,
so it is moved from the mipi_dsi helper function
---
 drivers/gpu/drm/drm_mipi_dsi.c | 49 ++
 include/drm/drm_mipi_dsi.h |  4 
 2 files changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/drm_mipi_dsi.c b/drivers/gpu/drm/drm_mipi_dsi.c
index 49311fc..2c03784 100644
--- a/drivers/gpu/drm/drm_mipi_dsi.c
+++ b/drivers/gpu/drm/drm_mipi_dsi.c
@@ -1041,6 +1041,55 @@ int mipi_dsi_dcs_set_pixel_format(struct mipi_dsi_device 
*dsi, u8 format)
 }
 EXPORT_SYMBOL(mipi_dsi_dcs_set_pixel_format);
 
+/**
+ * mipi_dsi_dcs_get_display_brightness() - gets the current brightness value
+ * of the display
+ * @dsi: DSI peripheral device
+ * @brightness: brightness value
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int mipi_dsi_dcs_get_display_brightness(struct mipi_dsi_device *dsi,
+   u16 *brightness)
+{
+   ssize_t err;
+
+   err = mipi_dsi_dcs_read(dsi, MIPI_DCS_GET_DISPLAY_BRIGHTNESS,
+   brightness, sizeof(*brightness));
+   if (err < 0) {
+   if (err == 0)
+   err = -ENODATA;
+
+   return err;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(mipi_dsi_dcs_get_display_brightness);
+
+/**
+ * mipi_dsi_dcs_set_display_brightness() - sets the brightness value of
+ * the display
+ * @dsi: DSI peripheral device
+ * @brightness: brightness value
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int mipi_dsi_dcs_set_display_brightness(struct mipi_dsi_device *dsi,
+   u16 brightness)
+{
+   ssize_t err;
+   u8 bl_value[2] = { brightness & 0xff, brightness >> 8 };
+
+   err = mipi_dsi_dcs_write(dsi, MIPI_DCS_SET_DISPLAY_BRIGHTNESS,
+bl_value, sizeof(bl_value));
+   if (err < 0)
+   return err;
+
+   return 0;
+}
+EXPORT_SYMBOL(mipi_dsi_dcs_set_display_brightness);
+
 static int mipi_dsi_drv_probe(struct device *dev)
 {
struct mipi_dsi_driver *drv = to_mipi_dsi_driver(dev->driver);
diff --git a/include/drm/drm_mipi_dsi.h b/include/drm/drm_mipi_dsi.h
index 72f5b15..4d77bb0 100644
--- a/include/drm/drm_mipi_dsi.h
+++ b/include/drm/drm_mipi_dsi.h
@@ -270,6 +270,10 @@ int mipi_dsi_dcs_set_tear_off(struct mipi_dsi_device *dsi);
 int mipi_dsi_dcs_set_tear_on(struct mipi_dsi_device *dsi,
 enum mipi_dsi_dcs_tear_mode mode);
 int mipi_dsi_dcs_set_pixel_format(struct mipi_dsi_device *dsi, u8 format);
+int mipi_dsi_dcs_get_display_brightness(struct mipi_dsi_device *dsi,
+   u16 *brightness);
+int mipi_dsi_dcs_set_display_brightness(struct mipi_dsi_device *dsi,
+   u16 brightness);
 
 /**
  * struct mipi_dsi_driver - DSI driver
-- 
2.1.2
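The set path above splits the 16-bit brightness value into two DCS parameter bytes, low byte first; that packing can be checked in isolation (userspace sketch mirroring the bl_value[] initializer in the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the bl_value[] packing in mipi_dsi_dcs_set_display_brightness() */
static void pack_brightness(uint16_t brightness, uint8_t out[2])
{
	out[0] = brightness & 0xff;	/* low byte first */
	out[1] = brightness >> 8;	/* high byte */
}
```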





[PATCH v7 2/2] drm/panel: Add JDI LT070ME05000 WUXGA DSI Panel

2016-06-19 Thread Vinay Simha BN
Add support for the JDI LT070ME05000 WUXGA DSI panel used in
Nexus 7 2013 devices.

The programming sequence for the panel was originally found in the
android-msm-flo-3.4-lollipop-release branch from:
https://android.googlesource.com/kernel/msm.git

The video mode setting is from the dsi-panel-jdi-dualmipi1-video.dtsi
file in:
git://codeaurora.org/kernel/msm-3.10.git  LNX.LA.3.6_rb1.27

Cc: Archit Taneja 
Cc: Rob Clark 
Cc: Sumit Semwal 
Cc: John Stultz 
Cc: Emil Velikov 
Cc: Thierry Reding 
Cc: David Airlie 
Signed-off-by: Sumit Semwal 
Signed-off-by: John Stultz 
Signed-off-by: Vinay Simha BN 

---
v1:
 * sumit ported to drm/panel framework, john cherry-picked to mainline,
   folded down other fixes from Vinay and Archit, vinay removed interface
   setting cmd mode, video mode panel selected

v2:
 * incorporated code reviews from thierry, archit
   code style, alphabetical sorting in Makefile, Kconfig, regulator_bulk,
   arrays of u8, generic helper function, documentation bindings,

v3:
 * dcs backlight support added
 * tested this panel driver in nexus7 2013 device

v4:
 * backlight interface added in the panel driver
 * incorporated width_mm and height_mm suggested by rob herring

v5:
 * thierry review comments incorporated
   panel model naming consistent, alphabetical sorting in Kconfig
   Makefile, MAX_BRIGHTNESS dropped, regulator_names, parameterize
   panel width and height, description for control display, cabc
   and interface setting, temporary variable removed, consistent
   error reporting and commit message
 * removed tear on/off, scanline, since these are required only
   for command mode panels

v6:
 * emil review comments incorporated
   PANEL_NUM_REGULATORS dropped, return ret added at necessary
   places, if checks dropped for backlight and gpios

v7:
 * emil review comments incorporated
   added ARRAY_SIZE in struct, regulator_bulk_disable in poweroff,
   gpios checks dropped.
   some returns cannot be dropped, since drm panel framework return
   type required.
---
 drivers/gpu/drm/panel/Kconfig  |  11 +
 drivers/gpu/drm/panel/Makefile |   1 +
 drivers/gpu/drm/panel/panel-jdi-lt070me05000.c | 495 +
 3 files changed, 507 insertions(+)
 create mode 100644 drivers/gpu/drm/panel/panel-jdi-lt070me05000.c

diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
index 1500ab9..62aba97 100644
--- a/drivers/gpu/drm/panel/Kconfig
+++ b/drivers/gpu/drm/panel/Kconfig
@@ -18,6 +18,17 @@ config DRM_PANEL_SIMPLE
  that it can be automatically turned off when the panel goes into a
  low power state.
 
+config DRM_PANEL_JDI_LT070ME05000
+   tristate "JDI LT070ME05000 WUXGA DSI panel"
+   depends on OF
+   depends on DRM_MIPI_DSI
+   depends on BACKLIGHT_CLASS_DEVICE
+   help
+ Say Y here if you want to enable support for the JDI DSI video mode
+ panel as found in Google Nexus 7 (2013) devices.
+ The panel has a 1200(RGB)×1920 (WUXGA) resolution and uses
+ 24 bits per pixel.
+
 config DRM_PANEL_SAMSUNG_LD9040
tristate "Samsung LD9040 RGB/SPI panel"
depends on OF && SPI
diff --git a/drivers/gpu/drm/panel/Makefile b/drivers/gpu/drm/panel/Makefile
index f277eed..a5c7ec0 100644
--- a/drivers/gpu/drm/panel/Makefile
+++ b/drivers/gpu/drm/panel/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_DRM_PANEL_SIMPLE) += panel-simple.o
+obj-$(CONFIG_DRM_PANEL_JDI_LT070ME05000) += panel-jdi-lt070me05000.o
 obj-$(CONFIG_DRM_PANEL_LG_LG4573) += panel-lg-lg4573.o
 obj-$(CONFIG_DRM_PANEL_PANASONIC_VVX10F034N00) += 
panel-panasonic-vvx10f034n00.o
 obj-$(CONFIG_DRM_PANEL_SAMSUNG_LD9040) += panel-samsung-ld9040.o
diff --git a/drivers/gpu/drm/panel/panel-jdi-lt070me05000.c 
b/drivers/gpu/drm/panel/panel-jdi-lt070me05000.c
new file mode 100644
index 000..888fe2b
--- /dev/null
+++ b/drivers/gpu/drm/panel/panel-jdi-lt070me05000.c
@@ -0,0 +1,495 @@
+/*
+ * Copyright (C) 2016 InforceComputing
+ * Author: Vinay Simha BN 
+ *
+ * Copyright (C) 2016 Linaro Ltd
+ * Author: Sumit Semwal 
+ *
+ * From internet archives, the panel for Nexus 7 2nd Gen, 2013 model is a
+ * JDI model LT070ME05000, and its data sheet is at:
+ * http://panelone.net/en/7-0-inch/JDI_LT070ME05000_7.0_inch-datasheet
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 

[PATCH] ACPICA: Use acpi_os_allocate_zeroed

2016-06-19 Thread Amitoj Kaur Chawla
acpi_os_allocate_zeroed combines acpi_os_allocate and a zeroing memset.

The Coccinelle semantic patch used to make this change is as follows:
@@
type T;
T *d;
expression e;
statement S;
@@

d =
-acpi_os_allocate
+acpi_os_allocate_zeroed
 (...);
if (!d) S
-   memset(d, 0, sizeof(T));

Signed-off-by: Amitoj Kaur Chawla 
---
 drivers/acpi/acpica/utcache.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/acpi/acpica/utcache.c b/drivers/acpi/acpica/utcache.c
index f8e9978..862f963 100644
--- a/drivers/acpi/acpica/utcache.c
+++ b/drivers/acpi/acpica/utcache.c
@@ -77,14 +77,13 @@ acpi_os_create_cache(char *cache_name,
 
/* Create the cache object */
 
-   cache = acpi_os_allocate(sizeof(struct acpi_memory_list));
+   cache = acpi_os_allocate_zeroed(sizeof(struct acpi_memory_list));
if (!cache) {
return (AE_NO_MEMORY);
}
 
/* Populate the cache object and return it */
 
-   memset(cache, 0, sizeof(struct acpi_memory_list));
cache->list_name = cache_name;
cache->object_size = object_size;
cache->max_depth = max_depth;
-- 
1.9.1





Re: [PATCH v5 0/7] /dev/random - a new approach

2016-06-19 Thread Stephan Mueller
Am Sonntag, 19. Juni 2016, 21:36:14 schrieb Pavel Machek:

Hi Pavel,

> On Sun 2016-06-19 17:58:41, Stephan Mueller wrote:
> > Hi Herbert, Ted,
> > 
> > The following patch set provides a different approach to /dev/random which
> > I call Linux Random Number Generator (LRNG) to collect entropy within the
> > Linux kernel. The main improvements compared to the legacy /dev/random are
> > to provide sufficient entropy during boot time as well as in virtual
> > environments and when using SSDs. A secondary design goal is to limit the
> > impact of the entropy collection on massively parallel systems and also to
> > allow the use of accelerated cryptographic primitives. Also, all steps of
> > the entropic data processing are testable. Finally massive performance
> > improvements are visible at /dev/urandom and get_random_bytes.
> 
> Dunno. It is very similar to existing rng, AFAICT. And at the very
> least, constants in existing RNG could be tuned to provide "entropy at
> the boot time".

The key differences, and thus the main concerns I have with the current design, 
are the following items. Changing them would be an intrusive change, and so far 
I have not seen intrusive changes being accepted. This led me to develop a 
competing algorithm.

- Correlation of noise sources: as outlined in [1] chapter 1, the three noise 
sources of the legacy /dev/random implementation have a high correlation. Such 
correlation is due to the fact that a HID/disk event at the same time produces 
an IRQ event. The time stamps (which deliver the majority of the entropy) of 
both events are correlated. I would think that the maintenance of the 
fast_pools breaks that correlation to some degree, though how much of the 
correlation is broken is unknown.

- Awarding IRQs only 1/64th bit of entropy compared to HID and disk noise 
sources is warranted due to the correlation. As I try to show, IRQs have a 
much higher entropy rate than what they are credited currently. But we cannot 
set that value higher due to the correlation issue. That means, currently we 
prefer desktop machines over server type systems since servers usually have no 
HID. In addition, with SSDs or virtio-disks the disk noise source is 
deactivated (again, common use cases for servers). Hence, server environments 
are heavily penalized. (Note, awarding IRQ events one bit of entropy is the 
root cause why my approach claims to be seeded very fast during boot time. 
Furthermore, as outlined in [1] chapter 1 and 2, IRQ events are entropic even 
in virtual machines which implies that even in VMs, my approach works well.)

- I am not sure the current way of crediting entropy has anything to do with 
actual entropy. It just happens to underestimate our entropy, so it does not 
hurt. I see no sensible reason why the calculation of an entropy estimate 
rests on the first, second, and third derivatives of the jiffies -- the 
jiffies hardly deliver any entropy, so why should they be a basis for entropy 
calculation?

- There was a debate around my approach assuming one bit of entropy per 
received IRQ. I am really wondering about that discussion when there is a much 
bigger "forecast" problem with the legacy /dev/random: how can we credit HIDs 
up to 11 bits of entropy when the user (a potential adversary) triggers these 
events? I am sure I would be shot down with such an approach if I would 
deliver that with a new implementation.

- The delivery of entropic data from the input_pool to the (non)blocking_pools 
is not atomic (for lack of a better word), i.e. one block of data with a given 
entropy content is injected into the (non)blocking_pool while the output pool 
is still locked (the user cannot obtain data during that injection time). 
With Ted's new patch set, two 64 bit blocks from the fast_pools are injected 
into the ChaCha20 DRNG. So, it is clearly better than previously. But still, 
with the blocking_pool, we face that issue. The reason for that issue is 
outlined in [1] 2.1. In the pathological case with an active attack, 
/dev/random could have a security strength of 2 * 128 bits and not 2^128 
bits when reading 128 bits out of it (the numbers are for illustration only; 
it is a bit better as /dev/random is woken up at random_read_wakeup_bits 
intervals -- but that number can be set to dangerously low levels, down to 8 
bits).
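
The divide-and-conquer effect behind these numbers can be made explicit. As 
one way to read the pathological case of [1] 2.1 (an illustration, assuming 
the attacker can observe pool output between the individual injections):

```latex
% If n bits of output are seeded via k independent injections of b = n/k bits
% each, and an attacker can attack each injection separately, the brute-force
% effort drops from the joint work factor to the staged one:
W_{\text{joint}} = 2^{n} = 2^{kb},
\qquad
W_{\text{staged}} = k \cdot 2^{b} \ll 2^{kb} \quad (k \ge 2).
% Example: n = 128, k = 2, b = 64 gives 2 \cdot 2^{64} = 2^{65} guesses
% instead of 2^{128}.
```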


[1] http://www.chronox.de/lrng/doc/lrng.pdf

Ciao
Stephan


Re: [PATCH] clk: samsung: exynos5433: use clock_ignore_unused flag for SPI3 related clocks

2016-06-19 Thread Andi Shyti
Hi Tomasz,

> >> > The SPI 3 bus uses two clocks, a bus clock and an input clock.
> >> > Do not disable the clocks when unused in order to allow access to
> >> > the SPI 3 device.
> >>
> >> If unused, why would access to SPI 3 device needed?
> >
> > because next I will submit a small driver which uses the SPI3.
> > Actually in the exynos5433 boards all the SPI are used but not all
> > the drivers are ported to mainline.
> 
> Then shouldn't the driver request the clocks and enable them? Or I'm
> missing something obvious? :)

the reason is that...

[ from the patch ]

> GATE(CLK_SCLK_IOCLK_SPI3, "sclk_ioclk_spi3", "ioclk_spi3_clk_in",
> -   ENABLE_SCLK_PERIC, 20, CLK_SET_RATE_PARENT, 0),
> +   ENABLE_SCLK_PERIC, 20,
> +   CLK_IGNORE_UNUSED | CLK_SET_RATE_PARENT, 0),

... the sclk_ioclk_spi3 is new in exynos5433 and there is no
implementation for enabling/disabling that particular clock...

> GATE(CLK_SCLK_SPI3, "sclk_spi3", "sclk_spi3_peric", ENABLE_SCLK_PERIC,
> -   18, CLK_SET_RATE_PARENT, 0),
> +   18, CLK_IGNORE_UNUSED | CLK_SET_RATE_PARENT, 0),

... in this case your question makes sense, but it depends
on which clock the device (s3c64xx) is requesting (from the DTS).
In any case, I kept it consistent with the SPI1, which falls in
the same case, as in mainline we don't have any DTS for
exynos5433 (yet!).

Thanks,
Andi
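
For completeness, Tomasz's suggestion -- the consumer driver requesting and 
enabling its clocks itself instead of relying on CLK_IGNORE_UNUSED -- would 
normally look roughly like this in a probe routine (a sketch with illustrative 
clock names; the real spi-s3c64xx driver differs in detail):

```c
/* Sketch only: consumer driver takes its own bus and input clocks. */
static int foo_spi_probe(struct platform_device *pdev)
{
	struct clk *busclk, *ioclk;
	int ret;

	busclk = devm_clk_get(&pdev->dev, "spi_busclk0");
	if (IS_ERR(busclk))
		return PTR_ERR(busclk);

	ioclk = devm_clk_get(&pdev->dev, "spi_ioclk");
	if (IS_ERR(ioclk))
		return PTR_ERR(ioclk);

	ret = clk_prepare_enable(busclk);	/* keeps the gate open while in use */
	if (ret)
		return ret;

	ret = clk_prepare_enable(ioclk);
	if (ret) {
		clk_disable_unprepare(busclk);
		return ret;
	}

	/* ... rest of probe ... */
	return 0;
}
```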


Re: [RFC PATCH] net: macb: Add gmii2rgmii converter support

2016-06-19 Thread Florian Fainelli
On June 19, 2016 10:27:17 PM MST, Kedareswara rao Appana 
 wrote:
>This patch adds support for gmii2rgmii converter
>in the macb driver.
>
>The GMII to RGMII IP core provides the
>Reduced Gigabit Media Independent Interface
>(RGMII) between Ethernet physical media devices
>And the Gigabit Ethernet controller.
>This core can switch dynamically between the
>Three different speed modes of operation (10/100/1000 Mb/s).
>MDIO interface is used to set operating speed of Ethernet MAC.
>
>Signed-off-by: Kedareswara rao Appana 
>---
>--> Tried to include this Coverter support in the
>PHY layer but it won't fit into the PHY framework as the
>coverter won't have vaild vendor/Device id registers.
>--> The Converter has only one register (16) that need's
>to be programmed with the external phy negotiated speed.
>--> The converter won't follow the Standard MII(ieee 802.3 clause 22).
>--> Will appreciate if someone can help on adding this coverter support

With the emulated fixed PHY and a registered link_update callback (see 
drivers/net/dsa/bcm_sf2.c for an example), you could read the specific 
registers which indicate the link parameters and update the PHY device with 
these. 
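
Concretely, the approach Florian points to looks roughly like this (a sketch 
modeled on the bcm_sf2 pattern; the converter's register, bit, and address 
names below are hypothetical):

```c
/* Sketch only: register a fixed PHY for the MAC and let a link_update
 * callback poll the GMII-to-RGMII converter's status register. */
static int macb_gmii2rgmii_link_update(struct net_device *dev,
				       struct fixed_phy_status *status)
{
	struct macb *bp = netdev_priv(dev);
	u16 val = mdiobus_read(bp->mii_bus, GMII2RGMII_PHY_ADDR,
			       GMII2RGMII_REG_STATUS);	/* assumed register */

	status->link = !!(val & GMII2RGMII_LINK_UP);	/* assumed bits */
	status->speed = (val & GMII2RGMII_SPEED_1000) ? 1000 :
			(val & GMII2RGMII_SPEED_100) ? 100 : 10;
	status->duplex = DUPLEX_FULL;
	return 0;
}

/* During probe, after registering the fixed PHY: */
/* fixed_phy_set_link_update(phydev, macb_gmii2rgmii_link_update); */
```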

How exactly is this converter working?

-- 
Florian


Re: Add mt6755 basic chip support

2016-06-19 Thread Mars Cheng
On Tue, 2016-06-14 at 10:20 +0800, Mars Cheng wrote:
> This patch adds basic support for Mediatek's new 8-core chip, mt6755.
> It is also named as Helio P10. It is based on 4.7-rc1
> 
> Mars Cheng (2):
>   Document: DT: Add bindings for mediatek MT6755 SoC Platform
>   arm64: dts: mediatek: add mt6755 support
> 
>  Documentation/devicetree/bindings/arm/mediatek.txt |4 +
>  .../interrupt-controller/mediatek,sysirq.txt   |1 +
>  .../devicetree/bindings/serial/mtk-uart.txt|1 +
>  arch/arm64/boot/dts/mediatek/Makefile  |1 +
>  arch/arm64/boot/dts/mediatek/mt6755-phone.dts  |   39 ++
>  arch/arm64/boot/dts/mediatek/mt6755.dtsi   |  143 
> 
>  6 files changed, 189 insertions(+)
>  create mode 100644 arch/arm64/boot/dts/mediatek/mt6755-phone.dts
>  create mode 100644 arch/arm64/boot/dts/mediatek/mt6755.dtsi
> 
Hi Matthias

Would you like to merge this patch set? If there are any suggestions or
issues, I will fix them ASAP.

Thanks a lot.
> 
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek




RE: [PATCH net-next] net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)

2016-06-19 Thread Rosen, Rami
Hi all, 

A very limited review below.

+
+   /* get capabilities of particular feature */
+   ENA_ADMIN_GET_FEATURE = 8,

Instead of "/* get capabilities of particular feature */", the comment on 
ENA_ADMIN_SET_FEATURE below SHOULD BE "/* set capabilities of particular 
feature */".
+
+   /* get capabilities of particular feature */
+   ENA_ADMIN_SET_FEATURE = 9,
+
..


+int ena_com_set_hash_ctrl(struct ena_com_dev *ena_dev)
+{
+   struct ena_com_admin_queue *admin_queue = &ena_dev->admin_queue;
...

You set ret=-EINVAL, but you do not use this ret as you immediately return 0 in 
the next line, which is the end of the method. Either return ret or return 
-EINVAL.
+   if (unlikely(ret)) {
+   ena_trc_err("Failed to set hash input. error: %d\n", ret);
+   ret = -EINVAL;
+   }
+
+   return 0;
+}
+
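
One way to address this -- a sketch of the corrected tail of 
ena_com_set_hash_ctrl(), following the second option Rami suggests (returning 
-EINVAL directly):

```c
	if (unlikely(ret)) {
		ena_trc_err("Failed to set hash input. error: %d\n", ret);
		return -EINVAL;	/* propagate the failure instead of falling through */
	}

	return 0;
}
```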



+
+/* ena_com_set_mmio_read_mode - Enable/disable the mmio reg read mechanism
+ * @ena_dev: ENA communication layer struct


Instead of "realess_supported" it SHOULD BE "readless_supported":

+ * @realess_supported: readless mode (enable/disable)
+ */
+void ena_com_set_mmio_read_mode(struct ena_com_dev *ena_dev,
+   bool readless_supported);
+



+
+/* ena_com_create_io_queue - Create io queue.
+ * @ena_dev: ENA communication layer struct

Instead of "ena_com_create_io_ctx" it SHOULD BE "@ena_com_create_io_ctx":

+ * ena_com_create_io_ctx - create context structure
+ *
+ * Create the submission and the completion queues.
+ *
+ * @return - 0 on success, negative value on failure.
+ */
+int ena_com_create_io_queue(struct ena_com_dev *ena_dev,
+   struct ena_com_create_io_ctx *ctx);
+




+/* ena_com_admin_destroy - Destroy IO queue with the queue id - qid.
+ * @ena_dev: ENA communication layer struct

Missing: @qid

+ */
+void ena_com_destroy_io_queue(struct ena_com_dev *ena_dev, u16 qid);
+

Regards,
Rami Rosen
Intel Corporation


[PATCH 5/6] staging: rtl8188eu: remove EFUSE_GetEfuseDefinition function

2016-06-19 Thread Ivan Safonov
This function is not used.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c| 63 ---
 drivers/staging/rtl8188eu/include/rtw_efuse.h |  2 -
 2 files changed, 65 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c 
b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index e783102..6532441 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -317,69 +317,6 @@ void efuse_ReadEFuse(struct adapter *Adapter, u8 
efuseType, u16 _offset, u16 _si
}
 }
 
-/* Do not support BT */
-void EFUSE_GetEfuseDefinition(struct adapter *pAdapter, u8 efuseType, u8 type, 
void *pOut)
-{
-   switch (type) {
-   case TYPE_EFUSE_MAX_SECTION:
-   {
-   u8 *pMax_section;
-   pMax_section = pOut;
-   *pMax_section = EFUSE_MAX_SECTION_88E;
-   }
-   break;
-   case TYPE_EFUSE_REAL_CONTENT_LEN:
-   {
-   u16 *pu2Tmp;
-   pu2Tmp = pOut;
-   *pu2Tmp = EFUSE_REAL_CONTENT_LEN_88E;
-   }
-   break;
-   case TYPE_EFUSE_CONTENT_LEN_BANK:
-   {
-   u16 *pu2Tmp;
-   pu2Tmp = pOut;
-   *pu2Tmp = EFUSE_REAL_CONTENT_LEN_88E;
-   }
-   break;
-   case TYPE_AVAILABLE_EFUSE_BYTES_BANK:
-   {
-   u16 *pu2Tmp;
-   pu2Tmp = pOut;
-   *pu2Tmp = 
(u16)(EFUSE_REAL_CONTENT_LEN_88E-EFUSE_OOB_PROTECT_BYTES_88E);
-   }
-   break;
-   case TYPE_AVAILABLE_EFUSE_BYTES_TOTAL:
-   {
-   u16 *pu2Tmp;
-   pu2Tmp = pOut;
-   *pu2Tmp = 
(u16)(EFUSE_REAL_CONTENT_LEN_88E-EFUSE_OOB_PROTECT_BYTES_88E);
-   }
-   break;
-   case TYPE_EFUSE_MAP_LEN:
-   {
-   u16 *pu2Tmp;
-   pu2Tmp = pOut;
-   *pu2Tmp = (u16)EFUSE_MAP_LEN_88E;
-   }
-   break;
-   case TYPE_EFUSE_PROTECT_BYTES_BANK:
-   {
-   u8 *pu1Tmp;
-   pu1Tmp = pOut;
-   *pu1Tmp = (u8)(EFUSE_OOB_PROTECT_BYTES_88E);
-   }
-   break;
-   default:
-   {
-   u8 *pu1Tmp;
-   pu1Tmp = pOut;
-   *pu1Tmp = 0;
-   }
-   break;
-   }
-}
-
 u8 Efuse_WordEnableDataWrite(struct adapter *pAdapter, u16 efuse_addr, u8 
word_en, u8 *data)
 {
u16 tmpaddr = 0;
diff --git a/drivers/staging/rtl8188eu/include/rtw_efuse.h 
b/drivers/staging/rtl8188eu/include/rtw_efuse.h
index 9bfb10c..9e7d135 100644
--- a/drivers/staging/rtl8188eu/include/rtw_efuse.h
+++ b/drivers/staging/rtl8188eu/include/rtw_efuse.h
@@ -95,8 +95,6 @@ struct efuse_hal {
 };
 
 u8 Efuse_CalculateWordCnts(u8 word_en);
-void EFUSE_GetEfuseDefinition(struct adapter *adapt, u8 type, u8 type1,
- void *out);
 u8 efuse_OneByteRead(struct adapter *adapter, u16 addr, u8 *data);
 u8 efuse_OneByteWrite(struct adapter *adapter, u16 addr, u8 data);
 
-- 
2.7.3



[PATCH 6/6] staging: rtl8188eu: remove _EFUSE_DEF_TYPE enum

2016-06-19 Thread Ivan Safonov
This enumeration is not used.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/include/rtw_efuse.h | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/staging/rtl8188eu/include/rtw_efuse.h 
b/drivers/staging/rtl8188eu/include/rtw_efuse.h
index 9e7d135..168c12d 100644
--- a/drivers/staging/rtl8188eu/include/rtw_efuse.h
+++ b/drivers/staging/rtl8188eu/include/rtw_efuse.h
@@ -34,16 +34,6 @@
 #defineEFUSE_WIFI  0
 #defineEFUSE_BT1
 
-enum _EFUSE_DEF_TYPE {
-   TYPE_EFUSE_MAX_SECTION  = 0,
-   TYPE_EFUSE_REAL_CONTENT_LEN = 1,
-   TYPE_AVAILABLE_EFUSE_BYTES_BANK = 2,
-   TYPE_AVAILABLE_EFUSE_BYTES_TOTAL= 3,
-   TYPE_EFUSE_MAP_LEN  = 4,
-   TYPE_EFUSE_PROTECT_BYTES_BANK   = 5,
-   TYPE_EFUSE_CONTENT_LEN_BANK = 6,
-};
-
 /* E-Fuse */
 #define EFUSE_MAP_SIZE  512
 #define EFUSE_MAX_SIZE  256
-- 
2.7.3



[PATCH 2/6] staging: rtl8188eu: remove efuse_max variable in hal_EfusePartialWriteCheck

2016-06-19 Thread Ivan Safonov
This variable is not used after being assigned a value.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c 
b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index ea28fa1..9d5bd43 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -769,11 +769,10 @@ static bool hal_EfusePartialWriteCheck(struct adapter 
*pAdapter, u8 efuseType, u
bool bRet = false;
u8 i, efuse_data = 0, cur_header = 0;
u8 matched_wden = 0, badworden = 0;
-   u16 startAddr = 0, efuse_max_available_len = 0, efuse_max = 0;
+   u16 startAddr = 0, efuse_max_available_len = 0;
struct pgpkt curPkt;
 
	EFUSE_GetEfuseDefinition(pAdapter, efuseType, 
TYPE_AVAILABLE_EFUSE_BYTES_BANK, (void *)&efuse_max_available_len);
-   EFUSE_GetEfuseDefinition(pAdapter, efuseType, 
TYPE_EFUSE_REAL_CONTENT_LEN, (void *)&efuse_max);
 
	rtw_hal_get_hwreg(pAdapter, HW_VAR_EFUSE_BYTES, (u8 *)&startAddr);
startAddr %= EFUSE_REAL_CONTENT_LEN;
-- 
2.7.3



[PATCH 1/6] staging: rtl8188eu: replace EFUSE_GetEfuseDefinition(..., TYPE_EFUSE_MAP_LEN, ...) call with its result (EFUSE_MAP_LEN_88E)

2016-06-19 Thread Ivan Safonov
This makes the code easier to read.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c 
b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index c17870c..ea28fa1 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -846,12 +846,7 @@ hal_EfusePgCheckAvailableAddr(
u8 efuseType
)
 {
-   u16 efuse_max_available_len = 0;
-
-   /* Change to check TYPE_EFUSE_MAP_LEN , because 8188E raw 256, logic 
map over 256. */
-   EFUSE_GetEfuseDefinition(pAdapter, EFUSE_WIFI, TYPE_EFUSE_MAP_LEN, 
(void *)&efuse_max_available_len);
-
-   if (Efuse_GetCurrentSize(pAdapter) >= efuse_max_available_len)
+   if (Efuse_GetCurrentSize(pAdapter) >= EFUSE_MAP_LEN_88E)
return false;
return true;
 }
@@ -977,13 +972,9 @@ void efuse_WordEnableDataRead(u8 word_en, u8 *sourdata, u8 
*targetdata)
  */
 static void Efuse_ReadAllMap(struct adapter *pAdapter, u8 efuseType, u8 *Efuse)
 {
-   u16 mapLen = 0;
-
Efuse_PowerSwitch(pAdapter, false, true);
 
-   EFUSE_GetEfuseDefinition(pAdapter, efuseType, TYPE_EFUSE_MAP_LEN, (void 
*)&mapLen);
-
-   efuse_ReadEFuse(pAdapter, efuseType, 0, mapLen, Efuse);
+   efuse_ReadEFuse(pAdapter, efuseType, 0, EFUSE_MAP_LEN_88E, Efuse);
 
Efuse_PowerSwitch(pAdapter, false, false);
 }
@@ -996,12 +987,9 @@ void EFUSE_ShadowMapUpdate(
u8 efuseType)
 {
struct eeprom_priv *pEEPROM = GET_EEPROM_EFUSE_PRIV(pAdapter);
-   u16 mapLen = 0;
-
-   EFUSE_GetEfuseDefinition(pAdapter, efuseType, TYPE_EFUSE_MAP_LEN, (void 
*)&mapLen);
 
if (pEEPROM->bautoload_fail_flag)
-   memset(pEEPROM->efuse_eeprom_data, 0xFF, mapLen);
+   memset(pEEPROM->efuse_eeprom_data, 0xFF, EFUSE_MAP_LEN_88E);
else
Efuse_ReadAllMap(pAdapter, efuseType, 
pEEPROM->efuse_eeprom_data);
 }
-- 
2.7.3



[PATCH 3/6] staging: rtl8188eu: replace EFUSE_GetEfuseDefinition(..., TYPE_EFUSE_MAX_SECTION, ) with a = EFUSE_MAX_SECTION_88E

2016-06-19 Thread Ivan Safonov
This makes the code easier to read.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c 
b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index 9d5bd43..1e96a81 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -483,14 +483,11 @@ int Efuse_PgPacketRead(struct adapter *pAdapter, u8 
offset, u8 *data)
u8 hoffset = 0, hworden = 0;
u8 tmpidx = 0;
u8 tmpdata[8];
-   u8 max_section = 0;
u8 tmp_header = 0;
 
-   EFUSE_GetEfuseDefinition(pAdapter, EFUSE_WIFI, TYPE_EFUSE_MAX_SECTION, 
(void *)&max_section);
-
if (!data)
return false;
-   if (offset > max_section)
+   if (offset > EFUSE_MAX_SECTION_88E)
return false;
 
memset(data, 0xff, sizeof(u8) * PGPKT_DATA_SIZE);
-- 
2.7.3



[PATCH 2/6] staging: rtl8188eu: remove efuse_max variable in hal_EfusePartialWriteCheck

2016-06-19 Thread Ivan Safonov
This variable does not used after assigning value.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c 
b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index ea28fa1..9d5bd43 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -769,11 +769,10 @@ static bool hal_EfusePartialWriteCheck(struct adapter 
*pAdapter, u8 efuseType, u
bool bRet = false;
u8 i, efuse_data = 0, cur_header = 0;
u8 matched_wden = 0, badworden = 0;
-   u16 startAddr = 0, efuse_max_available_len = 0, efuse_max = 0;
+   u16 startAddr = 0, efuse_max_available_len = 0;
struct pgpkt curPkt;
 
EFUSE_GetEfuseDefinition(pAdapter, efuseType, 
TYPE_AVAILABLE_EFUSE_BYTES_BANK, (void *)_max_available_len);
-   EFUSE_GetEfuseDefinition(pAdapter, efuseType, 
TYPE_EFUSE_REAL_CONTENT_LEN, (void *)_max);
 
rtw_hal_get_hwreg(pAdapter, HW_VAR_EFUSE_BYTES, (u8 *));
startAddr %= EFUSE_REAL_CONTENT_LEN;
-- 
2.7.3



[PATCH 1/6] staging: rtl8188eu: replace EFUSE_GetEfuseDefinition(..., TYPE_EFUSE_MAP_LEN, ...) call with it's result (EFUSE_MAP_LEN_88E)

2016-06-19 Thread Ivan Safonov
This makes the code easier to read.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c 
b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index c17870c..ea28fa1 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -846,12 +846,7 @@ hal_EfusePgCheckAvailableAddr(
u8 efuseType
)
 {
-   u16 efuse_max_available_len = 0;
-
-   /* Change to check TYPE_EFUSE_MAP_LEN , because 8188E raw 256, logic 
map over 256. */
-   EFUSE_GetEfuseDefinition(pAdapter, EFUSE_WIFI, TYPE_EFUSE_MAP_LEN, 
(void *)_max_available_len);
-
-   if (Efuse_GetCurrentSize(pAdapter) >= efuse_max_available_len)
+   if (Efuse_GetCurrentSize(pAdapter) >= EFUSE_MAP_LEN_88E)
return false;
return true;
 }
@@ -977,13 +972,9 @@ void efuse_WordEnableDataRead(u8 word_en, u8 *sourdata, u8 
*targetdata)
  */
 static void Efuse_ReadAllMap(struct adapter *pAdapter, u8 efuseType, u8 *Efuse)
 {
-   u16 mapLen = 0;
-
Efuse_PowerSwitch(pAdapter, false, true);
 
-   EFUSE_GetEfuseDefinition(pAdapter, efuseType, TYPE_EFUSE_MAP_LEN, (void 
*));
-
-   efuse_ReadEFuse(pAdapter, efuseType, 0, mapLen, Efuse);
+   efuse_ReadEFuse(pAdapter, efuseType, 0, EFUSE_MAP_LEN_88E, Efuse);
 
Efuse_PowerSwitch(pAdapter, false, false);
 }
@@ -996,12 +987,9 @@ void EFUSE_ShadowMapUpdate(
u8 efuseType)
 {
struct eeprom_priv *pEEPROM = GET_EEPROM_EFUSE_PRIV(pAdapter);
-   u16 mapLen = 0;
-
-   EFUSE_GetEfuseDefinition(pAdapter, efuseType, TYPE_EFUSE_MAP_LEN, (void 
*));
 
if (pEEPROM->bautoload_fail_flag)
-   memset(pEEPROM->efuse_eeprom_data, 0xFF, mapLen);
+   memset(pEEPROM->efuse_eeprom_data, 0xFF, EFUSE_MAP_LEN_88E);
else
Efuse_ReadAllMap(pAdapter, efuseType, 
pEEPROM->efuse_eeprom_data);
 }
-- 
2.7.3



[PATCH 3/6] staging: rtl8188eu: replace EFUSE_GetEfuseDefinition(..., TYPE_EFUSE_MAX_SECTION, ) with a = EFUSE_MAX_SECTION_88E

2016-06-19 Thread Ivan Safonov
This makes the code easier to read.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c 
b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index 9d5bd43..1e96a81 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -483,14 +483,11 @@ int Efuse_PgPacketRead(struct adapter *pAdapter, u8 
offset, u8 *data)
u8 hoffset = 0, hworden = 0;
u8 tmpidx = 0;
u8 tmpdata[8];
-   u8 max_section = 0;
u8 tmp_header = 0;
 
-   EFUSE_GetEfuseDefinition(pAdapter, EFUSE_WIFI, TYPE_EFUSE_MAX_SECTION, 
(void *)_section);
-
if (!data)
return false;
-   if (offset > max_section)
+   if (offset > EFUSE_MAX_SECTION_88E)
return false;
 
memset(data, 0xff, sizeof(u8) * PGPKT_DATA_SIZE);
-- 
2.7.3



[PATCH 6/6] staging: rtl8188eu: remove _EFUSE_DEF_TYPE enum

2016-06-19 Thread Ivan Safonov
This enumeration does not used.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/include/rtw_efuse.h | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/staging/rtl8188eu/include/rtw_efuse.h b/drivers/staging/rtl8188eu/include/rtw_efuse.h
index 9e7d135..168c12d 100644
--- a/drivers/staging/rtl8188eu/include/rtw_efuse.h
+++ b/drivers/staging/rtl8188eu/include/rtw_efuse.h
@@ -34,16 +34,6 @@
 #defineEFUSE_WIFI  0
 #defineEFUSE_BT1
 
-enum _EFUSE_DEF_TYPE {
-   TYPE_EFUSE_MAX_SECTION  = 0,
-   TYPE_EFUSE_REAL_CONTENT_LEN = 1,
-   TYPE_AVAILABLE_EFUSE_BYTES_BANK = 2,
-   TYPE_AVAILABLE_EFUSE_BYTES_TOTAL= 3,
-   TYPE_EFUSE_MAP_LEN  = 4,
-   TYPE_EFUSE_PROTECT_BYTES_BANK   = 5,
-   TYPE_EFUSE_CONTENT_LEN_BANK = 6,
-};
-
 /* E-Fuse */
 #define EFUSE_MAP_SIZE  512
 #define EFUSE_MAX_SIZE  256
-- 
2.7.3



[RFC PATCH] net: macb: Add gmii2rgmii converter support

2016-06-19 Thread Kedareswara rao Appana
This patch adds support for the gmii2rgmii converter
in the macb driver.

The GMII to RGMII IP core provides the
Reduced Gigabit Media Independent Interface
(RGMII) between Ethernet physical media devices
and the Gigabit Ethernet controller.
This core can switch dynamically between the
three different speed modes of operation (10/100/1000 Mb/s).
The MDIO interface is used to set the operating speed of the Ethernet MAC.

Signed-off-by: Kedareswara rao Appana 
---
--> Tried to include this converter support in the
PHY layer, but it won't fit into the PHY framework as the
converter won't have valid vendor/device ID registers.
--> The converter has only one register (16) that needs
to be programmed with the external PHY's negotiated speed.
--> The converter won't follow the standard MII (IEEE 802.3 clause 22).
--> Will appreciate it if someone can help with adding this converter support.
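Since the converter exposes only that single register, the new logic in macb_handle_link_change() reduces to composing a register value from the negotiated duplex and speed. A standalone sketch of that mapping (the bit positions below are illustrative placeholders, not the values this patch defines in macb.h):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit assignments - placeholders, not the real macb.h values. */
#define GMII2RGMII_FULLDPLX  (1u << 8)
#define GMII2RGMII_SPEED100  (1u << 13)
#define GMII2RGMII_SPEED1000 (1u << 6)

/* Compose the converter register value from the PHY-negotiated link state,
 * mirroring the logic added to macb_handle_link_change(). */
static uint16_t gmii2rgmii_reg_value(bool full_duplex, int speed)
{
	uint16_t reg = 0;

	if (full_duplex)
		reg |= GMII2RGMII_FULLDPLX;
	if (speed == 100)
		reg |= GMII2RGMII_SPEED100;
	if (speed == 1000)
		reg |= GMII2RGMII_SPEED1000;
	return reg;
}
```

In the driver the composed value is then written out over MDIO when the converter node is present, via phy_write() against the converter's register number.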

 drivers/net/ethernet/cadence/macb.c |   37 --
 drivers/net/ethernet/cadence/macb.h |7 ++
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index cb07d95..2b6412a 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -305,8 +305,10 @@ static void macb_handle_link_change(struct net_device *dev)
 {
struct macb *bp = netdev_priv(dev);
struct phy_device *phydev = bp->phy_dev;
+   struct phy_device *gmii2rgmii_phydev = bp->gmii2rgmii_phy_dev;
unsigned long flags;
int status_change = 0;
+   u16 gmii2rgmii_reg = 0;
 
spin_lock_irqsave(&bp->lock, flags);
 
@@ -320,15 +322,26 @@ static void macb_handle_link_change(struct net_device *dev)
if (macb_is_gem(bp))
reg &= ~GEM_BIT(GBE);
 
-   if (phydev->duplex)
+   if (phydev->duplex) {
reg |= MACB_BIT(FD);
-   if (phydev->speed == SPEED_100)
+   gmii2rgmii_reg |= MACB_GMII2RGMII_FULLDPLX;
+   }
+   if (phydev->speed == SPEED_100) {
reg |= MACB_BIT(SPD);
+   gmii2rgmii_reg |= MACB_GMII2RGMII_SPEED100;
+   }
if (phydev->speed == SPEED_1000 &&
-   bp->caps & MACB_CAPS_GIGABIT_MODE_AVAILABLE)
+   bp->caps & MACB_CAPS_GIGABIT_MODE_AVAILABLE) {
reg |= GEM_BIT(GBE);
+   gmii2rgmii_reg |= MACB_GMII2RGMII_SPEED1000;
+   }
 
macb_or_gem_writel(bp, NCFGR, reg);
+   if (gmii2rgmii_phydev) {
+   phy_write(gmii2rgmii_phydev,
+ MACB_GMII2RGMII_REG_NUM,
+ gmii2rgmii_reg);
+   }
 
bp->speed = phydev->speed;
bp->duplex = phydev->duplex;
@@ -376,6 +389,20 @@ static int macb_mii_probe(struct net_device *dev)
int phy_irq;
int ret;
 
+   if (bp->gmii2rgmii_phy_node) {
+   phydev = of_phy_attach(bp->dev,
+  bp->gmii2rgmii_phy_node,
+  0, 0);
+   if (!phydev) {
+   dev_err(&bp->pdev->dev, "%s: no gmii_to_rgmii found\n",
+   dev->name);
+   return -1;
+   }
+   bp->gmii2rgmii_phy_dev = phydev;
+   } else {
+   bp->gmii2rgmii_phy_dev = NULL;
+   }
+
phydev = phy_find_first(bp->mii_bus);
if (!phydev) {
netdev_err(dev, "no PHY found\n");
@@ -3001,6 +3028,8 @@ static int macb_probe(struct platform_device *pdev)
bp->phy_interface = err;
}
 
+   bp->gmii2rgmii_phy_node = of_parse_phandle(bp->pdev->dev.of_node,
+  "gmii2rgmii-phy-handle", 0);
/* IP specific init */
err = init(pdev);
if (err)
@@ -3059,6 +3088,8 @@ static int macb_remove(struct platform_device *pdev)
bp = netdev_priv(dev);
if (bp->phy_dev)
phy_disconnect(bp->phy_dev);
+   if (bp->gmii2rgmii_phy_dev)
+   phy_disconnect(bp->gmii2rgmii_phy_dev);
mdiobus_unregister(bp->mii_bus);
mdiobus_free(bp->mii_bus);
 
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 8a13824..625aaf3 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -559,6 +559,11 @@ struct macb_dma_desc {
 /* limit RX checksum offload to TCP and UDP packets */
 #define GEM_RX_CSUM_CHECKED_MASK   2

[PATCH 4/6] staging: rtl8188eu: replace EFUSE_GetEfuseDefinition(..., TYPE_AVAILABLE_EFUSE_BYTES_BANK, &a) call with a = (EFUSE_REAL_CONTENT_LEN_88E - EFUSE_OOB_PROTECT_BYTES_88E)

2016-06-19 Thread Ivan Safonov
This makes the code easier to read.

Signed-off-by: Ivan Safonov 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index 1e96a81..e783102 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -588,12 +588,12 @@ static bool hal_EfuseFixHeaderProcess(struct adapter *pAdapter, u8 efuseType, st
 static bool hal_EfusePgPacketWrite2ByteHeader(struct adapter *pAdapter, u8 efuseType, u16 *pAddr, struct pgpkt *pTargetPkt)
 {
bool bRet = false;
-   u16 efuse_addr = *pAddr, efuse_max_available_len = 0;
+   u16 efuse_addr = *pAddr;
+   u16 efuse_max_available_len =
+   EFUSE_REAL_CONTENT_LEN_88E - EFUSE_OOB_PROTECT_BYTES_88E;
u8 pg_header = 0, tmp_header = 0, pg_header_temp = 0;
u8 repeatcnt = 0;
 
-   EFUSE_GetEfuseDefinition(pAdapter, efuseType, TYPE_AVAILABLE_EFUSE_BYTES_BANK, (void *)&efuse_max_available_len);
-
while (efuse_addr < efuse_max_available_len) {
pg_header = ((pTargetPkt->offset & 0x07) << 5) | 0x0F;
efuse_OneByteWrite(pAdapter, efuse_addr, pg_header);
@@ -766,11 +766,11 @@ static bool hal_EfusePartialWriteCheck(struct adapter *pAdapter, u8 efuseType, u
bool bRet = false;
u8 i, efuse_data = 0, cur_header = 0;
u8 matched_wden = 0, badworden = 0;
-   u16 startAddr = 0, efuse_max_available_len = 0;
+   u16 startAddr = 0;
+   u16 efuse_max_available_len =
+   EFUSE_REAL_CONTENT_LEN_88E - EFUSE_OOB_PROTECT_BYTES_88E;
struct pgpkt curPkt;
 
-   EFUSE_GetEfuseDefinition(pAdapter, efuseType, TYPE_AVAILABLE_EFUSE_BYTES_BANK, (void *)&efuse_max_available_len);
-
rtw_hal_get_hwreg(pAdapter, HW_VAR_EFUSE_BYTES, (u8 *)&startAddr);
startAddr %= EFUSE_REAL_CONTENT_LEN;
 
-- 
2.7.3



Re: [PATCH v6 2/2] phy: rockchip-inno-usb2: add a new driver for Rockchip usb2phy

2016-06-19 Thread Guenter Roeck
Hi Frank,

On Sun, Jun 19, 2016 at 8:32 PM, Frank Wang  wrote:
> Hi Heiko & Guenter,
>
>
> On 2016/6/20 11:00, Guenter Roeck wrote:
>>
>> On Sun, Jun 19, 2016 at 6:27 PM, Frank Wang 
>> wrote:
>>>
>>> Hi Guenter,
>>>
>>>
>>> On 2016/6/17 21:20, Guenter Roeck wrote:

 Hi Frank,

 On 06/16/2016 11:43 PM, Frank Wang wrote:
>
> Hi Guenter,
>
> On 2016/6/17 12:59, Guenter Roeck wrote:
>>
>> On 06/16/2016 07:09 PM, Frank Wang wrote:
>>>
>>> The newer SoCs (rk3366, rk3399) take a different usb-phy IP block
>>> than rk3288 and before, and most of phy-related registers are also
>>> different from the past, so a new phy driver is required necessarily.
>>>
>>> Signed-off-by: Frank Wang 
>>> Suggested-by: Guenter Roeck 
>>> Suggested-by: Doug Anderson 
>>> Reviewed-by: Heiko Stuebner 
>>> Tested-by: Heiko Stuebner 
>>> ---
>>
>>
>> [ ... ]
>>
>>> +
>>> +static int rockchip_usb2phy_resume(struct phy *phy)
>>> +{
>>> +struct rockchip_usb2phy_port *rport = phy_get_drvdata(phy);
>>> +struct rockchip_usb2phy *rphy =
>>> dev_get_drvdata(phy->dev.parent);
>>> +int ret;
>>> +
>>> +dev_dbg(&rport->phy->dev, "port resume\n");
>>> +
>>> +ret = clk_prepare_enable(rphy->clk480m);
>>> +if (ret)
>>> +return ret;
>>> +
>>
>> If suspend can be called multiple times, resume can be called
>> multiple times as well. Doesn't this cause a clock imbalance
>> if you call clk_prepare_enable() multiple times on resume,
>> but clk_disable_unprepare() only once on suspend ?
>>
> Well, what you said is reasonable, How does something like below?
>
> @@ -307,6 +307,9 @@ static int rockchip_usb2phy_resume(struct phy *phy)
>
>   dev_dbg(>phy->dev, "port resume\n");
>
> +   if (!rport->suspended)
> +   return 0;
> +
>   ret = clk_prepare_enable(rphy->clk480m);
>   if (ret)
>   return ret;
> @@ -327,12 +330,16 @@ static int rockchip_usb2phy_suspend(struct phy
> *phy)
>
>   dev_dbg(>phy->dev, "port suspend\n");
>
> +   if (rport->suspended)
> +   return 0;
> +
>   ret = property_enable(rphy, &rport->port_cfg->phy_sus, true);
>   if (ret)
>   return ret;
>
>   rport->suspended = true;
>   clk_disable_unprepare(rphy->clk480m);
> +
>   return 0;
>}
>
> @@ -485,6 +492,7 @@ static int rockchip_usb2phy_host_port_init(struct
> rockchip_usb2phy *rphy,
>
>   rport->port_id = USB2PHY_PORT_HOST;
>   rport->port_cfg =
> &rphy->phy_cfg->port_cfgs[USB2PHY_PORT_HOST];
> +   rport->suspended = true;
>
 Why does it start in suspended mode ? That seems odd.

>>> This is an initialization. Using above design which make 'suspended' as a
>>> condition both in *_usb2phy_resume and *_usb2phy_suspend, I believe if it
>>> is
>>> not initialized as suspended mode, the first resume process will be
>>> skipped.
>>
>> I had to re-read the entire patch.
>>
>> Turns out my problem was one of terminology. Using "suspend" and
>> "resume" to me suggested the common use of suspend and resume
>> functions. That is not the case here. After mentally replacing
>> "suspend" with "power_off" and "resume" with "power_on", you are
>> right, no problem exists. Sorry for the noise.
>>
>> Maybe it would be useful to replace "resume" with "power_on" and
>> "suspend" with "power_off" in the function and variable names to
>> reduce confusion and misunderstandings.
>>
>> Thanks,
>> Guenter
>
>
> Well, it does have a bits confusion, however, the phy-port always just goes
> to suspend and resume mode (Not power off and power on) in a fact. So must
> it be renamed?
>

Other phy drivers name the functions _power_off and _power_on and
avoid the confusion. The callbacks are named .power_off and .power_on,
which gives a clear indication of its intended purpose. Other drivers
implementing suspend/resume (such as the omap usb phy driver) tie
those functions not into the power_off/power_on callbacks, but into
the driver's suspend/resume callbacks. At least the omap driver has
separate power management functions.

Do the functions _have_ to be renamed ? Surely not. But, if the
functions are really suspend/resume functions and not
power_off/power_on functions, maybe they should tie to the
suspend/resume functions and not register themselves as
power_off/power_on functions ?

Thanks,
Guenter

> @Heiko Stübner. Hey Heiko, what is your unique perceptions? ;-)
>
>
> BR.
> Frank
>
>
>>
>>> Theoretically, the phy-port in suspended mode make sense when it is at
>>> start
>>> time, then the upper layer controller will invoke phy_power_on (See
>>> phy-core.c), and it further call back *_usb2phy_resume to make phy-port
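The guard Frank proposes amounts to a small invariant: with a `suspended` flag making both paths idempotent, the clock enable count can never drift no matter how many times the callbacks run. A userspace sketch of that invariant, with a plain counter standing in for the clk_prepare_enable()/clk_disable_unprepare() refcount:

```c
#include <stdbool.h>

/* Stand-in for the clock framework's prepare/enable refcount. */
static int clk_enable_count;

struct phy_port {
	bool suspended;   /* starts true, as in the driver's port init */
};

static int port_resume(struct phy_port *p)
{
	if (!p->suspended)      /* already resumed: don't enable twice */
		return 0;
	clk_enable_count++;     /* clk_prepare_enable() */
	p->suspended = false;
	return 0;
}

static int port_suspend(struct phy_port *p)
{
	if (p->suspended)       /* already suspended: don't disable twice */
		return 0;
	p->suspended = true;
	clk_enable_count--;     /* clk_disable_unprepare() */
	return 0;
}
```

Starting with suspended = true, as the proposed init does, ensures the first resume is the call that takes the single clock reference.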


[PATCH 4/5] fs/buffer.c: Remove trailing white space

2016-06-19 Thread Byungchul Park
Trailing white space is not accepted by the kernel coding style. Remove
it.

Signed-off-by: Byungchul Park 
---
 fs/buffer.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index e1632ab..a75ca74 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -439,7 +439,7 @@ EXPORT_SYMBOL(mark_buffer_async_write);
  * try_to_free_buffers() will be operating against the *blockdev* mapping
  * at the time, not against the S_ISREG file which depends on those buffers.
  * So the locking for private_list is via the private_lock in the address_space
- * which backs the buffers.  Which is different from the address_space 
+ * which backs the buffers.  Which is different from the address_space
  * against which the buffers are listed.  So for a particular address_space,
  * mapping->private_lock does *not* protect mapping->private_list!  In fact,
  * mapping->private_list will always be protected by the backing blockdev's
@@ -713,7 +713,7 @@ EXPORT_SYMBOL(__set_page_dirty_buffers);
  * Do this in two main stages: first we copy dirty buffers to a
  * temporary inode list, queueing the writes as we go.  Then we clean
  * up, waiting for those writes to complete.
- * 
+ *
  * During this second stage, any subsequent updates to the file may end
  * up refiling the buffer on the original inode's dirty list again, so
  * there is a chance we will end up with a buffer queued for write but
@@ -791,7 +791,7 @@ static int fsync_buffers_list(spinlock_t *lock, struct list_head *list)
brelse(bh);
spin_lock(lock);
}
-   
+
spin_unlock(lock);
err2 = osync_buffers_list(lock, list);
if (err)
@@ -901,7 +901,7 @@ no_grow:
/*
 * Return failure for non-async IO requests.  Async IO requests
 * are not allowed to fail, so we have to wait until buffer heads
-* become available.  But we don't want tasks sleeping with 
+* become available.  But we don't want tasks sleeping with
 * partially complete buffers, so all were released above.
 */
if (!retry)
@@ -910,7 +910,7 @@ no_grow:
/* We're _really_ low on memory. Now we just
 * wait for old buffer heads to become free due to
 * finishing IO.  Since this is an async request and
-* the reserve list is empty, we're sure there are 
+* the reserve list is empty, we're sure there are
 * async buffer heads in use.
 */
free_more_memory();
@@ -946,7 +946,7 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size)
 
 /*
  * Initialise the state of a blockdev page's buffers.
- */ 
+ */
 static sector_t
 init_page_buffers(struct page *page, struct block_device *bdev,
sector_t block, int size)
@@ -1448,7 +1448,7 @@ static bool has_bh_in_lru(int cpu, void *dummy)
 {
struct bh_lru *b = per_cpu_ptr(&bh_lrus, cpu);
int i;
-   
+
for (i = 0; i < BH_LRU_SIZE; i++) {
if (b->bhs[i])
return 1;
@@ -1952,7 +1952,7 @@ int __block_write_begin(struct page *page, loff_t pos, unsigned len,
if (PageUptodate(page)) {
if (!buffer_uptodate(bh))
set_buffer_uptodate(bh);
-   continue; 
+   continue;
}
if (!buffer_uptodate(bh) && !buffer_delay(bh) &&
!buffer_unwritten(bh) &&
@@ -2258,7 +2258,7 @@ EXPORT_SYMBOL(block_read_full_page);
 
 /* utility function for filesystems that need to do work on expanding
  * truncates.  Uses filesystem pagecache writes to allow the filesystem to
- * deal with the hole.  
+ * deal with the hole.
  */
 int generic_cont_expand_simple(struct inode *inode, loff_t size)
 {
@@ -2819,7 +2819,7 @@ int block_truncate_page(struct address_space *mapping,
 
length = blocksize - length;
iblock = (sector_t)index << (PAGE_CACHE_SHIFT - inode->i_blkbits);
-   
+
page = grab_cache_page(mapping, index);
err = -ENOMEM;
if (!page)
@@ -3069,7 +3069,7 @@ EXPORT_SYMBOL(submit_bh);
  *
  * ll_rw_block sets b_end_io to simple completion handler that marks
  * the buffer up-to-date (if appropriate), unlocks the buffer and wakes
- * any waiters. 
+ * any waiters.
  *
  * All of the buffers must be for the same device, and must also be a
  * multiple of the current approved size for the device.
-- 
1.9.1






[PATCH 3/5] lockdep: Apply bit_spin_lock lockdep to zram

2016-06-19 Thread Byungchul Park
In order to use lockdep-enabled bit_spin_lock, we have to call
bit_spin_init() when an instance containing the bit used as a lock
is created, and bit_spin_free() when that instance is destroyed.

zram is one of the bit_spin_lock users. This patch adds
bit_spin_init() and bit_spin_free() in the proper places so the lock
correctness validator covers the bit_spin_lock that zram uses.

Signed-off-by: Byungchul Park 
---
 drivers/block/zram/zram_drv.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 370c2f7..2bc3bde 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -495,6 +495,11 @@ static void zram_meta_free(struct zram_meta *meta, u64 disksize)
}
 
zs_destroy_pool(meta->mem_pool);
+
+   for (index = 0; index < num_pages; index++) {
+   bit_spin_free(ZRAM_ACCESS, &meta->table[index].value);
+   }
+
vfree(meta->table);
kfree(meta);
 }
@@ -503,6 +508,7 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
 {
size_t num_pages;
struct zram_meta *meta = kmalloc(sizeof(*meta), GFP_KERNEL);
+   int index;
 
if (!meta)
return NULL;
@@ -520,6 +526,10 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
goto out_error;
}
 
+   for (index = 0; index < num_pages; index++) {
+   bit_spin_init(ZRAM_ACCESS, &meta->table[index].value);
+   }
+
return meta;
 
 out_error:
-- 
1.9.1
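The init/free pairing this patch adds brackets the lifetime of the lock bit itself. As a userspace analogue of how bit_spin_lock works (one bit of a wider word acts as the lock, leaving the remaining bits free for other state), here is a sketch using C11 atomics; the lockdep bookkeeping that bit_spin_init()/bit_spin_free() perform in the kernel has no userspace equivalent and is omitted:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace analogue of bit_spin_trylock(): the lock is a single bit of a
 * wider flags word, so the other bits remain available for data. */
static bool bit_try_lock(atomic_uint *word, unsigned int bit)
{
	unsigned int mask = 1u << bit;

	/* fetch_or returns the previous value; the lock is acquired
	 * if and only if the bit was clear before the operation. */
	return (atomic_fetch_or(word, mask) & mask) == 0;
}

/* Analogue of bit_spin_unlock(): atomically clear the lock bit. */
static void bit_unlock(atomic_uint *word, unsigned int bit)
{
	atomic_fetch_and(word, ~(1u << bit));
}
```

In zram the flags word is table[index].value, where the ZRAM_ACCESS bit serves as the lock while the same word also carries other per-entry state.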



[PATCH] staging: unisys: visorbus: Replace semaphore with mutex

2016-06-19 Thread Binoy Jayan
The semaphore 'visordriver_callback_lock' is a simple mutex, so
it should be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/unisys/include/visorbus.h   |  3 ++-
 drivers/staging/unisys/visorbus/visorbus_main.c | 14 +++---
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/unisys/include/visorbus.h b/drivers/staging/unisys/include/visorbus.h
index 9baf1ec..38edca8 100644
--- a/drivers/staging/unisys/include/visorbus.h
+++ b/drivers/staging/unisys/include/visorbus.h
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include <linux/mutex.h>
 
 #include "periodic_work.h"
 #include "channel.h"
@@ -159,7 +160,7 @@ struct visor_device {
struct list_head list_all;
struct periodic_work *periodic_work;
bool being_removed;
-   struct semaphore visordriver_callback_lock;
+   struct mutex visordriver_callback_lock;
bool pausing;
bool resuming;
u32 chipset_bus_no;
diff --git a/drivers/staging/unisys/visorbus/visorbus_main.c b/drivers/staging/unisys/visorbus/visorbus_main.c
index 3a147db..93996a5 100644
--- a/drivers/staging/unisys/visorbus/visorbus_main.c
+++ b/drivers/staging/unisys/visorbus/visorbus_main.c
@@ -544,10 +544,10 @@ dev_periodic_work(void *xdev)
struct visor_device *dev = xdev;
struct visor_driver *drv = to_visor_driver(dev->device.driver);
 
-   down(&dev->visordriver_callback_lock);
+   mutex_lock(&dev->visordriver_callback_lock);
if (drv->channel_interrupt)
drv->channel_interrupt(dev);
-   up(&dev->visordriver_callback_lock);
+   mutex_unlock(&dev->visordriver_callback_lock);
if (!visor_periodic_work_nextperiod(dev->periodic_work))
put_device(&dev->device);
 }
@@ -588,7 +588,7 @@ visordriver_probe_device(struct device *xdev)
if (!drv->probe)
return -ENODEV;
 
-   down(&dev->visordriver_callback_lock);
+   mutex_lock(&dev->visordriver_callback_lock);
dev->being_removed = false;
 
res = drv->probe(dev);
@@ -598,7 +598,7 @@ visordriver_probe_device(struct device *xdev)
fix_vbus_dev_info(dev);
}
 
-   up(&dev->visordriver_callback_lock);
+   mutex_unlock(&dev->visordriver_callback_lock);
return res;
 }
 
@@ -614,11 +614,11 @@ visordriver_remove_device(struct device *xdev)
 
dev = to_visor_device(xdev);
drv = to_visor_driver(xdev->driver);
-   down(&dev->visordriver_callback_lock);
+   mutex_lock(&dev->visordriver_callback_lock);
dev->being_removed = true;
if (drv->remove)
drv->remove(dev);
-   up(&dev->visordriver_callback_lock);
+   mutex_unlock(&dev->visordriver_callback_lock);
dev_stop_periodic_work(dev);

put_device(&dev->device);
@@ -778,7 +778,7 @@ create_visor_device(struct visor_device *dev)
POSTCODE_LINUX_4(DEVICE_CREATE_ENTRY_PC, chipset_dev_no, chipset_bus_no,
 POSTCODE_SEVERITY_INFO);
 
-   sema_init(&dev->visordriver_callback_lock, 1);  /* unlocked */
+   mutex_init(&dev->visordriver_callback_lock);
dev->device.bus = &visorbus_type;
dev->device.groups = visorbus_channel_groups;
device_initialize(&dev->device);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[RFC 06/12] lockdep: Apply crossrelease to completion

2016-06-19 Thread Byungchul Park
wait_for_completion() and its family can cause deadlocks. Nevertheless,
they cannot use the lock correctness validator, because complete() may be
called in a different context from the one calling wait_for_completion(),
which violates lockdep's original assumption.

However, thanks to CONFIG_LOCKDEP_CROSSRELEASE, we can now apply the
lockdep detector to wait_for_completion() and complete(). This patch
applies it.

Signed-off-by: Byungchul Park 
---
 include/linux/completion.h | 121 +
 kernel/locking/lockdep.c   |  18 +++
 kernel/sched/completion.c  |  55 -
 lib/Kconfig.debug  |   8 +++
 4 files changed, 169 insertions(+), 33 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index 5d5aaae..67a27af 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -9,6 +9,9 @@
  */
 
 #include 
+#ifdef CONFIG_LOCKDEP_COMPLETE
+#include <linux/lockdep.h>
+#endif
 
 /*
  * struct completion - structure used to maintain state for a "completion"
@@ -25,10 +28,53 @@
 struct completion {
unsigned int done;
wait_queue_head_t wait;
+#ifdef CONFIG_LOCKDEP_COMPLETE
+   struct lockdep_map map;
+   struct cross_lock xlock;
+#endif
 };
 
+#ifdef CONFIG_LOCKDEP_COMPLETE
+static inline void complete_acquire(struct completion *x)
+{
+   lock_acquire_exclusive(&x->map, 0, 0, NULL, _RET_IP_);
+}
+
+static inline void complete_release(struct completion *x)
+{
+   lock_release(&x->map, 0, _RET_IP_);
+}
+
+static inline void complete_release_commit(struct completion *x)
+{
+   lock_commit_crosslock(&x->map);
+}
+
+#define init_completion(x) \
+do {   \
+   static struct lock_class_key __key; \
+   lockdep_init_map_crosslock(&(x)->map,   \
+   &(x)->xlock,\
+   "(complete)" #x,\
+   &__key, 0); \
+   __init_completion(x);   \
+} while (0)
+#else
+#define init_completion(x) __init_completion(x)
+static inline void complete_acquire(struct completion *x) {}
+static inline void complete_release(struct completion *x) {}
+static inline void complete_release_commit(struct completion *x) {}
+#endif
+
+#ifdef CONFIG_LOCKDEP_COMPLETE
+#define COMPLETION_INITIALIZER(work) \
+   { 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
+   STATIC_CROSS_LOCKDEP_MAP_INIT("(complete)" #work, &(work), \
+   &(work).xlock), STATIC_CROSS_LOCK_INIT()}
+#else
 #define COMPLETION_INITIALIZER(work) \
{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
+#endif
 
 #define COMPLETION_INITIALIZER_ONSTACK(work) \
({ init_completion(&work); work; })
@@ -70,7 +116,7 @@ struct completion {
  * This inline function will initialize a dynamically created completion
  * structure.
  */
-static inline void init_completion(struct completion *x)
+static inline void __init_completion(struct completion *x)
 {
x->done = 0;
init_waitqueue_head(&x->wait);
@@ -88,18 +134,75 @@ static inline void reinit_completion(struct completion *x)
x->done = 0;
 }
 
-extern void wait_for_completion(struct completion *);
-extern void wait_for_completion_io(struct completion *);
-extern int wait_for_completion_interruptible(struct completion *x);
-extern int wait_for_completion_killable(struct completion *x);
-extern unsigned long wait_for_completion_timeout(struct completion *x,
+extern void __wait_for_completion(struct completion *);
+extern void __wait_for_completion_io(struct completion *);
+extern int __wait_for_completion_interruptible(struct completion *x);
+extern int __wait_for_completion_killable(struct completion *x);
+extern unsigned long __wait_for_completion_timeout(struct completion *x,
   unsigned long timeout);
-extern unsigned long wait_for_completion_io_timeout(struct completion *x,
+extern unsigned long __wait_for_completion_io_timeout(struct completion *x,
unsigned long timeout);
-extern long wait_for_completion_interruptible_timeout(
+extern long __wait_for_completion_interruptible_timeout(
struct completion *x, unsigned long timeout);
-extern long wait_for_completion_killable_timeout(
+extern long __wait_for_completion_killable_timeout(
struct completion *x, unsigned long timeout);
+
+static inline void wait_for_completion(struct completion *x)
+{
+   complete_acquire(x);
+   __wait_for_completion(x);
+   complete_release(x);
+}
+
+static inline void wait_for_completion_io(struct completion *x)
+{
+   complete_acquire(x);
+   __wait_for_completion_io(x);
+   complete_release(x);
+}
+
+static inline int wait_for_completion_interruptible(struct completion *x)
+{
+   int ret;
+   complete_acquire(x);
+   


[RFC 10/12] mm/swap_state.c: Remove trailing white space

2016-06-19 Thread Byungchul Park
Trailing whitespace is not accepted in the kernel coding style. Remove
it.

Signed-off-by: Byungchul Park 
---
 mm/swap_state.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 69cb246..3fb7013 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -156,7 +156,7 @@ void __delete_from_swap_cache(struct page *page)
  * @page: page we want to move to swap
  *
  * Allocate swap space for the page and add the page to the
- * swap cache.  Caller needs to hold the page lock. 
+ * swap cache.  Caller needs to hold the page lock.
  */
 int add_to_swap(struct page *page, struct list_head *list)
 {
@@ -229,9 +229,9 @@ void delete_from_swap_cache(struct page *page)
page_cache_release(page);
 }
 
-/* 
- * If we are the only user, then try to free up the swap cache. 
- * 
+/*
+ * If we are the only user, then try to free up the swap cache.
+ *
  * Its ok to check for PageSwapCache without the page lock
  * here because we are going to recheck again inside
  * try_to_free_swap() _with_ the lock.
@@ -245,7 +245,7 @@ static inline void free_swap_cache(struct page *page)
}
 }
 
-/* 
+/*
  * Perform a free_page(), also freeing any swap cache associated with
  * this page if it is the last user of the page.
  */
-- 
1.9.1



[RFC 01/12] lockdep: Refactor lookup_chain_cache()

2016-06-19 Thread Byungchul Park
Currently, lookup_chain_cache() provides both "lookup" and "add"
functionality in a single function. However, each part is useful
individually, and some features, e.g. crossrelease, can use them
separately. Thus, split these functionalities into two functions.

Signed-off-by: Byungchul Park 
---
 kernel/locking/lockdep.c | 125 ++-
 1 file changed, 79 insertions(+), 46 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 716547f..efd001c 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2010,15 +2010,9 @@ struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
return lock_classes + chain_hlocks[chain->base + i];
 }
 
-/*
- * Look up a dependency chain. If the key is not present yet then
- * add it and return 1 - in this case the new dependency chain is
- * validated. If the key is already hashed, return 0.
- * (On return with 1 graph_lock is held.)
- */
-static inline int lookup_chain_cache(struct task_struct *curr,
-struct held_lock *hlock,
-u64 chain_key)
+static inline int add_chain_cache(struct task_struct *curr,
+ struct held_lock *hlock,
+ u64 chain_key)
 {
struct lock_class *class = hlock_class(hlock);
struct hlist_head *hash_head = chainhashentry(chain_key);
@@ -2027,46 +2021,18 @@ static inline int lookup_chain_cache(struct task_struct *curr,
int i, j;
 
/*
+* Allocate a new chain entry from the static array, and add
+* it to the hash:
+*/
+
+   /*
 * We might need to take the graph lock, ensure we've got IRQs
 * disabled to make this an IRQ-safe lock.. for recursion reasons
 * lockdep won't complain about its own locking errors.
 */
if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
return 0;
-   /*
-* We can walk it lock-free, because entries only get added
-* to the hash:
-*/
-   hlist_for_each_entry_rcu(chain, hash_head, entry) {
-   if (chain->chain_key == chain_key) {
-cache_hit:
-   debug_atomic_inc(chain_lookup_hits);
-   if (very_verbose(class))
-   printk("\nhash chain already cached, key: "
-   "%016Lx tail class: [%p] %s\n",
-   (unsigned long long)chain_key,
-   class->key, class->name);
-   return 0;
-   }
-   }
-   if (very_verbose(class))
-   printk("\nnew hash chain, key: %016Lx tail class: [%p] %s\n",
-   (unsigned long long)chain_key, class->key, class->name);
-   /*
-* Allocate a new chain entry from the static array, and add
-* it to the hash:
-*/
-   if (!graph_lock())
-   return 0;
-   /*
-* We have to walk the chain again locked - to avoid duplicates:
-*/
-   hlist_for_each_entry(chain, hash_head, entry) {
-   if (chain->chain_key == chain_key) {
-   graph_unlock();
-   goto cache_hit;
-   }
-   }
+
if (unlikely(nr_lock_chains >= MAX_LOCKDEP_CHAINS)) {
if (!debug_locks_off_graph_unlock())
return 0;
@@ -2102,6 +2068,72 @@ cache_hit:
return 1;
 }
 
+/*
+ * Look up a dependency chain.
+ */
+static inline struct lock_chain *lookup_chain_cache(u64 chain_key)
+{
+   struct hlist_head *hash_head = chainhashentry(chain_key);
+   struct lock_chain *chain;
+
+   /*
+* We can walk it lock-free, because entries only get added
+* to the hash:
+*/
+   hlist_for_each_entry_rcu(chain, hash_head, entry) {
+   if (chain->chain_key == chain_key) {
+   debug_atomic_inc(chain_lookup_hits);
+   return chain;
+   }
+   }
+   return NULL;
+}
+
+/*
+ * If the key is not present yet in dependency chain cache then
+ * add it and return 1 - in this case the new dependency chain is
+ * validated. If the key is already hashed, return 0.
+ * (On return with 1 graph_lock is held.)
+ */
+static inline int lookup_chain_cache_add(struct task_struct *curr,
+struct held_lock *hlock,
+u64 chain_key)
+{
+   struct lock_class *class = hlock_class(hlock);
+   struct lock_chain *chain = lookup_chain_cache(chain_key);
+
+   if (chain) {
+cache_hit:
+   if (very_verbose(class))
+   printk("\nhash chain already cached, key: "
+   "%016Lx tail class: [%p] %s\n",
+  

[RFC 11/12] lockdep: Call lock_acquire(release) when accessing PG_locked manually

2016-06-19 Thread Byungchul Park
The PG_locked bit can be updated through SetPageLocked() or
ClearPageLocked(), not only by lock_page() and unlock_page().
SetPageLocked() and ClearPageLocked() therefore also have to be taken
into account to keep acquiring and releasing of the PG_locked lock
balanced.

Signed-off-by: Byungchul Park 
---
 fs/cifs/file.c  | 4 
 include/linux/pagemap.h | 5 -
 mm/filemap.c| 6 --
 mm/ksm.c| 1 +
 mm/migrate.c| 1 +
 mm/shmem.c  | 2 ++
 mm/swap_state.c | 2 ++
 mm/vmscan.c | 1 +
 8 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index bcf9ead..7b250c1 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -3392,12 +3392,14 @@ readpages_get_pages(struct address_space *mapping, struct list_head *page_list,
 * PG_locked without checking it first.
 */
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
rc = add_to_page_cache_locked(page, mapping,
  page->index, gfp);
 
/* give up if we can't stick it in the cache */
if (rc) {
__ClearPageLocked(page);
+   lock_page_release(page);
return rc;
}
 
@@ -3419,8 +3421,10 @@ readpages_get_pages(struct address_space *mapping, struct list_head *page_list,
break;
 
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
if (add_to_page_cache_locked(page, mapping, page->index, gfp)) {
__ClearPageLocked(page);
+   lock_page_release(page);
break;
}
list_move_tail(&page->lru, tmplist);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2fc4af1..f92972c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -760,9 +760,12 @@ static inline int add_to_page_cache(struct page *page,
int error;
 
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
error = add_to_page_cache_locked(page, mapping, offset, gfp_mask);
-   if (unlikely(error))
+   if (unlikely(error)) {
__ClearPageLocked(page);
+   lock_page_release(page);
+   }
return error;
 }
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 47fc5c0..7acce5e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -690,11 +690,13 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
int ret;
 
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
ret = __add_to_page_cache_locked(page, mapping, offset,
 gfp_mask, &shadow);
-   if (unlikely(ret))
+   if (unlikely(ret)) {
__ClearPageLocked(page);
-   else {
+   lock_page_release(page);
+   } else {
/*
 * The page might have been evicted from cache only
 * recently, in which case it should be activated like
diff --git a/mm/ksm.c b/mm/ksm.c
index ca6d2a0..c89debd 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1869,6 +1869,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
SetPageDirty(new_page);
__SetPageUptodate(new_page);
__SetPageLocked(new_page);
+   lock_page_acquire(new_page, 1);
}
 
return new_page;
diff --git a/mm/migrate.c b/mm/migrate.c
index 3ad0fea..9aab7c4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1773,6 +1773,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 
/* Prepare a page as a migration target */
__SetPageLocked(new_page);
+   lock_page_acquire(new_page, 1);
SetPageSwapBacked(new_page);
 
/* anon mapping, we can simply copy page->mapping to the new page: */
diff --git a/mm/shmem.c b/mm/shmem.c
index 440e2a7..da35ca8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1090,6 +1090,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
flush_dcache_page(newpage);
 
__SetPageLocked(newpage);
+   lock_page_acquire(newpage, 1);
SetPageUptodate(newpage);
SetPageSwapBacked(newpage);
set_page_private(newpage, swap_index);
@@ -1283,6 +1284,7 @@ repeat:
 
__SetPageSwapBacked(page);
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
if (sgp == SGP_WRITE)
__SetPageReferenced(page);
 
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3fb7013..200edbf 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -358,6 +358,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 
/* May fail (-ENOMEM) if radix-tree node allocation failed. */
__SetPageLocked(new_page);
+   lock_page_acquire(new_page, 1);
SetPageSwapBacked(new_page);
 

[RFC 05/12] lockdep: Implement crossrelease feature

2016-06-19 Thread Byungchul Park
The crossrelease feature calls a lock that can be released by a
context different from the one that acquired it a "crosslock". For a
crosslock, all locks held in the context that unlocks the crosslock,
from the time the crosslock was acquired until it is eventually
unlocked, have a dependency on the crosslock. That is the key idea
behind implementing the crossrelease feature.

Crossrelease feature introduces 2 new data structures.

1. pend_lock (== plock)

This is for keeping locks that are waiting to be committed, so
that an actual dependency chain is built when committing a
crosslock.

Every task_struct has an array of these pending locks. Pending
locks are added whenever lock_acquire() is called for a normal
(non-crosslock) lock and are flushed (committed) at the proper
time.

2. cross_lock (== xlock)

This keeps additional data needed only for a crosslock. There
is one cross_lock per lockdep_map used as a crosslock.
lockdep_init_map_crosslock() should be used instead of
lockdep_init_map() to use a lock as a crosslock.

Acquiring and releasing sequence for crossrelease feature:

1. Acquire

All validation checks are performed for all locks.

1) For non-crosslock (normal lock)

The hlock will be added not only to held_locks
of the current task_struct, but also to the
pend_lock array of the task_struct, so that
a dependency chain can be built with the lock
when doing commit.

2) For crosslock

The hlock is added only to the cross_lock of the
lock's lockdep_map instead of held_locks, so that
a dependency chain can be built with the lock when
a commit is done. The lock is also added to the
xlocks_head list.

2. Commit (only for crosslock)

This establishes dependencies between the crosslock being
unlocked and all locks held in the unlocking context since
the crosslock was acquired, while trying to avoid building
chains unnecessarily as far as possible.

3. Release

1) For non-crosslock (normal lock)

No change.

2) For crosslock

Just remove the lock from the xlocks_head list. For a
crosslock, the release operation should be used
together with the commit operation in order to build
a dependency chain properly.

Signed-off-by: Byungchul Park 
---
 include/linux/irqflags.h |  16 +-
 include/linux/lockdep.h  | 139 +++
 include/linux/sched.h|   5 +
 kernel/fork.c|   4 +
 kernel/locking/lockdep.c | 626 +--
 lib/Kconfig.debug|  13 +
 6 files changed, 785 insertions(+), 18 deletions(-)

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 5dd1272..83eebe1 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -23,10 +23,18 @@
 # define trace_softirq_context(p)  ((p)->softirq_context)
 # define trace_hardirqs_enabled(p) ((p)->hardirqs_enabled)
 # define trace_softirqs_enabled(p) ((p)->softirqs_enabled)
-# define trace_hardirq_enter() do { current->hardirq_context++; } while (0)
-# define trace_hardirq_exit()  do { current->hardirq_context--; } while (0)
-# define lockdep_softirq_enter()   do { current->softirq_context++; } while (0)
-# define lockdep_softirq_exit()    do { current->softirq_context--; } while (0)
+# define trace_hardirq_enter() \
+do {   \
+   current->hardirq_context++; \
+   crossrelease_hardirq_start();   \
+} while (0)
+# define trace_hardirq_exit()  do { current->hardirq_context--; } while (0)
+# define lockdep_softirq_enter()   \
+do {   \
+   current->softirq_context++; \
+   crossrelease_softirq_start();   \
+} while (0)
+# define lockdep_softirq_exit()    do { current->softirq_context--; } while (0)
 # define INIT_TRACE_IRQFLAGS   .softirqs_enabled = 1,
 #else
 # define trace_hardirqs_on()   do { } while (0)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 4dca42f..1bf513e 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -108,6 +108,19 @@ struct lock_class {
unsigned long   contention_point[LOCKSTAT_POINTS];
unsigned long   contending_point[LOCKSTAT_POINTS];
 #endif
+#ifdef CONFIG_LOCKDEP_CROSSRELEASE
+   /*
+    * A flag to check if this lock class is releasable in
+    * a different context from the context acquiring it.
+*/
+   int crosslock;
+
+   /*
+    * When building a dependency chain, this helps us skip
+    * any classes that have already established the chain.
+*/
+   

[RFC 04/12] lockdep: Make save_trace can copy from other stack_trace

2016-06-19 Thread Byungchul Park
Currently, save_trace() can only save the current context's stack
trace. However, it would be useful if it could also copy another
context's stack trace. In particular, this can be used by the
crossrelease feature.

Signed-off-by: Byungchul Park 
---
 kernel/locking/lockdep.c | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c596bef..b03014b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -389,7 +389,7 @@ static void print_lockdep_off(const char *bug_msg)
 #endif
 }
 
-static int save_trace(struct stack_trace *trace)
+static int save_trace(struct stack_trace *trace, struct stack_trace *copy)
 {
trace->nr_entries = 0;
trace->max_entries = MAX_STACK_TRACE_ENTRIES - nr_stack_trace_entries;
@@ -397,7 +397,13 @@ static int save_trace(struct stack_trace *trace)
 
trace->skip = 3;
 
-   save_stack_trace(trace);
+   if (copy) {
+   trace->nr_entries = min(copy->nr_entries, trace->max_entries);
+   trace->skip = copy->skip;
+   memcpy(trace->entries, copy->entries,
+   trace->nr_entries * sizeof(unsigned long));
+   } else
+   save_stack_trace(trace);
 
/*
 * Some daft arches put -1 at the end to indicate its a full trace.
@@ -1201,7 +1207,7 @@ static noinline int print_circular_bug(struct lock_list *this,
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
return 0;
 
-   if (!save_trace(&this->trace))
+   if (!save_trace(&this->trace, NULL))
return 0;
 
depth = get_lock_depth(target);
@@ -1547,13 +1553,13 @@ print_bad_irq_dependency(struct task_struct *curr,
 
printk("\nthe dependencies between %s-irq-safe lock", irqclass);
printk(" and the holding lock:\n");
-   if (!save_trace(&prev_root->trace))
+   if (!save_trace(&prev_root->trace, NULL))
return 0;
print_shortest_lock_dependencies(backwards_entry, prev_root);
 
printk("\nthe dependencies between the lock to be acquired");
printk(" and %s-irq-unsafe lock:\n", irqclass);
-   if (!save_trace(&next_root->trace))
+   if (!save_trace(&next_root->trace, NULL))
return 0;
print_shortest_lock_dependencies(forwards_entry, next_root);
 
@@ -1885,7 +1891,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
}
 
if (!own_trace && stack_saved && !*stack_saved) {
-   if (!save_trace(&trace))
+   if (!save_trace(&trace, NULL))
return 0;
*stack_saved = 1;
}
@@ -2436,7 +2442,7 @@ print_irq_inversion_bug(struct task_struct *curr,
lockdep_print_held_locks(curr);
 
printk("\nthe shortest dependencies between 2nd lock and 1st lock:\n");
-   if (!save_trace(&root->trace))
+   if (!save_trace(&root->trace, NULL))
return 0;
print_shortest_lock_dependencies(other, root);
 
@@ -3015,7 +3021,7 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 
hlock_class(this)->usage_mask |= new_mask;
 
-   if (!save_trace(hlock_class(this)->usage_traces + new_bit))
+   if (!save_trace(hlock_class(this)->usage_traces + new_bit, NULL))
return 0;
 
switch (new_bit) {
-- 
1.9.1



[RFC 11/12] lockdep: Call lock_acquire(release) when accessing PG_locked manually

2016-06-19 Thread Byungchul Park
The PG_locked bit can also be updated through SetPageLocked() or
ClearPageLocked(), not only by lock_page() and unlock_page().
SetPageLocked() and ClearPageLocked() therefore have to be taken into
account to keep acquiring and releasing of the PG_locked lock balanced.

Signed-off-by: Byungchul Park 
---
 fs/cifs/file.c  | 4 
 include/linux/pagemap.h | 5 -
 mm/filemap.c| 6 --
 mm/ksm.c| 1 +
 mm/migrate.c| 1 +
 mm/shmem.c  | 2 ++
 mm/swap_state.c | 2 ++
 mm/vmscan.c | 1 +
 8 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index bcf9ead..7b250c1 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -3392,12 +3392,14 @@ readpages_get_pages(struct address_space *mapping, struct list_head *page_list,
 * PG_locked without checking it first.
 */
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
rc = add_to_page_cache_locked(page, mapping,
  page->index, gfp);
 
/* give up if we can't stick it in the cache */
if (rc) {
__ClearPageLocked(page);
+   lock_page_release(page);
return rc;
}
 
@@ -3419,8 +3421,10 @@ readpages_get_pages(struct address_space *mapping, struct list_head *page_list,
break;
 
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
if (add_to_page_cache_locked(page, mapping, page->index, gfp)) {
__ClearPageLocked(page);
+   lock_page_release(page);
break;
}
list_move_tail(&page->lru, tmplist);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2fc4af1..f92972c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -760,9 +760,12 @@ static inline int add_to_page_cache(struct page *page,
int error;
 
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
error = add_to_page_cache_locked(page, mapping, offset, gfp_mask);
-   if (unlikely(error))
+   if (unlikely(error)) {
__ClearPageLocked(page);
+   lock_page_release(page);
+   }
return error;
 }
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 47fc5c0..7acce5e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -690,11 +690,13 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
int ret;
 
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
ret = __add_to_page_cache_locked(page, mapping, offset,
 gfp_mask, &shadow);
-   if (unlikely(ret))
+   if (unlikely(ret)) {
__ClearPageLocked(page);
-   else {
+   lock_page_release(page);
+   } else {
/*
 * The page might have been evicted from cache only
 * recently, in which case it should be activated like
diff --git a/mm/ksm.c b/mm/ksm.c
index ca6d2a0..c89debd 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1869,6 +1869,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
SetPageDirty(new_page);
__SetPageUptodate(new_page);
__SetPageLocked(new_page);
+   lock_page_acquire(new_page, 1);
}
 
return new_page;
diff --git a/mm/migrate.c b/mm/migrate.c
index 3ad0fea..9aab7c4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1773,6 +1773,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 
/* Prepare a page as a migration target */
__SetPageLocked(new_page);
+   lock_page_acquire(new_page, 1);
SetPageSwapBacked(new_page);
 
/* anon mapping, we can simply copy page->mapping to the new page: */
diff --git a/mm/shmem.c b/mm/shmem.c
index 440e2a7..da35ca8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1090,6 +1090,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
flush_dcache_page(newpage);
 
__SetPageLocked(newpage);
+   lock_page_acquire(newpage, 1);
SetPageUptodate(newpage);
SetPageSwapBacked(newpage);
set_page_private(newpage, swap_index);
@@ -1283,6 +1284,7 @@ repeat:
 
__SetPageSwapBacked(page);
__SetPageLocked(page);
+   lock_page_acquire(page, 1);
if (sgp == SGP_WRITE)
__SetPageReferenced(page);
 
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3fb7013..200edbf 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -358,6 +358,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 
/* May fail (-ENOMEM) if radix-tree node allocation failed. */
__SetPageLocked(new_page);
+   lock_page_acquire(new_page, 1);
SetPageSwapBacked(new_page);
err = 

[RFC 00/12] lockdep: Implement crossrelease feature

2016-06-19 Thread Byungchul Park
The crossrelease feature calls a lock that can be released by a
different context from the one that acquired it a "crosslock".
For a crosslock, all locks held in the context that eventually
unlocks the crosslock, from the point the crosslock was acquired
until it is released, have a dependency on the crosslock. That is
the key idea behind the crossrelease feature.

The crossrelease feature introduces two new data structures.

1. pend_lock (== plock)

This keeps locks that are waiting to be committed, so
that an actual dependency chain can be built when a
crosslock is committed.

Every task_struct has an array of these pending locks.
A pending lock is added whenever lock_acquire() is
called for a normal (non-crosslock) lock, and the array
is flushed (committed) at the proper time.

2. cross_lock (== xlock)

This keeps additional data used only for crosslocks.
There is one cross_lock per lockdep_map used as a
crosslock. lockdep_init_map_crosslock() should be used
instead of lockdep_init_map() to use a lock as a
crosslock.

Acquiring and releasing sequence for crossrelease feature:

1. Acquire

All validation checks are performed for all locks.

1) For non-crosslock (normal lock)

The hlock is added not only to the held_locks of
the current task_struct, but also to its pend_lock
array, so that a dependency chain can be built with
the lock when a commit is done.

2) For crosslock

The hlock is added only to the cross_lock of the
lock's lockdep_map instead of held_locks, so that
a dependency chain can be built with the lock when
a commit is done. The lock is also added to the
xlocks_head list.

2. Commit (only for crosslock)

This establishes dependencies between the crosslock being
unlocked and all locks held in the unlocking context since
the crosslock was acquired, while trying to avoid building
chains unnecessarily as far as possible.

3. Release

1) For non-crosslock (normal lock)

No change.

2) For crosslock

Just remove the lock from the xlocks_head list. For a
crosslock, the release operation should be used
together with the commit operation in order to build
a dependency chain properly.

Byungchul Park (12):
  lockdep: Refactor lookup_chain_cache()
  lockdep: Add a function building a chain between two hlocks
  lockdep: Make check_prev_add can use a stack_trace of other context
  lockdep: Make save_trace can copy from other stack_trace
  lockdep: Implement crossrelease feature
  lockdep: Apply crossrelease to completion
  pagemap.h: Remove trailing white space
  lockdep: Apply crossrelease to PG_locked lock
  cifs/file.c: Remove trailing white space
  mm/swap_state.c: Remove trailing white space
  lockdep: Call lock_acquire(release) when accessing PG_locked manually
  x86/dumpstack: Optimize save_stack_trace

 arch/x86/include/asm/stacktrace.h |   1 +
 arch/x86/kernel/dumpstack.c   |   2 +
 arch/x86/kernel/dumpstack_32.c|   2 +
 arch/x86/kernel/stacktrace.c  |   7 +
 fs/cifs/file.c|   6 +-
 include/linux/completion.h| 121 +-
 include/linux/irqflags.h  |  16 +-
 include/linux/lockdep.h   | 139 +++
 include/linux/mm_types.h  |   9 +
 include/linux/pagemap.h   | 104 -
 include/linux/sched.h |   5 +
 kernel/fork.c |   4 +
 kernel/locking/lockdep.c  | 846 +++---
 kernel/sched/completion.c |  55 +--
 lib/Kconfig.debug |  30 ++
 mm/filemap.c  |  10 +-
 mm/ksm.c  |   1 +
 mm/migrate.c  |   1 +
 mm/page_alloc.c   |   3 +
 mm/shmem.c|   2 +
 mm/swap_state.c   |  12 +-
 mm/vmscan.c   |   1 +
 22 files changed, 1255 insertions(+), 122 deletions(-)

-- 
1.9.1



[RFC 08/12] lockdep: Apply crossrelease to PG_locked lock

2016-06-19 Thread Byungchul Park
lock_page() and its family can cause deadlock. Nevertheless, they could
not use the lock correctness validator because unlock_page() can be
called in a different context from the one calling lock_page(), which
violates lockdep's original assumption.

However, thanks to CONFIG_LOCKDEP_CROSSRELEASE, we can now apply the
lockdep detector to lock_page() using PG_locked. This patch does so.

Signed-off-by: Byungchul Park 
---
 include/linux/mm_types.h |  9 +
 include/linux/pagemap.h  | 95 +---
 lib/Kconfig.debug|  9 +
 mm/filemap.c |  4 +-
 mm/page_alloc.c  |  3 ++
 5 files changed, 112 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 624b78b..ab33ee3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -15,6 +15,10 @@
 #include 
 #include 
 
+#ifdef CONFIG_LOCKDEP_PAGELOCK
+#include 
+#endif
+
 #ifndef AT_VECTOR_SIZE_ARCH
 #define AT_VECTOR_SIZE_ARCH 0
 #endif
@@ -215,6 +219,11 @@ struct page {
 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
int _last_cpupid;
 #endif
+
+#ifdef CONFIG_LOCKDEP_PAGELOCK
+   struct lockdep_map map;
+   struct cross_lock xlock;
+#endif
 }
 /*
  * The struct page can be forced to be double word aligned so that atomic ops
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c0049d9..2fc4af1 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -14,6 +14,9 @@
 #include 
 #include  /* for in_interrupt() */
 #include 
+#ifdef CONFIG_LOCKDEP_PAGELOCK
+#include 
+#endif
 
 /*
  * Bits in mapping->flags.  The lower __GFP_BITS_SHIFT bits are the page
@@ -441,26 +444,81 @@ static inline pgoff_t linear_page_index(struct vm_area_struct *vma,
return pgoff >> (PAGE_CACHE_SHIFT - PAGE_SHIFT);
 }
 
+#ifdef CONFIG_LOCKDEP_PAGELOCK
+#define lock_page_init(p)  \
+do {   \
+   static struct lock_class_key __key; \
+   lockdep_init_map_crosslock(&(p)->map, &(p)->xlock,  \
+   "(PG_locked)" #p, &__key, 0);   \
+} while (0)
+
+static inline void lock_page_acquire(struct page *page, int try)
+{
+   page = compound_head(page);
+   lock_acquire_exclusive(&page->map, 0, try, NULL, _RET_IP_);
+}
+
+static inline void lock_page_release(struct page *page)
+{
+   page = compound_head(page);
+   /*
+    * For a cross-releasable lock, lock_commit_crosslock()
+    * must be called when the lock is being released, before
+    * calling lock_release().
+*/
+   lock_commit_crosslock(&page->map);
+   lock_release(&page->map, 0, _RET_IP_);
+}
+#else
+static inline void lock_page_init(struct page *page) {}
+static inline void lock_page_free(struct page *page) {}
+static inline void lock_page_acquire(struct page *page, int try) {}
+static inline void lock_page_release(struct page *page) {}
+#endif
+
 extern void __lock_page(struct page *page);
 extern int __lock_page_killable(struct page *page);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
unsigned int flags);
-extern void unlock_page(struct page *page);
+extern void do_raw_unlock_page(struct page *page);
 
-static inline int trylock_page(struct page *page)
+static inline void unlock_page(struct page *page)
+{
+   lock_page_release(page);
+   do_raw_unlock_page(page);
+}
+
+static inline int do_raw_trylock_page(struct page *page)
 {
page = compound_head(page);
return (likely(!test_and_set_bit_lock(PG_locked, &page->flags)));
 }
 
+static inline int trylock_page(struct page *page)
+{
+   if (do_raw_trylock_page(page)) {
+   lock_page_acquire(page, 1);
+   return 1;
+   }
+   return 0;
+}
+
 /*
  * lock_page may only be called if we have the page's inode pinned.
  */
 static inline void lock_page(struct page *page)
 {
might_sleep();
-   if (!trylock_page(page))
+
+   if (!do_raw_trylock_page(page))
__lock_page(page);
+   /*
+    * The acquire function must come after the actual lock
+    * operation for a crossrelease lock, because the lock
+    * instance is looked up by the release operation from any
+    * context, and two or more acquired instances would
+    * confuse that lookup.
+*/
+   lock_page_acquire(page, 0);
 }
 
 /*
@@ -470,9 +528,22 @@ static inline void lock_page(struct page *page)
  */
 static inline int lock_page_killable(struct page *page)
 {
+   int ret;
+
might_sleep();
-   if (!trylock_page(page))
-   return __lock_page_killable(page);
+
+   if (!do_raw_trylock_page(page)) {
+   ret = __lock_page_killable(page);
+   if (ret)
+   return ret;
+   }
+   /*
+* The acquire function must be after actual lock operation
+* for crossrelease 

[RFC 03/12] lockdep: Make check_prev_add can use a stack_trace of other context

2016-06-19 Thread Byungchul Park
Currently, check_prev_add() can only save its current context's stack
trace. But it would be useful if a separately taken stack_trace could
be used in check_prev_add(). The crossrelease feature can then use
check_prev_add() with another context's stack_trace.

Signed-off-by: Byungchul Park 
---
 kernel/locking/lockdep.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 4d51208..c596bef 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1822,7 +1822,8 @@ check_deadlock(struct task_struct *curr, struct held_lock *next,
  */
 static int
 check_prev_add(struct task_struct *curr, struct held_lock *prev,
-  struct held_lock *next, int distance, int *stack_saved)
+  struct held_lock *next, int distance, int *stack_saved,
+  struct stack_trace *own_trace)
 {
struct lock_list *entry;
int ret;
@@ -1883,7 +1884,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
}
}
 
-   if (!*stack_saved) {
+   if (!own_trace && stack_saved && !*stack_saved) {
if (!save_trace(&trace))
return 0;
*stack_saved = 1;
@@ -1895,14 +1896,14 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 */
ret = add_lock_to_list(hlock_class(prev), hlock_class(next),
    &hlock_class(prev)->locks_after,
-  next->acquire_ip, distance, &trace);
+  next->acquire_ip, distance, own_trace ?: &trace);
 
if (!ret)
return 0;
 
ret = add_lock_to_list(hlock_class(next), hlock_class(prev),
    &hlock_class(next)->locks_before,
-  next->acquire_ip, distance, &trace);
+  next->acquire_ip, distance, own_trace ?: &trace);
if (!ret)
return 0;
 
@@ -1911,7 +1912,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 */
if (verbose(hlock_class(prev)) || verbose(hlock_class(next))) {
/* We drop graph lock, so another thread can overwrite trace. */
-   *stack_saved = 0;
+   if (stack_saved)
+   *stack_saved = 0;
graph_unlock();
printk("\n new dependency: ");
print_lock_name(hlock_class(prev));
@@ -1960,8 +1962,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 * added:
 */
if (hlock->read != 2 && hlock->check) {
-   if (!check_prev_add(curr, hlock, next,
-   distance, _saved))
+   if (!check_prev_add(curr, hlock, next, distance,
+   _saved, NULL))
return 0;
/*
 * Stop after the first non-trylock entry,
-- 
1.9.1



[RFC 09/12] cifs/file.c: Remove trailing white space

2016-06-19 Thread Byungchul Park
Trailing whitespace is not accepted by the kernel coding style. Remove
it.

Signed-off-by: Byungchul Park 
---
 fs/cifs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index ff882ae..bcf9ead 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -3851,7 +3851,7 @@ void cifs_oplock_break(struct work_struct *work)
 * In the non-cached mode (mount with cache=none), we shunt off direct read and write requests
  * so this method should never be called.
  *
- * Direct IO is not yet supported in the cached mode. 
+ * Direct IO is not yet supported in the cached mode.
  */
 static ssize_t
 cifs_direct_io(struct kiocb *iocb, struct iov_iter *iter, loff_t pos)
-- 
1.9.1



[PATCH 5/5] lockdep: Apply bit_spin_lock lockdep to BH_Uptodate_Lock

2016-06-19 Thread Byungchul Park
In order to use a lockdep-enabled bit_spin_lock, we have to call
bit_spin_init() when an instance containing the bit used as a lock
is created, and bit_spin_free() when that instance is destroyed.

The BH_Uptodate_Lock bit of a buffer head's b_state is one of the
bit_spin_lock users. This patch adds bit_spin_init() and
bit_spin_free() calls in the proper places to apply the lock
correctness validator.

Signed-off-by: Byungchul Park 
---
 fs/buffer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/buffer.c b/fs/buffer.c
index a75ca74..65c0b8a 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3317,6 +3317,7 @@ struct buffer_head *alloc_buffer_head(gfp_t gfp_flags)
if (ret) {
INIT_LIST_HEAD(&ret->b_assoc_buffers);
preempt_disable();
+   bit_spin_init(BH_Uptodate_Lock, &ret->b_state);
__this_cpu_inc(bh_accounting.nr);
recalc_bh_state();
preempt_enable();
@@ -3328,6 +3329,7 @@ EXPORT_SYMBOL(alloc_buffer_head);
 void free_buffer_head(struct buffer_head *bh)
 {
BUG_ON(!list_empty(&bh->b_assoc_buffers));
+   bit_spin_free(BH_Uptodate_Lock, &bh->b_state);
kmem_cache_free(bh_cachep, bh);
preempt_disable();
__this_cpu_dec(bh_accounting.nr);
-- 
1.9.1



[RFC 07/12] pagemap.h: Remove trailing white space

2016-06-19 Thread Byungchul Park
Trailing whitespace is not accepted by the kernel coding style. Remove
it.

Signed-off-by: Byungchul Park 
---
 include/linux/pagemap.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 92395a0..c0049d9 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -513,7 +513,7 @@ static inline void wake_up_page(struct page *page, int bit)
__wake_up_bit(page_waitqueue(page), &page->flags, bit);
 }
 
-/* 
+/*
  * Wait for a page to be unlocked.
  *
  * This must be called with the caller "holding" the page,
@@ -526,7 +526,7 @@ static inline void wait_on_page_locked(struct page *page)
wait_on_page_bit(compound_head(page), PG_locked);
 }
 
-/* 
+/*
  * Wait for a page to complete writeback
  */
 static inline void wait_on_page_writeback(struct page *page)
-- 
1.9.1



[RFC 09/12] cifs/file.c: Remove trailing white space

2016-06-19 Thread Byungchul Park
Trailing white space is not accepted in kernel coding style. Remove
them.

Signed-off-by: Byungchul Park 
---
 fs/cifs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index ff882ae..bcf9ead 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -3851,7 +3851,7 @@ void cifs_oplock_break(struct work_struct *work)
  * In the non-cached mode (mount with cache=none), we shunt off direct read 
and write requests
  * so this method should never be called.
  *
- * Direct IO is not yet supported in the cached mode. 
+ * Direct IO is not yet supported in the cached mode.
  */
 static ssize_t
 cifs_direct_io(struct kiocb *iocb, struct iov_iter *iter, loff_t pos)
-- 
1.9.1



[PATCH 5/5] lockdep: Apply bit_spin_lock lockdep to BH_Uptodate_Lock

2016-06-19 Thread Byungchul Park
In order to use the lockdep-enabled bit_spin_lock, we have to call
bit_spin_init() when an instance containing the bit used as a lock is
created, and bit_spin_free() when that instance is destroyed.

The BH_Uptodate_Lock bit of a buffer head's b_state is one of the
bit_spin_lock users. This patch adds bit_spin_init() and
bit_spin_free() in the proper places so that the lock correctness
validator applies to it.

Signed-off-by: Byungchul Park 
---
 fs/buffer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/buffer.c b/fs/buffer.c
index a75ca74..65c0b8a 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3317,6 +3317,7 @@ struct buffer_head *alloc_buffer_head(gfp_t gfp_flags)
if (ret) {
INIT_LIST_HEAD(&ret->b_assoc_buffers);
preempt_disable();
+   bit_spin_init(BH_Uptodate_Lock, &ret->b_state);
__this_cpu_inc(bh_accounting.nr);
recalc_bh_state();
preempt_enable();
@@ -3328,6 +3329,7 @@ EXPORT_SYMBOL(alloc_buffer_head);
 void free_buffer_head(struct buffer_head *bh)
 {
BUG_ON(!list_empty(&bh->b_assoc_buffers));
+   bit_spin_free(BH_Uptodate_Lock, &bh->b_state);
kmem_cache_free(bh_cachep, bh);
preempt_disable();
__this_cpu_dec(bh_accounting.nr);
-- 
1.9.1
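The lock the patch above instruments can be modeled in userspace. The following is a minimal, hypothetical sketch of a bit spinlock built on GCC/Clang `__atomic` builtins; it stands in for the kernel's test_and_set_bit_lock()-based implementation and deliberately omits preemption control and the lockdep hooks.

```c
/* Userspace sketch of a bit spinlock: one bit of a word acts as the
 * lock. Uses compiler atomic builtins, not the kernel's bitops; the
 * lockdep instrumentation added by the patch is not modeled here. */
#include <assert.h>
#include <stdbool.h>

static inline bool bit_trylock(unsigned long *addr, int bitnum)
{
	unsigned long mask = 1UL << bitnum;

	/* fetch_or returns the old value; the lock is taken iff the
	 * bit was previously clear */
	return !(__atomic_fetch_or(addr, mask, __ATOMIC_ACQUIRE) & mask);
}

static inline void bit_lock(unsigned long *addr, int bitnum)
{
	while (!bit_trylock(addr, bitnum))
		;	/* spin; a real implementation would relax the CPU */
}

static inline void bit_unlock(unsigned long *addr, int bitnum)
{
	__atomic_fetch_and(addr, ~(1UL << bitnum), __ATOMIC_RELEASE);
}
```

The acquire/release memory orders loosely mirror what test_and_set_bit_lock() and clear_bit_unlock() provide in the kernel.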



[RFC 00/12] lockdep: Implement crossrelease feature

2016-06-19 Thread Byungchul Park
The crossrelease feature calls a lock that can be released by a
context different from the one that acquired it a "crosslock". For a
crosslock, all locks held in the context that eventually unlocks the
crosslock, from the time the crosslock was acquired until it is
unlocked, have a dependency on the crosslock. That is the key idea
behind implementing the crossrelease feature.

Crossrelease feature introduces 2 new data structures.

1. pend_lock (== plock)

This is for keeping locks waiting to be committed so
that an actual dependency chain is built when committing
a crosslock.

Every task_struct has an array of these pending locks.
A pending lock is added whenever lock_acquire() is
called for a normal (non-crosslock) lock and is
flushed (committed) at the proper time.

2. cross_lock (== xlock)

This keeps additional data needed only for a crosslock.
There is one cross_lock per lockdep_map used as a
crosslock. lockdep_init_map_crosslock() should be used
instead of lockdep_init_map() to use a lock as a crosslock.

Acquiring and releasing sequence for crossrelease feature:

1. Acquire

All validation checks are performed for all locks.

1) For non-crosslock (normal lock)

The hlock will be added not only to the held_locks
of the current task_struct, but also to the
pend_lock array of the task_struct, so that
a dependency chain can be built with the lock
when doing commit.

2) For crosslock

The hlock will be added only to the cross_lock
of the lock's lockdep_map instead of held_locks,
so that a dependency chain can be built with
the lock when doing commit. The lock is also
added to the xlocks_head list.

2. Commit (only for crosslock)

This establishes a dependency chain between the lock
being unlocked now and all locks that have been held in
the unlocking context since the lock was acquired, while
avoiding building unnecessary chains as far as possible.

3. Release

1) For non-crosslock (normal lock)

No change.

2) For crosslock

Just remove the lock from the xlocks_head list. The
release operation should be used together with the
commit operation for a crosslock, in order to build a
dependency chain properly.

Byungchul Park (12):
  lockdep: Refactor lookup_chain_cache()
  lockdep: Add a function building a chain between two hlocks
  lockdep: Make check_prev_add can use a stack_trace of other context
  lockdep: Make save_trace can copy from other stack_trace
  lockdep: Implement crossrelease feature
  lockdep: Apply crossrelease to completion
  pagemap.h: Remove trailing white space
  lockdep: Apply crossrelease to PG_locked lock
  cifs/file.c: Remove trailing white space
  mm/swap_state.c: Remove trailing white space
  lockdep: Call lock_acquire(release) when accessing PG_locked manually
  x86/dumpstack: Optimize save_stack_trace

 arch/x86/include/asm/stacktrace.h |   1 +
 arch/x86/kernel/dumpstack.c   |   2 +
 arch/x86/kernel/dumpstack_32.c|   2 +
 arch/x86/kernel/stacktrace.c  |   7 +
 fs/cifs/file.c|   6 +-
 include/linux/completion.h| 121 +-
 include/linux/irqflags.h  |  16 +-
 include/linux/lockdep.h   | 139 +++
 include/linux/mm_types.h  |   9 +
 include/linux/pagemap.h   | 104 -
 include/linux/sched.h |   5 +
 kernel/fork.c |   4 +
 kernel/locking/lockdep.c  | 846 +++---
 kernel/sched/completion.c |  55 +--
 lib/Kconfig.debug |  30 ++
 mm/filemap.c  |  10 +-
 mm/ksm.c  |   1 +
 mm/migrate.c  |   1 +
 mm/page_alloc.c   |   3 +
 mm/shmem.c|   2 +
 mm/swap_state.c   |  12 +-
 mm/vmscan.c   |   1 +
 22 files changed, 1255 insertions(+), 122 deletions(-)

-- 
1.9.1
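As a rough illustration of the acquire/commit sequence in the cover letter above, here is a hypothetical userspace model: normal acquisitions are queued per task, and committing a crosslock flushes the queue into dependency edges. All names here (plock_queue, xlock_commit, task_ctx) are invented for illustration and do not match the patch's actual API.

```c
/* Toy model of the crossrelease "commit" idea: every normal lock
 * acquisition is queued in a per-task pending array; when a crosslock
 * is released, each queued lock gains a dependency on the crosslock. */
#include <assert.h>
#include <string.h>

#define MAX_PLOCKS 8

struct task_ctx {
	int plocks[MAX_PLOCKS];	/* class ids acquired since the
				   crosslock was taken */
	int nr_plocks;
};

struct xlock {
	int class_id;
	int deps[MAX_PLOCKS];	/* classes that depend on this crosslock */
	int nr_deps;
};

/* Modeled on lock_acquire() for a normal (non-crosslock) lock. */
static void plock_queue(struct task_ctx *t, int class_id)
{
	if (t->nr_plocks < MAX_PLOCKS)
		t->plocks[t->nr_plocks++] = class_id;
}

/* Modeled on commit: flush the pending locks into dependency edges
 * "crosslock -> each pending lock", then reset the queue. */
static void xlock_commit(struct xlock *x, struct task_ctx *t)
{
	memcpy(x->deps, t->plocks, sizeof(int) * t->nr_plocks);
	x->nr_deps = t->nr_plocks;
	t->nr_plocks = 0;
}
```

In the real patch the "edges" are lockdep dependency chain entries, not a flat array, and the flush avoids rebuilding chains that already exist.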



[RFC 08/12] lockdep: Apply crossrelease to PG_locked lock

2016-06-19 Thread Byungchul Park
lock_page() and its family can cause deadlock. Nevertheless, they
cannot use the lock correctness validator because unlock_page() can be
called in a different context from the one calling lock_page(), which
violates the original lockdep assumption.

However, thanks to CONFIG_LOCKDEP_CROSSRELEASE, we can apply the
lockdep detector to lock_page() using PG_locked. This patch applies it.

Signed-off-by: Byungchul Park 
---
 include/linux/mm_types.h |  9 +
 include/linux/pagemap.h  | 95 +---
 lib/Kconfig.debug|  9 +
 mm/filemap.c |  4 +-
 mm/page_alloc.c  |  3 ++
 5 files changed, 112 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 624b78b..ab33ee3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -15,6 +15,10 @@
 #include 
 #include 
 
+#ifdef CONFIG_LOCKDEP_PAGELOCK
+#include 
+#endif
+
 #ifndef AT_VECTOR_SIZE_ARCH
 #define AT_VECTOR_SIZE_ARCH 0
 #endif
@@ -215,6 +219,11 @@ struct page {
 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
int _last_cpupid;
 #endif
+
+#ifdef CONFIG_LOCKDEP_PAGELOCK
+   struct lockdep_map map;
+   struct cross_lock xlock;
+#endif
 }
 /*
  * The struct page can be forced to be double word aligned so that atomic ops
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c0049d9..2fc4af1 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -14,6 +14,9 @@
 #include 
 #include  /* for in_interrupt() */
 #include 
+#ifdef CONFIG_LOCKDEP_PAGELOCK
+#include 
+#endif
 
 /*
  * Bits in mapping->flags.  The lower __GFP_BITS_SHIFT bits are the page
@@ -441,26 +444,81 @@ static inline pgoff_t linear_page_index(struct vm_area_struct *vma,
return pgoff >> (PAGE_CACHE_SHIFT - PAGE_SHIFT);
 }
 
+#ifdef CONFIG_LOCKDEP_PAGELOCK
+#define lock_page_init(p)  \
+do {   \
+   static struct lock_class_key __key; \
+   lockdep_init_map_crosslock(&(p)->map, &(p)->xlock,  \
+   "(PG_locked)" #p, &__key, 0);   \
+} while (0)
+
+static inline void lock_page_acquire(struct page *page, int try)
+{
+   page = compound_head(page);
+   lock_acquire_exclusive(&page->map, 0, try, NULL, _RET_IP_);
+}
+
+static inline void lock_page_release(struct page *page)
+{
+   page = compound_head(page);
+   /*
+* Calling lock_commit_crosslock() is necessary
+* for cross-releasable lock when the lock is
+* releasing before calling lock_release().
+*/
+   lock_commit_crosslock(&page->map);
+   lock_release(&page->map, 0, _RET_IP_);
+}
+#else
+static inline void lock_page_init(struct page *page) {}
+static inline void lock_page_free(struct page *page) {}
+static inline void lock_page_acquire(struct page *page, int try) {}
+static inline void lock_page_release(struct page *page) {}
+#endif
+
 extern void __lock_page(struct page *page);
 extern int __lock_page_killable(struct page *page);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
unsigned int flags);
-extern void unlock_page(struct page *page);
+extern void do_raw_unlock_page(struct page *page);
 
-static inline int trylock_page(struct page *page)
+static inline void unlock_page(struct page *page)
+{
+   lock_page_release(page);
+   do_raw_unlock_page(page);
+}
+
+static inline int do_raw_trylock_page(struct page *page)
 {
page = compound_head(page);
return (likely(!test_and_set_bit_lock(PG_locked, &page->flags)));
 }
 
+static inline int trylock_page(struct page *page)
+{
+   if (do_raw_trylock_page(page)) {
+   lock_page_acquire(page, 1);
+   return 1;
+   }
+   return 0;
+}
+
 /*
  * lock_page may only be called if we have the page's inode pinned.
  */
 static inline void lock_page(struct page *page)
 {
might_sleep();
-   if (!trylock_page(page))
+
+   if (!do_raw_trylock_page(page))
__lock_page(page);
+   /*
+* The acquire function must be after actual lock operation
+* for crossrelease lock, because the lock instance is
+* searched by release operation in any context and more
+* than two instances acquired make it confused.
+*/
+   lock_page_acquire(page, 0);
 }
 
 /*
@@ -470,9 +528,22 @@ static inline void lock_page(struct page *page)
  */
 static inline int lock_page_killable(struct page *page)
 {
+   int ret;
+
might_sleep();
-   if (!trylock_page(page))
-   return __lock_page_killable(page);
+
+   if (!do_raw_trylock_page(page)) {
+   ret = __lock_page_killable(page);
+   if (ret)
+   return ret;
+   }
+   /*
+* The acquire function must be after actual lock operation
+* for crossrelease lock, because the lock 

[RFC 03/12] lockdep: Make check_prev_add can use a stack_trace of other context

2016-06-19 Thread Byungchul Park
Currently, check_prev_add() can only save the current context's stack
trace. But it would be useful if a separate stack_trace could be taken
and used in check_prev_add(). The crossrelease feature can then use
check_prev_add() with another context's stack_trace.

Signed-off-by: Byungchul Park 
---
 kernel/locking/lockdep.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 4d51208..c596bef 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1822,7 +1822,8 @@ check_deadlock(struct task_struct *curr, struct held_lock *next,
  */
 static int
 check_prev_add(struct task_struct *curr, struct held_lock *prev,
-  struct held_lock *next, int distance, int *stack_saved)
+  struct held_lock *next, int distance, int *stack_saved,
+  struct stack_trace *own_trace)
 {
struct lock_list *entry;
int ret;
@@ -1883,7 +1884,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
}
}
 
-   if (!*stack_saved) {
+   if (!own_trace && stack_saved && !*stack_saved) {
if (!save_trace())
return 0;
*stack_saved = 1;
@@ -1895,14 +1896,14 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 */
ret = add_lock_to_list(hlock_class(prev), hlock_class(next),
   &hlock_class(prev)->locks_after,
-  next->acquire_ip, distance, &trace);
+  next->acquire_ip, distance, own_trace ?: &trace);
 
if (!ret)
return 0;
 
ret = add_lock_to_list(hlock_class(next), hlock_class(prev),
   &hlock_class(next)->locks_before,
-  next->acquire_ip, distance, &trace);
+  next->acquire_ip, distance, own_trace ?: &trace);
if (!ret)
return 0;
 
@@ -1911,7 +1912,8 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 */
if (verbose(hlock_class(prev)) || verbose(hlock_class(next))) {
/* We drop graph lock, so another thread can overwrite trace. */
-   *stack_saved = 0;
+   if (stack_saved)
+   *stack_saved = 0;
graph_unlock();
printk("\n new dependency: ");
print_lock_name(hlock_class(prev));
@@ -1960,8 +1962,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 * added:
 */
if (hlock->read != 2 && hlock->check) {
-   if (!check_prev_add(curr, hlock, next,
-   distance, &stack_saved))
+   if (!check_prev_add(curr, hlock, next, distance,
+   &stack_saved, NULL))
return 0;
/*
 * Stop after the first non-trylock entry,
-- 
1.9.1



[RFC 12/12] x86/dumpstack: Optimize save_stack_trace

2016-06-19 Thread Byungchul Park
Currently, the x86 implementation of save_stack_trace() walks the whole
stack region word by word regardless of trace->max_entries. However,
it is unnecessary to keep walking once the caller's requirement is
already fulfilled, that is, once trace->nr_entries >= trace->max_entries.

For example, CONFIG_LOCKDEP_CROSSRELEASE implementation calls
save_stack_trace() with max_entries = 5 frequently. I measured its
overhead and printed its difference of sched_clock() with my QEMU x86
machine.

The latency was improved over 70% when trace->max_entries = 5.

Before this patch:

[2.326940] save_stack_trace() takes 83931 ns
[2.326389] save_stack_trace() takes 62576 ns
[2.327575] save_stack_trace() takes 58826 ns
[2.327000] save_stack_trace() takes 88980 ns
[2.327424] save_stack_trace() takes 59831 ns
[2.327575] save_stack_trace() takes 58482 ns
[2.327597] save_stack_trace() takes 87114 ns
[2.327931] save_stack_trace() takes 121140 ns
[2.327434] save_stack_trace() takes 64321 ns
[2.328632] save_stack_trace() takes 84997 ns
[2.328000] save_stack_trace() takes 115037 ns
[2.328460] save_stack_trace() takes 72292 ns
[2.328632] save_stack_trace() takes 61236 ns
[2.328567] save_stack_trace() takes 7 ns
[2.328867] save_stack_trace() takes 79525 ns
[2.328460] save_stack_trace() takes 64902 ns
[2.329585] save_stack_trace() takes 58760 ns
[2.329000] save_stack_trace() takes 91349 ns
[2.329414] save_stack_trace() takes 60069 ns
[2.329585] save_stack_trace() takes 61012 ns
[2.329573] save_stack_trace() takes 76820 ns
[2.329863] save_stack_trace() takes 62131 ns
[2.33] save_stack_trace() takes 99476 ns
[2.329846] save_stack_trace() takes 62419 ns
[2.33] save_stack_trace() takes 88918 ns
[2.330253] save_stack_trace() takes 73669 ns
[2.330520] save_stack_trace() takes 67876 ns
[2.330671] save_stack_trace() takes 75963 ns
[2.330983] save_stack_trace() takes 95079 ns
[2.330451] save_stack_trace() takes 62352 ns

After this patch:

[2.780735] save_stack_trace() takes 19902 ns
[2.780718] save_stack_trace() takes 20240 ns
[2.781692] save_stack_trace() takes 45215 ns
[2.781477] save_stack_trace() takes 20191 ns
[2.781694] save_stack_trace() takes 20044 ns
[2.782589] save_stack_trace() takes 20292 ns
[2.782706] save_stack_trace() takes 20024 ns
[2.782706] save_stack_trace() takes 19881 ns
[2.782881] save_stack_trace() takes 24577 ns
[2.782706] save_stack_trace() takes 19901 ns
[2.783621] save_stack_trace() takes 24381 ns
[2.783621] save_stack_trace() takes 20205 ns
[2.783760] save_stack_trace() takes 19956 ns
[2.783718] save_stack_trace() takes 20280 ns
[2.784179] save_stack_trace() takes 20099 ns
[2.784835] save_stack_trace() takes 20055 ns
[2.785922] save_stack_trace() takes 20157 ns
[2.785922] save_stack_trace() takes 20140 ns
[2.786178] save_stack_trace() takes 20040 ns
[2.786877] save_stack_trace() takes 20102 ns
[2.795000] save_stack_trace() takes 21147 ns
[2.795397] save_stack_trace() takes 20230 ns
[2.795397] save_stack_trace() takes 31274 ns
[2.795739] save_stack_trace() takes 19706 ns
[2.796484] save_stack_trace() takes 20266 ns
[2.796484] save_stack_trace() takes 20902 ns
[2.797000] save_stack_trace() takes 38110 ns
[2.797510] save_stack_trace() takes 20224 ns
[2.798181] save_stack_trace() takes 20172 ns
[2.798837] save_stack_trace() takes 20824 ns

Signed-off-by: Byungchul Park 
---
 arch/x86/include/asm/stacktrace.h | 1 +
 arch/x86/kernel/dumpstack.c   | 2 ++
 arch/x86/kernel/dumpstack_32.c| 2 ++
 arch/x86/kernel/stacktrace.c  | 7 +++
 4 files changed, 12 insertions(+)

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 70bbe39..fc572e7 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -41,6 +41,7 @@ struct stacktrace_ops {
/* On negative return stop dumping */
int (*stack)(void *data, char *name);
walk_stack_t	walk_stack;
+   int (*end_walk)(void *data);
 };
 
 void dump_trace(struct task_struct *tsk, struct pt_regs *regs,
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 9c30acf..355fe8f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -115,6 +115,8 @@ print_context_stack(struct thread_info *tinfo,
print_ftrace_graph_addr(addr, data, ops, tinfo, graph);
}
stack++;
+   if (ops->end_walk && ops->end_walk(data))
+   break;
}
return bp;
 }
diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
index 464ffd6..cc51419 100644
--- a/arch/x86/kernel/dumpstack_32.c
+++ b/arch/x86/kernel/dumpstack_32.c
@@ -71,6 +71,8 @@ void dump_trace(struct task_struct *task, 

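The effect of the end_walk() callback added in the patch above can be modeled outside the kernel. In this hypothetical sketch, a walker scans a fake stack word by word and stops as soon as the trace holds max_entries frames; the `skipped` counter is invented here only to make the early exit observable.

```c
/* Standalone model of the early-exit stack walk: after each word the
 * walker consults an end-of-walk predicate and stops once the trace
 * is full, instead of scanning the entire stack region. */
#include <assert.h>

struct stack_trace {
	unsigned int nr_entries, max_entries;
	unsigned long *entries;
	unsigned int skipped;	/* words examined, to show the early exit */
};

/* Mirrors the patch's end_walk() hook: stop when the trace is full. */
static int save_stack_end(void *data)
{
	struct stack_trace *trace = data;

	return trace->nr_entries >= trace->max_entries;
}

/* Walk a fake stack of `len` words, recording every word as a "frame". */
static void walk_stack(unsigned long *stack, int len,
		       struct stack_trace *trace)
{
	for (int i = 0; i < len; i++) {
		if (trace->nr_entries < trace->max_entries)
			trace->entries[trace->nr_entries++] = stack[i];
		trace->skipped++;
		if (save_stack_end(trace))	/* the new early exit */
			break;
	}
}
```

With max_entries = 5, the walker touches 5 words instead of the whole region, which is where the measured latency improvement comes from.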
[RFC 02/12] lockdep: Add a function building a chain between two hlocks

2016-06-19 Thread Byungchul Park
add_chain_cache() can only be used from the current context since it
depends on a task's held_locks, which is not protected by a lock.
However, it would be useful if a dependency chain could be built in
any context. This patch makes the chain building independent of its
context.

In particular, the crossrelease feature wants to do this. Crossrelease
introduces an additional dependency chain consisting of 2 lock classes
using 2 hlock instances, to connect dependencies between different
contexts.

Signed-off-by: Byungchul Park 
---
 kernel/locking/lockdep.c | 57 
 1 file changed, 57 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index efd001c..4d51208 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2010,6 +2010,63 @@ struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
return lock_classes + chain_hlocks[chain->base + i];
 }
 
+/*
+ * This can make it possible to build a chain between just two
+ * specified hlocks rather than between already held locks of
+ * the current task and newly held lock, which can be done by
+ * add_chain_cache().
+ *
+ * add_chain_cache() must be done within the lock owner's context,
+ * however this can be called in any context if two racy-less hlock
+ * instances were already taken by caller. Thus this can be useful
+ * when building a chain between two hlocks regardless of context.
+ */
+static inline int add_chain_cache_2hlocks(struct held_lock *prev,
+ struct held_lock *next,
+ u64 chain_key)
+{
+   struct hlist_head *hash_head = chainhashentry(chain_key);
+   struct lock_chain *chain;
+
+   /*
+* Allocate a new chain entry from the static array, and add
+* it to the hash:
+*/
+
+   /*
+* We might need to take the graph lock, ensure we've got IRQs
+* disabled to make this an IRQ-safe lock.. for recursion reasons
+* lockdep won't complain about its own locking errors.
+*/
+   if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
+   return 0;
+
+   if (unlikely(nr_lock_chains >= MAX_LOCKDEP_CHAINS)) {
+   if (!debug_locks_off_graph_unlock())
+   return 0;
+
+   print_lockdep_off("BUG: MAX_LOCKDEP_CHAINS too low!");
+   dump_stack();
+   return 0;
+   }
+
+   chain = lock_chains + nr_lock_chains++;
+   chain->chain_key = chain_key;
+   chain->irq_context = next->irq_context;
+   chain->depth = 2;
+   if (likely(nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
+   chain->base = nr_chain_hlocks;
+   nr_chain_hlocks += chain->depth;
+   chain_hlocks[chain->base] = prev->class_idx - 1;
+   chain_hlocks[chain->base + 1] = next->class_idx - 1;
+   }
+   hlist_add_head_rcu(&chain->entry, hash_head);
+   debug_atomic_inc(chain_lookup_misses);
+   inc_chains();
+
+   return 1;
+}
+
 static inline int add_chain_cache(struct task_struct *curr,
  struct held_lock *hlock,
  u64 chain_key)
-- 
1.9.1
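A standalone model of what add_chain_cache_2hlocks() stores may help: a depth-2 chain records exactly two class indices in a flat array, reached through the chain's base offset. The key handling, table sizes, and names below are simplified stand-ins; in particular the chain key would come from lockdep's real iterate_chain_key(), not from the caller.

```c
/* Toy model of storing a two-entry lock chain: a chain descriptor
 * points into a shared flat array of class indices. Overflow makes
 * the validator give up, as in lockdep. */
#include <assert.h>

#define MAX_CHAINS 16
#define MAX_CHAIN_HLOCKS 32

struct lock_chain {
	unsigned long long chain_key;
	int base, depth;
};

static struct lock_chain lock_chains[MAX_CHAINS];
static int chain_hlocks[MAX_CHAIN_HLOCKS];
static int nr_lock_chains, nr_chain_hlocks;

static int add_chain_2(int prev_class, int next_class,
		       unsigned long long key)
{
	if (nr_lock_chains >= MAX_CHAINS ||
	    nr_chain_hlocks + 2 > MAX_CHAIN_HLOCKS)
		return 0;	/* table full: stop validating */

	struct lock_chain *chain = &lock_chains[nr_lock_chains++];

	chain->chain_key = key;
	chain->depth = 2;
	chain->base = nr_chain_hlocks;
	nr_chain_hlocks += 2;
	chain_hlocks[chain->base] = prev_class;
	chain_hlocks[chain->base + 1] = next_class;
	return 1;
}
```

The real function additionally hashes the chain into a lookup table and requires IRQs disabled because it runs under the graph lock.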




[PATCH 2/5] lockdep: Apply bitlock to bit_spin_lock

2016-06-19 Thread Byungchul Park
Currently, bit_spin_lock does not use a lockdep_map at all, so the
lock correctness validator is not supported for it. This patch makes
it possible for bit_spin_lock to use the validator via
CONFIG_BITLOCK_ALLOC.

Signed-off-by: Byungchul Park 
---
 include/linux/bit_spinlock.h | 57 +++-
 1 file changed, 51 insertions(+), 6 deletions(-)

diff --git a/include/linux/bit_spinlock.h b/include/linux/bit_spinlock.h
index 3b5bafc..3f8b013 100644
--- a/include/linux/bit_spinlock.h
+++ b/include/linux/bit_spinlock.h
@@ -6,13 +6,43 @@
 #include 
 #include 
 
+#ifdef CONFIG_BITLOCK_ALLOC
+#include 
+#define bit_spin_init(b, a)\
+do {   \
+   static struct lock_class_key __key; \
+   bitlock_init(b, a, #b "@" #a, &__key);  \
+} while (0)
+
+static inline void bit_spin_free(int bitnum, unsigned long *addr)
+{
+   bitlock_free(bitnum, addr);
+}
+
+static inline void bit_spin_acquire(int bitnum, unsigned long *addr, int try)
+{
+   struct lockdep_map *map = bitlock_get_map(bitnum, addr, BIT_ACQUIRE);
+   if (map)
+   spin_acquire(map, 0, try, _RET_IP_);
+}
+
+static inline void bit_spin_release(int bitnum, unsigned long *addr)
+{
+   struct lockdep_map *map = bitlock_get_map(bitnum, addr, BIT_RELEASE);
+   if (map)
+   spin_release(map, 0, _RET_IP_);
+}
+#else
+static inline void bit_spin_init(int bitnum, unsigned long *addr) {}
+static inline void bit_spin_free(int bitnum, unsigned long *addr) {}
+static inline void bit_spin_acquire(int bitnum, unsigned long *addr, int try) {}
+static inline void bit_spin_release(int bitnum, unsigned long *addr) {}
+#endif
+
 /*
- *  bit-based spin_lock()
- *
- * Don't use this unless you really need to: spin_lock() and spin_unlock()
- * are significantly faster.
+ * bit-based spin_lock() without lock acquiring
  */
-static inline void bit_spin_lock(int bitnum, unsigned long *addr)
+static inline void do_raw_bit_spin_lock(int bitnum, unsigned long *addr)
 {
/*
 * Assuming the lock is uncontended, this never enters
@@ -21,7 +51,6 @@ static inline void bit_spin_lock(int bitnum, unsigned long *addr)
 * busywait with less bus contention for a good time to
 * attempt to acquire the lock bit.
 */
-   preempt_disable();
 #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
while (unlikely(test_and_set_bit_lock(bitnum, addr))) {
preempt_enable();
@@ -35,6 +64,19 @@ static inline void bit_spin_lock(int bitnum, unsigned long *addr)
 }
 
 /*
+ *  bit-based spin_lock()
+ *
+ * Don't use this unless you really need to: spin_lock() and spin_unlock()
+ * are significantly faster.
+ */
+static inline void bit_spin_lock(int bitnum, unsigned long *addr)
+{
+   preempt_disable();
+   bit_spin_acquire(bitnum, addr, 0);
+   do_raw_bit_spin_lock(bitnum, addr);
+}
+
+/*
  * Return true if it was acquired
  */
 static inline int bit_spin_trylock(int bitnum, unsigned long *addr)
@@ -46,6 +88,7 @@ static inline int bit_spin_trylock(int bitnum, unsigned long *addr)
return 0;
}
 #endif
+   bit_spin_acquire(bitnum, addr, 1);
__acquire(bitlock);
return 1;
 }
@@ -55,6 +98,7 @@ static inline int bit_spin_unlock(int bitnum, unsigned long *addr)
  */
 static inline void bit_spin_unlock(int bitnum, unsigned long *addr)
 {
+   bit_spin_release(bitnum, addr);
 #ifdef CONFIG_DEBUG_SPINLOCK
BUG_ON(!test_bit(bitnum, addr));
 #endif
@@ -72,6 +116,7 @@ static inline void __bit_spin_unlock(int bitnum, unsigned long *addr)
  */
 static inline void __bit_spin_unlock(int bitnum, unsigned long *addr)
 {
+   bit_spin_release(bitnum, addr);
 #ifdef CONFIG_DEBUG_SPINLOCK
BUG_ON(!test_bit(bitnum, addr));
 #endif
-- 
1.9.1



[PATCH 0/5] Implement bitlock map allocator

2016-06-19 Thread Byungchul Park
Currently, bit-based locks, e.g. bit_spin_lock, cannot use the lock
correctness validator (lockdep). However, it would be useful if the
validator supported bit-based locks as well.

Therefore, this patch set provides an interface for allocating and
freeing a lockdep_map for bit-based locks so that they can also use
the lock correctness validator, with a lockdep_map allocated for each
bit address.

This can be applied to any bit_spin_lock user except the slab
allocator, where I am not sure using kmalloc is safe. I chose two
examples to which the bitlock map allocator applies, zram and buffer
head, applied it to them, and included them in this patch set.

Byungchul Park (5):
  lockdep: Implement bitlock map allocator
  lockdep: Apply bitlock to bit_spin_lock
  lockdep: Apply bit_spin_lock lockdep to zram
  fs/buffer.c: Remove trailing white space
  lockdep: Apply bit_spin_lock lockdep to BH_Uptodate_Lock

 drivers/block/zram/zram_drv.c |  10 +++
 fs/buffer.c   |  24 +++
 include/linux/bit_spinlock.h  |  57 ++--
 include/linux/bitlock.h   |  20 ++
 kernel/locking/Makefile   |   1 +
 kernel/locking/bitlock_map.c  | 147 ++
 lib/Kconfig.debug |  10 +++
 7 files changed, 252 insertions(+), 17 deletions(-)
 create mode 100644 include/linux/bitlock.h
 create mode 100644 kernel/locking/bitlock_map.c

-- 
1.9.1



[PATCH 1/5] lockdep: Implement bitlock map allocator

2016-06-19 Thread Byungchul Park
Currently, bit-based locks, e.g. bit_spin_lock, cannot use the lock
correctness validator (lockdep). However, it would be useful if the
validator supported bit-based locks as well.

Therefore, this patch provides an interface for allocating and freeing
a lockdep_map for a bit-based lock, so that bit-based locks can also
use the lock correctness validator, with a lockdep_map allocated for
each bit address.

Signed-off-by: Byungchul Park 
---
 include/linux/bitlock.h  |  20 ++
 kernel/locking/Makefile  |   1 +
 kernel/locking/bitlock_map.c | 147 +++
 lib/Kconfig.debug|  10 +++
 4 files changed, 178 insertions(+)
 create mode 100644 include/linux/bitlock.h
 create mode 100644 kernel/locking/bitlock_map.c

diff --git a/include/linux/bitlock.h b/include/linux/bitlock.h
new file mode 100644
index 000..1c8a46f
--- /dev/null
+++ b/include/linux/bitlock.h
@@ -0,0 +1,20 @@
+#ifndef __LINUX_BITLOCK_H
+#define __LINUX_BITLOCK_H
+
+#include 
+
+struct bitlock_map {
+   struct hlist_node   hash_entry;
+   unsigned long   bitaddr; /* ID */
+   struct lockdep_map  map;
+   int ref; /* reference count */
+};
+
+#define BIT_ACQUIRE 0 /* Increase bmap reference count */
+#define BIT_RELEASE 1 /* Decrease bmap reference count */
+#define BIT_OTHER   2 /* No touch bmap reference count */
+
+extern struct lockdep_map *bitlock_get_map(int bitnum, unsigned long *addr, int type);
+extern void bitlock_init(int bitnum, unsigned long *addr, const char *name, struct lock_class_key *key);
+extern void bitlock_free(int bitnum, unsigned long *addr);
+#endif /* __LINUX_BITLOCK_H */
diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
index 8e96f6c..8f4aa9e 100644
--- a/kernel/locking/Makefile
+++ b/kernel/locking/Makefile
@@ -26,3 +26,4 @@ obj-$(CONFIG_RWSEM_GENERIC_SPINLOCK) += rwsem-spinlock.o
 obj-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem-xadd.o
 obj-$(CONFIG_QUEUED_RWLOCKS) += qrwlock.o
 obj-$(CONFIG_LOCK_TORTURE_TEST) += locktorture.o
+obj-$(CONFIG_BITLOCK_ALLOC) += bitlock_map.o
diff --git a/kernel/locking/bitlock_map.c b/kernel/locking/bitlock_map.c
new file mode 100644
index 000..e2b576f
--- /dev/null
+++ b/kernel/locking/bitlock_map.c
@@ -0,0 +1,147 @@
+/*
+ * kernel/bitlock_map.c
+ *
+ * Lockdep allocator for bit-based lock
+ *
+ * Written by Byungchul Park:
+ *
+ * Thanks to Minchan Kim for coming up with the initial suggestion, namely
+ * to make even bit-based locks usable with the runtime locking
+ * correctness validator.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define BITLOCK_HASH_BITS  15U
+#define BITLOCK_HASH_SIZE  (1U << BITLOCK_HASH_BITS)
+#define bitlock_hashentry(key) (bitlock_hash + hash_long(key, BITLOCK_HASH_BITS))
+
+static struct hlist_head bitlock_hash[BITLOCK_HASH_SIZE];
+
+static DEFINE_SPINLOCK(bitlock_spin);
+
+static inline unsigned long get_bitaddr(int bitnum, unsigned long *addr)
+{
+   return (unsigned long)((char *)addr + bitnum);
+}
+
+/* Caller must hold a lock to protect hlist traversal */
+static struct bitlock_map *look_up_bmap(int bitnum, unsigned long *addr)
+{
+   struct hlist_head *hash_head;
+   struct bitlock_map *bmap;
+   unsigned long bitaddr = get_bitaddr(bitnum, addr);
+
+   hash_head = bitlock_hashentry(bitaddr);
+   hlist_for_each_entry(bmap, hash_head, hash_entry)
+   if (bmap->bitaddr == bitaddr)
+   return bmap;
+
+   return NULL;
+}
+
+static struct bitlock_map *alloc_bmap(void)
+{
+   struct bitlock_map *ret;
+
+   ret = kmalloc(sizeof(struct bitlock_map), GFP_NOWAIT | __GFP_NOWARN);
+   if (!ret)
+   pr_warn("bitlock: Can't kmalloc a bitlock map.\n");
+
+   return ret;
+}
+
+static void free_bmap(struct bitlock_map *bmap)
+{
+   kfree(bmap);
+}
+
+struct lockdep_map *bitlock_get_map(int bitnum, unsigned long *addr, int type)
+{
+   struct bitlock_map *bmap;
+   struct lockdep_map *map = NULL;
+   unsigned long flags;
+
+   spin_lock_irqsave(&bitlock_spin, flags);
+
+   bmap = look_up_bmap(bitnum, addr);
+   if (bmap) {
+   /*
+* bmap->ref is for checking reliability.
+* One pair, i.e. bitlock_acquire and
+* bitlock_release, should keep bmap->ref
+* zero.
+*/
+   if (type == BIT_ACQUIRE)
+   bmap->ref++;
+   else if (type == BIT_RELEASE)
+   bmap->ref--;
+   map = &bmap->map;
+   }
+
+   spin_unlock_irqrestore(&bitlock_spin, flags);
+
+   return map;
+}
+EXPORT_SYMBOL_GPL(bitlock_get_map);
+
+void bitlock_init(int bitnum, unsigned long *addr, const char *name,
+   struct lock_class_key *key)
+{
+   struct hlist_head *hash_head;
+   struct bitlock_map *bmap;
+   unsigned long flags;
+   

[PATCH 2/5] lockdep: Apply bitlock to bit_spin_lock

2016-06-19 Thread Byungchul Park
Currently, bit_spin_lock does not use a lockdep_map at all, so the
lock correctness validator cannot cover it. This patch makes it
possible for bit_spin_lock to use the validator via
CONFIG_BITLOCK_ALLOC.

Signed-off-by: Byungchul Park 
---
 include/linux/bit_spinlock.h | 57 +++-
 1 file changed, 51 insertions(+), 6 deletions(-)

diff --git a/include/linux/bit_spinlock.h b/include/linux/bit_spinlock.h
index 3b5bafc..3f8b013 100644
--- a/include/linux/bit_spinlock.h
+++ b/include/linux/bit_spinlock.h
@@ -6,13 +6,43 @@
 #include 
 #include 
 
+#ifdef CONFIG_BITLOCK_ALLOC
+#include 
+#define bit_spin_init(b, a)\
+do {   \
+   static struct lock_class_key __key; \
+   bitlock_init(b, a, #b "@" #a, &__key);  \
+} while (0)
+
+static inline void bit_spin_free(int bitnum, unsigned long *addr)
+{
+   bitlock_free(bitnum, addr);
+}
+
+static inline void bit_spin_acquire(int bitnum, unsigned long *addr, int try)
+{
+   struct lockdep_map *map = bitlock_get_map(bitnum, addr, BIT_ACQUIRE);
+   if (map)
+   spin_acquire(map, 0, try, _RET_IP_);
+}
+
+static inline void bit_spin_release(int bitnum, unsigned long *addr)
+{
+   struct lockdep_map *map = bitlock_get_map(bitnum, addr, BIT_RELEASE);
+   if (map)
+   spin_release(map, 0, _RET_IP_);
+}
+#else
+static inline void bit_spin_init(int bitnum, unsigned long *addr) {}
+static inline void bit_spin_free(int bitnum, unsigned long *addr) {}
+static inline void bit_spin_acquire(int bitnum, unsigned long *addr, int try) {}
+static inline void bit_spin_release(int bitnum, unsigned long *addr) {}
+#endif
+
 /*
- *  bit-based spin_lock()
- *
- * Don't use this unless you really need to: spin_lock() and spin_unlock()
- * are significantly faster.
+ * bit-based spin_lock() without lock acquiring
  */
-static inline void bit_spin_lock(int bitnum, unsigned long *addr)
+static inline void do_raw_bit_spin_lock(int bitnum, unsigned long *addr)
 {
/*
 * Assuming the lock is uncontended, this never enters
@@ -21,7 +51,6 @@ static inline void bit_spin_lock(int bitnum, unsigned long *addr)
 * busywait with less bus contention for a good time to
 * attempt to acquire the lock bit.
 */
-   preempt_disable();
 #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
while (unlikely(test_and_set_bit_lock(bitnum, addr))) {
preempt_enable();
@@ -35,6 +64,19 @@ static inline void bit_spin_lock(int bitnum, unsigned long *addr)
 }
 
 /*
+ *  bit-based spin_lock()
+ *
+ * Don't use this unless you really need to: spin_lock() and spin_unlock()
+ * are significantly faster.
+ */
+static inline void bit_spin_lock(int bitnum, unsigned long *addr)
+{
+   preempt_disable();
+   bit_spin_acquire(bitnum, addr, 0);
+   do_raw_bit_spin_lock(bitnum, addr);
+}
+
+/*
  * Return true if it was acquired
  */
 static inline int bit_spin_trylock(int bitnum, unsigned long *addr)
@@ -46,6 +88,7 @@ static inline int bit_spin_trylock(int bitnum, unsigned long *addr)
return 0;
}
 #endif
+   bit_spin_acquire(bitnum, addr, 1);
__acquire(bitlock);
return 1;
 }
@@ -55,6 +98,7 @@ static inline int bit_spin_trylock(int bitnum, unsigned long *addr)
  */
 static inline void bit_spin_unlock(int bitnum, unsigned long *addr)
 {
+   bit_spin_release(bitnum, addr);
 #ifdef CONFIG_DEBUG_SPINLOCK
BUG_ON(!test_bit(bitnum, addr));
 #endif
@@ -72,6 +116,7 @@ static inline void bit_spin_unlock(int bitnum, unsigned long *addr)
  */
 static inline void __bit_spin_unlock(int bitnum, unsigned long *addr)
 {
+   bit_spin_release(bitnum, addr);
 #ifdef CONFIG_DEBUG_SPINLOCK
BUG_ON(!test_bit(bitnum, addr));
 #endif
-- 
1.9.1



Re: [PATCH 5/7] random: replace non-blocking pool with a Chacha20-based CRNG

2016-06-19 Thread Theodore Ts'o
On Mon, Jun 20, 2016 at 09:25:28AM +0800, Herbert Xu wrote:
> > Yes, I understand the argument that the networking stack is now
> > requiring the crypto layer --- but not all IOT devices may necessarily
> > require the IP stack (they might be using some alternate wireless
> > communications stack) and I'd much rather not make things worse.
> 
> Sure, but 99% of the kernels out there will have a crypto API.
> So why not use it if it's there and use the standalone chacha
> code otherwise?

It's work that I'm not convinced is worth the gain?  Perhaps I
shouldn't have buried the lede, but repeating a paragraph from later
in the message:

   So even if the AVX-optimized version is 100% faster than the generic version,
   it would change the time needed to create a 256 byte session key from
   1.68 microseconds to 1.55 microseconds.  And this is ignoring the
   extra overhead needed to set up AVX, the fact that this will require
   the kernel to do extra work doing the XSAVE and XRESTORE because of
   the use of the AVX registers, etc.

So in the absolute best case, this improves the time needed to create
a 256 bit session key by 0.13 microseconds.  And that assumes that the
extra setup and teardown overhead of an AVX optimized ChaCha20
(including the XSAVE and XRESTORE of the AVX registers, etc.) don't
end up making the CRNG **slower**.

The thing to remember about these optimizations is that they are great
for bulk encryption, but that's not what the getrandom(2) and
get_random_bytes() are used for, in general.  We don't need to create
multiple megabytes of random numbers at a time.  We need to create
them 256 bits at a time, with anti-backtracking protections in
between.  Think of this as the random number equivalent of artisanal
beer making, as opposed to Budweiser beer, which ferments the beer
literally in pipelines.  :-)

Yes, Budweiser may be made more efficiently using continuous
fermentation --- but would you want to drink it?   And if you have to
constantly start and stop the continuous fermentation pipeline, the net
result can actually be less efficient compared to doing it right in
the first place.

   - Ted

P.S.  I haven't measured this to see, mainly because I really don't
care about the difference between 1.68 vs 1.55 microseconds, but there
is a good chance in the crypto layer that it might be a good idea to
have the system be smart enough to automatically fall back to using
the **non** optimized version if you only need to encrypt a small
amount of data.



Linux 4.7-rc4

2016-06-19 Thread Linus Torvalds
It's been a fairly normal week, and rc4 is out. Go test.

The statistics look very normal: about two thirds drivers, with the
rest being half architecture updates and half "misc" (small
filesystem updates, some documentation, and a smattering of patches
elsewhere).

The bulk of the driver updates are usb and gpu, but there's iio, leds,
platform drivers, dma, etc.

The arch updates are mostly arm, with some small x86 fixlets too.

But it's all pretty small, nothing particularly worrisome.

Shortlog appended for people who want to get a feel for the kinds of
things that have been happening.

Linus

---

Akinobu Mita (1):
  iio: pressure: bmp280: fix error message for wrong chip id

Alan Stern (1):
  USB: EHCI: avoid undefined pointer arithmetic and placate UBSAN

Alden Tondettar (3):
  udf: Don't BUG on missing metadata partition descriptor
  udf: Use IS_ERR when loading metadata mirror file entry
  udf: Use correct partition reference number for metadata

Alex Deucher (3):
  drm/radeon: fix asic initialization for virtualized environments
  drm/amdgpu/gfx7: fix broken condition check
  Revert "drm/amdgpu: add pipeline sync while vmid switch in same ctx"

Alex Hung (1):
  ideapad_laptop: Add an event for mic mute hotkey

Alexander Usyskin (1):
  mei: don't use wake_up_interruptible for wr_ctrl

Alexander Yarygin (1):
  KVM: s390: Add stats for PEI events

Alexandre Belloni (1):
  Documentation: configfs-usb-gadget-uvc: fix kernel version

Alison Schofield (1):
  iio: humidity: hdc100x: correct humidity integration time mask

Ander Conselvan de Oliveira (1):
  drm/i915: Fix NULL pointer deference when out of PLLs in IVB

Andres Rodriguez (1):
  amdgpu: fix asic initialization for virtualized environments (v2)

Andrew Goodbody (2):
  usb: musb: Ensure rx reinit occurs for shared_fifo endpoints
  usb: musb: Stop bulk endpoint while queue is rotated

Andrey Grodzovsky (1):
  drm/dp/mst: Always clear proposed vcpi table for port.

Andy Gross (1):
  usb: host: ehci-msm: Conditionally call ehci suspend/resume

Arnd Bergmann (4):
  ARM: samsung: improve static dma_mask definition
  ARM: exynos: don't select keyboard driver
  phy: exynos-mipi-video: avoid uninitialized variable use
  usb: dwc2: fix regression on big-endian PowerPC/ARM systems

Axel Lin (1):
  regulator: tps51632: Fix setting ramp delay

Ben Skeggs (1):
  drm/nouveau/iccsense: fix memory leak

Benjamin Tissoires (1):
  HID: multitouch: Add MT_QUIRK_NOT_SEEN_MEANS_UP to Surface Pro 3

Bin Liu (5):
  usb: gadget: fix spinlock dead lock in gadgetfs
  usb: musb: host: clear rxcsr error bit if set
  usb: musb: host: don't start next rx urb if current one failed
  usb: musb: only restore devctl when session was set in backup
  usb: musb: host: correct cppi dma channel for isoch transfer

Boris Brezillon (1):
  pwm: atmel-hlcdc: Fix default PWM polarity

Brian Norris (1):
  pwm: Improve args checking in pwm_apply_state()

Chandan Rajendra (1):
  Btrfs: btrfs_check_super_valid: Allow 4096 as stripesize

Chen-Yu Tsai (2):
  ARM: dts: sun6i: primo81: Drop constraints on dc1sw regulator
  ARM: dts: sun6i: yones-toptech-bs1078-v2: Drop constraints on
dc1sw regulator

Chris Wilson (2):
  drm/i915: Silence "unexpected child device config size" for VBT on 845g
  drm/i915: Only ignore eDP ports that are connected

Christian König (1):
  drm/radeon: don't use fractional dividers on RS[78]80 if SS is enabled

Crestez Dan Leonard (4):
  max44000: Remove scale from proximity
  iio: st_sensors: Init trigger before irq request
  iio: st_sensors: Disable DRDY at init time
  iio: Fix error handling in iio_trigger_attach_poll_func

Dan Carpenter (4):
  iio: dac: ad5592r: Off by one bug in ad5592r_alloc_channels()
  iio: humidity: am2315: Remove a stray unlock
  usb: f_fs: off by one bug in _ffs_func_bind()
  KEYS: potential uninitialized variable

Daniel Baluta (2):
  iio: bmi160: Fix output data rate for accel
  iio: bmi160: Fix ODR setting

Daniel Thompson (1):
  arm64: kgdb: Match pstate size with gdbserver protocol

Dave Gerlach (3):
  ARM: OMAP2+: AM43XX: Enable fixes for Cortex-A9 errata
  ARM: OMAP2+: Select OMAP_INTERCONNECT for SOC_AM43XX
  ARM: dts: am437x-sk-evm: Reduce i2c0 bus speed for tps65218

David Hildenbrand (1):
  KVM: s390: ignore IBC if zero

David Sterba (2):
  btrfs: use new error message helper in qgroup_account_snapshot
  btrfs: remove build fixup for qgroup_account_snapshot

Dennis Wassenberg (1):
  thinkpad_acpi: Add support for HKEY version 0x200

Doug Oucharek (1):
  staging: lustre: lnet: Don't access NULL NI on failure path

Enric Balletbo i Serra (2):
  ARM: dts: igep00x0: Add SD card-detect.
  ARM: dts: igep0020: Add SD card write-protect pin.

Fabio Estevam (2):
  ARM: imx6ul: 


linux-next: manual merge of the staging tree with the drm tree

2016-06-19 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the staging tree got a conflict in:

  drivers/staging/android/sync.h

between commit:

  76bf0db55439 ("dma-buf/fence: make fence context 64 bit v2")

from the drm tree and commits:

  342952d3a5c4 ("staging/android: remove 'destroyed' member from struct sync_timeline")
  1fe82e2e1486 ("staging/android: rename sync.h to sync_debug.h")

from the staging tree.

I fixed it up (I removed the file and applied the following fix patch)
and can carry the fix as necessary. This is now fixed as far as linux-next
is concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.

From: Stephen Rothwell 
Date: Mon, 20 Jun 2016 14:28:29 +1000
Subject: [PATCH] staging/android: merge fix up for sync.h renaming

Signed-off-by: Stephen Rothwell 
---
 drivers/staging/android/sync_debug.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/sync_debug.h b/drivers/staging/android/sync_debug.h
index 425ebc5c32aa..fab66396d421 100644
--- a/drivers/staging/android/sync_debug.h
+++ b/drivers/staging/android/sync_debug.h
@@ -34,7 +34,8 @@ struct sync_timeline {
charname[32];
 
/* protected by child_list_lock */
-   int context, value;
+   u64 context;
+   int value;
 
struct list_headchild_list_head;
spinlock_t  child_list_lock;
-- 
2.8.1

-- 
Cheers,
Stephen Rothwell


linux-next: manual merge of the staging tree with the drm tree

2016-06-19 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the staging tree got a conflict in:

  drivers/staging/android/sync.h

between commit:

  76bf0db55439 ("dma-buf/fence: make fence context 64 bit v2")

from the drm tree and commits:

  342952d3a5c4 ("staging/android: remove 'destroyed' member from struct 
sync_timeline")
  1fe82e2e1486 ("staging/android: rename sync.h to sync_debug.h")

from the staging tree.

I fixed it up (I removed the file and applied the following fix patch)
and can carry the fix as necessary. This is now fixed as far as linux-next
is concerned, but any non-trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.

From: Stephen Rothwell 
Date: Mon, 20 Jun 2016 14:28:29 +1000
Subject: [PATCH] staging/android: merge fix up for sync.h renaming

Signed-off-by: Stephen Rothwell 
---
 drivers/staging/android/sync_debug.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/sync_debug.h b/drivers/staging/android/sync_debug.h
index 425ebc5c32aa..fab66396d421 100644
--- a/drivers/staging/android/sync_debug.h
+++ b/drivers/staging/android/sync_debug.h
@@ -34,7 +34,8 @@ struct sync_timeline {
char    name[32];
 
/* protected by child_list_lock */
-   int context, value;
+   u64 context;
+   int value;
 
struct list_head    child_list_head;
spinlock_t  child_list_lock;
-- 
2.8.1

-- 
Cheers,
Stephen Rothwell


Re: [PATCH v7 7/8] perf tools: Check write_backward during evlist config

2016-06-19 Thread Wangnan (F)



On 2016/6/17 5:47, Arnaldo Carvalho de Melo wrote:

Em Wed, Jun 15, 2016 at 02:23:34AM +, Wang Nan escreveu:

Before this patch, when using overwritable ring buffer on an old
kernel, error message is misleading:

  # ~/perf record -m 1 -e raw_syscalls:*/overwrite/ -a
  Error:
  The raw_syscalls:sys_enter event is not supported.

This patch outputs a clear error message to tell the user his/her kernel
is too old:

  # ~/perf record -m 1 -e raw_syscalls:*/overwrite/ -a
  Reading from overwrite event is not supported by this kernel
  Error:
  The raw_syscalls:sys_enter event is not supported.

So I went to see if exposing that missing_features struct outside
evsel.c was strictly needed and found that we already have fallback
handling for this feature (attr.write_backward), i.e. if we set it and
sys_perf_event_open() fails, we will check if we are asking the kernel
for some attr field that it doesn't support, set that missing_features
bit and try again.

But the way this was done for attr.write_backward was buggy, as we need
to check features in the inverse order of their introduction to the
kernel, so that a newer tool checks the newest perf_event_attr fields
first, detecting that the older kernel doesn't support them.
The patch that introduced write_backward support ([1]) in perf_evsel__open()
wrongly did this checking after all the other, older attributes.

[1]: b90dc17a5d14 ("perf evsel: Add overwrite attribute and check 
write_backward")

Also, we shouldn't even try to call sys_perf_event_open() if
perf_missing_features.write_backward is true and evsel->overwrite is
also true; the old code would check this only after successfully opening
the fd, so do it before the open loop.
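The inverse-order requirement described above can be sketched outside the kernel. Below is a hypothetical Python model of the fallback loop — the feature names and the `open_with_fallback()` helper are illustrative, not the real tools/perf code. The point it demonstrates: the kernel only reports EINVAL without saying which attr field was bad, so the tool must blame the newest requested feature first and retry.

```python
# Hypothetical model of perf's missing-feature fallback loop.
# sys_perf_event_open() only says "EINVAL", not which attr field was bad,
# so the tool assumes the *newest* requested feature is the culprit,
# marks it missing, and retries with it cleared.

# Features ordered oldest -> newest, as added to the perf_event_attr ABI
# (illustrative subset).
FEATURES = ["sample_id_all", "exclude_guest", "cloexec", "clockid", "write_backward"]

def open_with_fallback(requested, kernel_supports):
    """Return the list of features the event is finally opened with."""
    missing = set()
    while True:
        attrs = [f for f in FEATURES if f in requested and f not in missing]
        if all(f in kernel_supports for f in attrs):
            return attrs  # the "open" succeeded
        # EINVAL: blame the newest feature still being requested
        for f in reversed(FEATURES):
            if f in attrs:
                missing.add(f)
                break
```

On a kernel without write_backward, a request for {clockid, write_backward} drops write_backward on the first retry and then succeeds; checking oldest-first would wrongly mark clockid as missing first.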

Please take a look at the following patch and see if it is sufficient for
handling older kernels. Probably we need to emit a message to the user,
but that has to be done at the builtin level, i.e. at the tool, i.e.
perf_evsel__open_strerror() should have what it takes to figure out this
extra error and provide a proper string, lemme add this to the patch...
done, please check:

write_backwards_fallback.patch:


[SNIP]

  
@@ -1496,7 +1493,10 @@ try_fallback:

 * Must probe features in the order they were added to the
 * perf_event_attr interface.
 */


I read this comment but misunderstood it. I thought 'order' meant newest last.

Will try your patch. Thank you.


-   if (!perf_missing_features.clockid_wrong && evsel->attr.use_clockid) {
+   if (!perf_missing_features.write_backward && evsel->attr.write_backward) {
+   perf_missing_features.write_backward = true;
+   goto fallback_missing_features;
+   } else if (!perf_missing_features.clockid_wrong && evsel->attr.use_clockid) {
perf_missing_features.clockid_wrong = true;
goto fallback_missing_features;
} else if (!perf_missing_features.clockid && evsel->attr.use_clockid) {
@@ -1521,10 +1521,6 @@ try_fallback:
  PERF_SAMPLE_BRANCH_NO_FLAGS))) {
perf_missing_features.lbr_flags = true;
goto fallback_missing_features;
-   } else if (!perf_missing_features.write_backward &&
-   evsel->attr.write_backward) {
-   perf_missing_features.write_backward = true;
-   goto fallback_missing_features;
}
  
  out_close:

@@ -2409,6 +2405,8 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
"We found oprofile daemon running, please stop it and try again.");
break;
case EINVAL:
+   if (evsel->overwrite && perf_missing_features.write_backward)
+   return scnprintf(msg, size, "Reading from overwrite event is not supported by this kernel.");
if (perf_missing_features.clockid)
return scnprintf(msg, size, "clockid feature not supported.");
if (perf_missing_features.clockid_wrong)





[PATCH v3] mmc: dw_mmc: remove UBSAN warning in dw_mci_setup_bus()

2016-06-19 Thread Seung-Woo Kim
This patch removes the following UBSAN warnings in dw_mci_setup_bus().

  UBSAN: Undefined behaviour in drivers/mmc/host/dw_mmc.c:1102:14
  shift exponent 250 is too large for 32-bit type 'unsigned int'
  Call trace:
  [] dump_backtrace+0x0/0x380
  [] show_stack+0x14/0x20
  [] dump_stack+0xe0/0x120
  [] ubsan_epilogue+0x18/0x68
  [] __ubsan_handle_shift_out_of_bounds+0x18c/0x1bc
  [] dw_mci_setup_bus+0x3a0/0x438
  [...]

  UBSAN: Undefined behaviour in drivers/mmc/host/dw_mmc.c:1132:27
  shift exponent 250 is too large for 32-bit type 'unsigned int'
  Call trace:
  [] dump_backtrace+0x0/0x380
  [] show_stack+0x14/0x20
  [] dump_stack+0xe0/0x120
  [] ubsan_epilogue+0x18/0x68
  [] __ubsan_handle_shift_out_of_bounds+0x18c/0x1bc
  [] dw_mci_setup_bus+0x384/0x438
  [...]

The warnings are caused by a bit shift that was used to filter
console spam for CONFIG_MMC_CLKGATE, but that config has already
been removed. So this patch just removes the shift.
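For reference, the undefined behaviour here is generic C: shifting a 32-bit value by 32 or more is undefined, while the clock divider is an 8-bit field, so `div` can be as large as 255 (UBSAN saw 250 above). A rough Python model follows — the field-width constant is an assumption for illustration, not taken from the driver:

```python
# Model of why `clock << div` trips UBSAN: a 32-bit C shift is only
# defined for exponents 0..31, while an 8-bit clock-divider field can
# hold values up to 255.
INT_BITS = 32
CLKDIV_FIELD_BITS = 8  # assumed width of the divider field

def shift_is_defined(div, width=INT_BITS):
    """True when `x << div` is well-defined C for a `width`-bit int."""
    return 0 <= div < width

def max_divider(field_bits=CLKDIV_FIELD_BITS):
    """Largest value an unsigned field of `field_bits` bits can hold."""
    return (1 << field_bits) - 1
```

With max_divider() far beyond the defined range, any large divider made the old `(clock << div) != slot->__clk_old` comparison undefined, which is why the patch drops the shift entirely.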

Signed-off-by: Seung-Woo Kim 
---
 drivers/mmc/host/dw_mmc.c |   14 +-
 drivers/mmc/host/dw_mmc.h |4 
 2 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
index 2cc6123..bada11e 100644
--- a/drivers/mmc/host/dw_mmc.c
+++ b/drivers/mmc/host/dw_mmc.c
@@ -1099,12 +1099,11 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit)
 
   div = (host->bus_hz != clock) ? DIV_ROUND_UP(div, 2) : 0;
 
-   if ((clock << div) != slot->__clk_old || force_clkinit)
-   dev_info(&slot->mmc->class_dev,
-"Bus speed (slot %d) = %dHz (slot req %dHz, actual %dHZ div = %d)\n",
-slot->id, host->bus_hz, clock,
-div ? ((host->bus_hz / div) >> 1) :
-host->bus_hz, div);
+   dev_info(&slot->mmc->class_dev,
+"Bus speed (slot %d) = %dHz (slot req %dHz, actual %dHZ div = %d)\n",
+slot->id, host->bus_hz, clock,
+div ? ((host->bus_hz / div) >> 1) :
+host->bus_hz, div);
 
/* disable clock */
mci_writel(host, CLKENA, 0);
@@ -1127,9 +1126,6 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit)
 
/* inform CIU */
mci_send_cmd(slot, sdmmc_cmd_bits, 0);
-
-   /* keep the clock with reflecting clock dividor */
-   slot->__clk_old = clock << div;
}
 
host->current_speed = clock;
diff --git a/drivers/mmc/host/dw_mmc.h b/drivers/mmc/host/dw_mmc.h
index 1e8d838..5961037 100644
--- a/drivers/mmc/host/dw_mmc.h
+++ b/drivers/mmc/host/dw_mmc.h
@@ -245,9 +245,6 @@ extern int dw_mci_resume(struct dw_mci *host);
  * @queue_node: List node for placing this node in the @queue list of
  *  dw_mci.
  * @clock: Clock rate configured by set_ios(). Protected by host->lock.
- * @__clk_old: The last updated clock with reflecting clock divider.
- * Keeping track of this helps us to avoid spamming the console
- * with CONFIG_MMC_CLKGATE.
  * @flags: Random state bits associated with the slot.
  * @id: Number of this slot.
  * @sdio_id: Number of this slot in the SDIO interrupt registers.
@@ -262,7 +259,6 @@ struct dw_mci_slot {
struct list_head    queue_node;
 
unsigned int    clock;
-   unsigned int__clk_old;
 
unsigned long   flags;
 #define DW_MMC_CARD_PRESENT 0
-- 
1.7.9.5



Re: [PATCH v2] mfd: intel_soc_pmic_bxtwc: Add Intel BXT WhiskeyCove PMIC ADC thermal channel-zone mapping

2016-06-19 Thread Bin Gao
On Fri, Jun 17, 2016 at 09:01:59AM +0100, Lee Jones wrote:
> > +static struct trip_config_map str0_trip_config[] = {
> > +   {
> > +   .irq_reg = BXTWC_THRM0IRQ,
> > +   .irq_mask = 0x01,
> > +   .irq_en = BXTWC_MTHRM0IRQ,
> > +   .irq_en_mask = 0x01,
> > +   .evt_stat = BXTWC_STHRM0IRQ,
> > +   .evt_mask = 0x01,
> > +   .trip_num = 0
> > +   },
> > +   {
> > +   .irq_reg = BXTWC_THRM0IRQ,
> > +   .irq_mask = 0x10,
> > +   .irq_en = BXTWC_MTHRM0IRQ,
> > +   .irq_en_mask = 0x10,
> > +   .evt_stat = BXTWC_STHRM0IRQ,
> > +   .evt_mask = 0x10,
> > +   .trip_num = 1
> > +   }
> > +};
> > +
> > +static struct trip_config_map str1_trip_config[] = {
> > +   {
> > +   .irq_reg = BXTWC_THRM0IRQ,
> > +   .irq_mask = 0x02,
> > +   .irq_en = BXTWC_MTHRM0IRQ,
> > +   .irq_en_mask = 0x02,
> > +   .evt_stat = BXTWC_STHRM0IRQ,
> > +   .evt_mask = 0x02,
> > +   .trip_num = 0
> > +   },
> > +   {
> > +   .irq_reg = BXTWC_THRM0IRQ,
> > +   .irq_mask = 0x20,
> > +   .irq_en = BXTWC_MTHRM0IRQ,
> > +   .irq_en_mask = 0x20,
> > +   .evt_stat = BXTWC_STHRM0IRQ,
> > +   .evt_mask = 0x20,
> > +   .trip_num = 1
> > +   },
> > +};
> > +
> > +static struct trip_config_map str2_trip_config[] = {
> > +   {
> > +   .irq_reg = BXTWC_THRM0IRQ,
> > +   .irq_mask = 0x04,
> > +   .irq_en = BXTWC_MTHRM0IRQ,
> > +   .irq_en_mask = 0x04,
> > +   .evt_stat = BXTWC_STHRM0IRQ,
> > +   .evt_mask = 0x04,
> > +   .trip_num = 0
> > +   },
> > +   {
> > +   .irq_reg = BXTWC_THRM0IRQ,
> > +   .irq_mask = 0x40,
> > +   .irq_en = BXTWC_MTHRM0IRQ,
> > +   .irq_en_mask = 0x40,
> > +   .evt_stat = BXTWC_STHRM0IRQ,
> > +   .evt_mask = 0x40,
> > +   .trip_num = 1
> > +   },
> > +};
> > +
> > +static struct trip_config_map str3_trip_config[] = {
> > +   {
> > +   .irq_reg = BXTWC_THRM2IRQ,
> > +   .irq_mask = 0x10,
> > +   .irq_en = BXTWC_MTHRM2IRQ,
> > +   .irq_en_mask = 0x10,
> > +   .evt_stat = BXTWC_STHRM2IRQ,
> > +   .evt_mask = 0x10,
> > +   .trip_num = 0
> > +   },
> > +};
> 
> This looks like a register map to me.
> 
> Can you use the regmap framework instead?

These are platform data used by another driver (the thermal driver),
which uses the regmap framework to access some of the fields of the
structure (irq_reg, irq_en and evt_stat).


[PATCH] ARM: AM43XX: hwmod: Fix RSTST register offset for pruss

2016-06-19 Thread Keerthy
The pruss hwmod RSTST register wrongly points to the PWRSTCTRL register in
case of am43xx. Fix the RSTST register offset value.

This can lead to wrong power state values being set for the PER domain.

Fixes: 1c7e224d ("ARM: OMAP2+: hwmod: AM335x: runtime register update")
Signed-off-by: Keerthy 
---
 arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c | 1 +
 arch/arm/mach-omap2/prcm43xx.h  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c b/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c
index 6a73b6c..55c5878 100644
--- a/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c
@@ -1392,6 +1392,7 @@ static void omap_hwmod_am43xx_rst(void)
 {
RSTCTRL(am33xx_pruss_hwmod, AM43XX_RM_PER_RSTCTRL_OFFSET);
RSTCTRL(am33xx_gfx_hwmod, AM43XX_RM_GFX_RSTCTRL_OFFSET);
+   RSTST(am33xx_pruss_hwmod, AM43XX_RM_PER_RSTST_OFFSET);
RSTST(am33xx_gfx_hwmod, AM43XX_RM_GFX_RSTST_OFFSET);
 }
 
diff --git a/arch/arm/mach-omap2/prcm43xx.h b/arch/arm/mach-omap2/prcm43xx.h
index 7c34c44e..babb5db 100644
--- a/arch/arm/mach-omap2/prcm43xx.h
+++ b/arch/arm/mach-omap2/prcm43xx.h
@@ -39,6 +39,7 @@
 
 /* RM RSTST offsets */
 #define AM43XX_RM_GFX_RSTST_OFFSET 0x0014
+#define AM43XX_RM_PER_RSTST_OFFSET 0x0014
 #define AM43XX_RM_WKUP_RSTST_OFFSET    0x0014
 
 /* CM instances */
-- 
1.9.1



[PATCH v3 2/4] arm64: dts: hi6220: Add media subsystem reset dts

2016-06-19 Thread Xinliang Liu
Add media subsystem reset dts support.

Signed-off-by: Chen Feng 
Signed-off-by: Xinliang Liu 
---
 arch/arm64/boot/dts/hisilicon/hi6220.dtsi  | 2 ++
 include/dt-bindings/reset/hisi,hi6220-resets.h | 8 
 2 files changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
index 189d21541f9c..c19b82799a34 100644
--- a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
@@ -5,6 +5,7 @@
  */
 
 #include 
+#include <dt-bindings/reset/hisi,hi6220-resets.h>
 #include 
 #include 
 #include 
@@ -252,6 +253,7 @@
compatible = "hisilicon,hi6220-mediactrl", "syscon";
reg = <0x0 0xf441 0x0 0x1000>;
#clock-cells = <1>;
+   #reset-cells = <1>;
};
 
pm_ctrl: pm_ctrl@f7032000 {
diff --git a/include/dt-bindings/reset/hisi,hi6220-resets.h b/include/dt-bindings/reset/hisi,hi6220-resets.h
index ca08a7e5248e..322ec5335b65 100644
--- a/include/dt-bindings/reset/hisi,hi6220-resets.h
+++ b/include/dt-bindings/reset/hisi,hi6220-resets.h
@@ -64,4 +64,12 @@
 #define PERIPH_RSDIST9_CARM_SOCDBG  0x507
 #define PERIPH_RSDIST9_CARM_ETM 0x508
 
+#define MEDIA_G3D   0
+#define MEDIA_CODEC_VPU 2
+#define MEDIA_CODEC_JPEG    3
+#define MEDIA_ISP   4
+#define MEDIA_ADE   5
+#define MEDIA_MMU   6
+#define MEDIA_XG2RAM1   7
+
 #endif /*_DT_BINDINGS_RESET_CONTROLLER_HI6220*/
-- 
2.8.3



[PATCH v3 1/4] reset: hisilicon: Add media reset controller binding

2016-06-19 Thread Xinliang Liu
Add compatible for media reset controller.

Actually, there are two reset controllers in hi6220 SoC:
The peripheral reset controller bits are part of sysctrl registers.
The media reset controller bits are part of mediactrl registers.
So for the compatible part, it should contain "syscon" for both peripheral
and media reset controller.

Signed-off-by: Xinliang Liu 
---
 Documentation/devicetree/bindings/reset/hisilicon,hi6220-reset.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/reset/hisilicon,hi6220-reset.txt b/Documentation/devicetree/bindings/reset/hisilicon,hi6220-reset.txt
index e0b185a944ba..c25da39df707 100644
--- a/Documentation/devicetree/bindings/reset/hisilicon,hi6220-reset.txt
+++ b/Documentation/devicetree/bindings/reset/hisilicon,hi6220-reset.txt
@@ -8,7 +8,9 @@ The reset controller registers are part of the system-ctl block on
 hi6220 SoC.
 
 Required properties:
-- compatible: may be "hisilicon,hi6220-sysctrl"
+- compatible: should be one of the following:
+  - "hisilicon,hi6220-sysctrl", "syscon" : For peripheral reset controller.
+  - "hisilicon,hi6220-mediactrl", "syscon" : For media reset controller.
 - reg: should be register base and length as documented in the
   datasheet
 - #reset-cells: 1, see below
-- 
2.8.3



Re: [PATCH v2] gpio: add Intel WhiskeyCove GPIO driver

2016-06-19 Thread Bin Gao
> 
> Looks good. I have couple of minor comments, see below.
Thanks for the review again.

> 
> > + * Copyright (C) 2015 Intel Corporation. All rights reserved.
> 
> It is 2016 now isn't it? :-)
Will fix this in v3.

> > +#define DRV_NAME "bxt_wcove_gpio"
> 
> Drop this.
We have _TWO_ places using DRV_NAME (near the end of the file):
static struct platform_driver wcove_gpio_driver = {
   .driver = {
   .name = DRV_NAME,
   },

and

MODULE_ALIAS("platform:" DRV_NAME);

You are suggesting to replace DRV_NAME with bxt_wcove_gpio (but why?),
or with something else?

> > + * struct wcove_gpio - Whiskey Cove GPIO controller
> > + * @buslock: for bus lock/sync and unlock.
> > + * @chip: the abstract gpio_chip structure.
> > + * @regmap: the regmap from the parent device.
> 
> Missing kernel-doc for regmap_irq_chip.
Will fix this in v3.

> > +static void wcove_update_irq_mask(struct wcove_gpio *wg,
> > +   int gpio)
> 
> Does this fit into 80 chars?
Yes, it fits into 80 chars. Will fix in v3.



[PATCH v3 3/4] reset: hisilicon: Change to syscon register access

2016-06-19 Thread Xinliang Liu
From: Chen Feng 

There are two reset controllers in hi6220 SoC:
The peripheral reset controller bits are part of sysctrl registers.
The media reset controller bits are part of mediactrl registers.

So change the register access to the syscon way,
and rename the current reset controller to the peripheral one.
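The index encoding the driver keeps using is worth spelling out: each reset line is `(bank << 8) | bit`, and each bank's assert/deassert registers sit 0x10 apart. Here is a small sketch of that arithmetic — a model of the computation in the patch below, not the driver itself:

```python
# Model of hi6220 reset index decoding, mirroring the driver's
#   bank = idx >> 8; offset = idx & 0xff;
#   reg  = PERIPH_ASSERT_OFFSET + bank * 0x10;  then write BIT(offset)
PERIPH_ASSERT_OFFSET = 0x300
PERIPH_DEASSERT_OFFSET = 0x304
BANK_STRIDE = 0x10

def decode(idx):
    """Split a reset index into (bank, bit offset)."""
    return idx >> 8, idx & 0xFF

def assert_write(idx):
    """Return (register offset, value) for asserting reset line `idx`."""
    bank, offset = decode(idx)
    return PERIPH_ASSERT_OFFSET + bank * BANK_STRIDE, 1 << offset

def deassert_write(idx):
    """Return (register offset, value) for deasserting reset line `idx`."""
    bank, offset = decode(idx)
    return PERIPH_DEASSERT_OFFSET + bank * BANK_STRIDE, 1 << offset
```

For example, an index of 0x508 (bank 5, bit 8) asserts by writing BIT(8) at offset 0x350.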

Signed-off-by: Chen Feng 
Signed-off-by: Xia Qing 
Signed-off-by: Xinliang Liu 
---
 drivers/reset/hisilicon/hi6220_reset.c | 85 ++
 1 file changed, 45 insertions(+), 40 deletions(-)

diff --git a/drivers/reset/hisilicon/hi6220_reset.c b/drivers/reset/hisilicon/hi6220_reset.c
index 8f55fd4a2630..686fea9e2c54 100644
--- a/drivers/reset/hisilicon/hi6220_reset.c
+++ b/drivers/reset/hisilicon/hi6220_reset.c
@@ -1,7 +1,8 @@
 /*
  * Hisilicon Hi6220 reset controller driver
  *
- * Copyright (c) 2015 Hisilicon Limited.
+ * Copyright (c) 2016 Linaro Limited.
+ * Copyright (c) 2015-2016 Hisilicon Limited.
  *
  * Author: Feng Chen 
  *
@@ -15,81 +16,85 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
 
-#define ASSERT_OFFSET 0x300
-#define DEASSERT_OFFSET  0x304
-#define MAX_INDEX 0x509
+#define PERIPH_ASSERT_OFFSET  0x300
+#define PERIPH_DEASSERT_OFFSET0x304
+#define PERIPH_MAX_INDEX  0x509
 
 #define to_reset_data(x) container_of(x, struct hi6220_reset_data, rc_dev)
 
 struct hi6220_reset_data {
-   void __iomem *assert_base;
-   void __iomem *deassert_base;
-   struct reset_controller_dev rc_dev;
+   struct reset_controller_dev rc_dev;
+   struct regmap *regmap;
 };
 
-static int hi6220_reset_assert(struct reset_controller_dev *rc_dev,
-  unsigned long idx)
+static int hi6220_peripheral_assert(struct reset_controller_dev *rc_dev,
+   unsigned long idx)
 {
struct hi6220_reset_data *data = to_reset_data(rc_dev);
+   struct regmap *regmap = data->regmap;
+   u32 bank = idx >> 8;
+   u32 offset = idx & 0xff;
+   u32 reg = PERIPH_ASSERT_OFFSET + bank * 0x10;
 
-   int bank = idx >> 8;
-   int offset = idx & 0xff;
-
-   writel(BIT(offset), data->assert_base + (bank * 0x10));
-
-   return 0;
+   return regmap_write(regmap, reg, BIT(offset));
 }
 
-static int hi6220_reset_deassert(struct reset_controller_dev *rc_dev,
-unsigned long idx)
+static int hi6220_peripheral_deassert(struct reset_controller_dev *rc_dev,
+ unsigned long idx)
 {
struct hi6220_reset_data *data = to_reset_data(rc_dev);
+   struct regmap *regmap = data->regmap;
+   u32 bank = idx >> 8;
+   u32 offset = idx & 0xff;
+   u32 reg = PERIPH_DEASSERT_OFFSET + bank * 0x10;
 
-   int bank = idx >> 8;
-   int offset = idx & 0xff;
-
-   writel(BIT(offset), data->deassert_base + (bank * 0x10));
-
-   return 0;
+   return regmap_write(regmap, reg, BIT(offset));
 }
 
-static const struct reset_control_ops hi6220_reset_ops = {
-   .assert = hi6220_reset_assert,
-   .deassert = hi6220_reset_deassert,
+static const struct reset_control_ops hi6220_peripheral_reset_ops = {
+   .assert = hi6220_peripheral_assert,
+   .deassert = hi6220_peripheral_deassert,
 };
 
 static int hi6220_reset_probe(struct platform_device *pdev)
 {
+   struct device_node *np = pdev->dev.of_node;
+   struct device *dev = &pdev->dev;
struct hi6220_reset_data *data;
-   struct resource *res;
-   void __iomem *src_base;
+   struct regmap *regmap;
 
-   data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
+   data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
if (!data)
return -ENOMEM;
 
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   src_base = devm_ioremap_resource(&pdev->dev, res);
-   if (IS_ERR(src_base))
-   return PTR_ERR(src_base);
+   regmap = syscon_node_to_regmap(np);
+   if (IS_ERR(regmap)) {
+   dev_err(dev, "failed to get reset controller regmap\n");
+   return PTR_ERR(regmap);
+   }
 
-   data->assert_base = src_base + ASSERT_OFFSET;
-   data->deassert_base = src_base + DEASSERT_OFFSET;
-   data->rc_dev.nr_resets = MAX_INDEX;
-   data->rc_dev.ops = &hi6220_reset_ops;
-   data->rc_dev.of_node = pdev->dev.of_node;
+   data->regmap = regmap;
+   data->rc_dev.of_node = np;
+   data->rc_dev.ops = &hi6220_peripheral_reset_ops;
+   data->rc_dev.nr_resets = PERIPH_MAX_INDEX;
 
return reset_controller_register(&data->rc_dev);
 }
 
 static const struct of_device_id hi6220_reset_match[] = {
-   { .compatible = "hisilicon,hi6220-sysctrl" },
-   { },
+   {
+   .compatible = "hisilicon,hi6220-sysctrl",

[PATCH v3 0/4] Add hi6220 media subsystem reset controller driver

2016-06-19 Thread Xinliang Liu
This patch set adds support for HiSilicon hi6220 SoC media subsystem
reset controller.

Change history:
v3:
- Split regmap register access change and mediactrl support.

v2:
- Update binding document for media reset controller.
- Separate peripheral and media reset controller ops.

Chen Feng (1):
  reset: hisilicon: Change to syscon register access

Xinliang Liu (3):
  reset: hisilicon: Add media reset controller binding
  arm64: dts: hi6220: Add media subsystem reset dts
  reset: hisilicon: Add hi6220 media subsystem reset support

 .../bindings/reset/hisilicon,hi6220-reset.txt  |   4 +-
 arch/arm64/boot/dts/hisilicon/hi6220.dtsi  |   2 +
 drivers/reset/hisilicon/hi6220_reset.c | 122 +++--
 include/dt-bindings/reset/hisi,hi6220-resets.h |   8 ++
 4 files changed, 99 insertions(+), 37 deletions(-)

-- 
2.8.3



[PATCH v3 4/4] reset: hisilicon: Add hi6220 media subsystem reset support

2016-06-19 Thread Xinliang Liu
Add hi6220 media subsystem reset controller.

Signed-off-by: Chen Feng 
Signed-off-by: Xia Qing 
Signed-off-by: Xinliang Liu 
---
 drivers/reset/hisilicon/hi6220_reset.c | 49 --
 1 file changed, 47 insertions(+), 2 deletions(-)

diff --git a/drivers/reset/hisilicon/hi6220_reset.c b/drivers/reset/hisilicon/hi6220_reset.c
index 686fea9e2c54..35ce53edabf9 100644
--- a/drivers/reset/hisilicon/hi6220_reset.c
+++ b/drivers/reset/hisilicon/hi6220_reset.c
@@ -27,8 +27,17 @@
 #define PERIPH_DEASSERT_OFFSET 0x304
 #define PERIPH_MAX_INDEX       0x509
 
+#define SC_MEDIA_RSTEN    0x052C
+#define SC_MEDIA_RSTDIS   0x0530
+#define MEDIA_MAX_INDEX   8
+
 #define to_reset_data(x) container_of(x, struct hi6220_reset_data, rc_dev)
 
+enum hi6220_reset_ctrl_type {
+   PERIPHERAL,
+   MEDIA,
+};
+
 struct hi6220_reset_data {
struct reset_controller_dev rc_dev;
struct regmap *regmap;
@@ -63,10 +72,34 @@ static const struct reset_control_ops hi6220_peripheral_reset_ops = {
.deassert = hi6220_peripheral_deassert,
 };
 
+static int hi6220_media_assert(struct reset_controller_dev *rc_dev,
+  unsigned long idx)
+{
+   struct hi6220_reset_data *data = to_reset_data(rc_dev);
+   struct regmap *regmap = data->regmap;
+
+   return regmap_write(regmap, SC_MEDIA_RSTEN, BIT(idx));
+}
+
+static int hi6220_media_deassert(struct reset_controller_dev *rc_dev,
+unsigned long idx)
+{
+   struct hi6220_reset_data *data = to_reset_data(rc_dev);
+   struct regmap *regmap = data->regmap;
+
+   return regmap_write(regmap, SC_MEDIA_RSTDIS, BIT(idx));
+}
+
+static const struct reset_control_ops hi6220_media_reset_ops = {
+   .assert = hi6220_media_assert,
+   .deassert = hi6220_media_deassert,
+};
+
 static int hi6220_reset_probe(struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
struct device *dev = &pdev->dev;
+   enum hi6220_reset_ctrl_type type;
struct hi6220_reset_data *data;
struct regmap *regmap;
 
@@ -74,6 +107,8 @@ static int hi6220_reset_probe(struct platform_device *pdev)
if (!data)
return -ENOMEM;
 
+   type = (enum hi6220_reset_ctrl_type)of_device_get_match_data(dev);
+
regmap = syscon_node_to_regmap(np);
if (IS_ERR(regmap)) {
dev_err(dev, "failed to get reset controller regmap\n");
@@ -82,8 +117,13 @@ static int hi6220_reset_probe(struct platform_device *pdev)
 
data->regmap = regmap;
data->rc_dev.of_node = np;
-   data->rc_dev.ops = &hi6220_peripheral_reset_ops;
-   data->rc_dev.nr_resets = PERIPH_MAX_INDEX;
+   if (type == MEDIA) {
+   data->rc_dev.ops = &hi6220_media_reset_ops;
+   data->rc_dev.nr_resets = MEDIA_MAX_INDEX;
+   } else {
+   data->rc_dev.ops = _peripheral_reset_ops;
+   data->rc_dev.nr_resets = PERIPH_MAX_INDEX;
+   }
 
return reset_controller_register(&data->rc_dev);
 }
@@ -91,6 +131,11 @@ static int hi6220_reset_probe(struct platform_device *pdev)
 static const struct of_device_id hi6220_reset_match[] = {
{
.compatible = "hisilicon,hi6220-sysctrl",
+   .data = (void *)PERIPHERAL,
+   },
+   {
+   .compatible = "hisilicon,hi6220-mediactrl",
+   .data = (void *)MEDIA,
},
{ /* sentinel */ },
 };
-- 
2.8.3


