[PATCH 00/11] drm/vc4: DSI panel support + Raspberry Pi touchscreen

2016-12-14 Thread Eric Anholt
After 9 months of development, I finally got the DSI panel to light up
from poweron, and this is the series for enabling it.  I've included
the whole thing Cced to all the subsystems, as there are some
interesting choices we have to make (how the analog PHY clocks are
represented in common clocks¸ and how the touchscreen panel is
represented in DT).

There's one commit left out of the series, which is the fix to allow
bcm2835_dma.c's poll-for-completion from IRQ handlers for the DSI1
register write workaround.  However, with the panel's DSI transaction
workaround, we're not triggering our IRQ handler anyway.

Note that patch #11 is not intended to be pushed, it's just a demo.

Eric Anholt (11):
  clk: bcm2835: Don't rate change PLLs on behalf of DSI PLL dividers.
  clk: bcm2835: Register the DSI0/DSI1 pixel clocks.
  clk: bcm2835: Add leaf clock measurement support, disabled by default
  drm/vc4: Set up SCALER_DISPCTRL at boot.
  drm/vc4: Add support for feeding DSI encoders from the pixel valve.
  dt-bindings: Document the VC4 DSI module nodes.
  drm/vc4: Add DSI driver
  dt-bindings: Document the Raspberry Pi Touchscreen nodes.
  drm/panel: Add support for the Raspberry Pi 7" Touchscreen.
  ARM: bcm2835: dt: Add the DSI module nodes and clocks.
  ARM: bcm2835: Enable the Raspberry Pi touchscreen panel.

 .../bindings/clock/brcm,bcm2835-cprman.txt |   15 +-
 .../devicetree/bindings/display/brcm,bcm-vc4.txt   |   35 +
 .../display/panel/raspberrypi,touchscreen.txt  |   45 +
 arch/arm/boot/dts/bcm2835-rpi.dtsi |8 +
 arch/arm/boot/dts/bcm283x.dtsi |   77 +-
 drivers/clk/bcm/clk-bcm2835.c  |  302 +++-
 drivers/gpu/drm/panel/Kconfig  |8 +
 drivers/gpu/drm/panel/Makefile |1 +
 .../gpu/drm/panel/panel-raspberrypi-touchscreen.c  |  509 ++
 drivers/gpu/drm/vc4/Kconfig|2 +
 drivers/gpu/drm/vc4/Makefile   |1 +
 drivers/gpu/drm/vc4/vc4_crtc.c |   33 +-
 drivers/gpu/drm/vc4/vc4_debugfs.c  |1 +
 drivers/gpu/drm/vc4/vc4_drv.c  |1 +
 drivers/gpu/drm/vc4/vc4_drv.h  |5 +
 drivers/gpu/drm/vc4/vc4_dsi.c  | 1725 
 drivers/gpu/drm/vc4/vc4_hvs.c  |   14 +
 drivers/gpu/drm/vc4/vc4_regs.h |5 +
 include/dt-bindings/clock/bcm2835.h|2 +
 19 files changed, 2722 insertions(+), 67 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/panel/raspberrypi,touchscreen.txt
 create mode 100644 drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c
 create mode 100644 drivers/gpu/drm/vc4/vc4_dsi.c

-- 
2.11.0



[PATCH 04/11] drm/vc4: Set up SCALER_DISPCTRL at boot.

2016-12-14 Thread Eric Anholt
We want the HVS on, obviously, and we also want DSP3 (PV1's source) to
be muxed from HVS channel 2 like we expect in vc4_crtc.c.  The
firmware wasn't setting the DSP3 mux up when both the LCD and HDMI
were disabled.

Signed-off-by: Eric Anholt 
---
 drivers/gpu/drm/vc4/vc4_hvs.c  | 14 ++
 drivers/gpu/drm/vc4/vc4_regs.h |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 6fbab1c82cb1..fc68b1b4da52 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -170,6 +170,7 @@ static int vc4_hvs_bind(struct device *dev, struct device 
*master, void *data)
struct vc4_dev *vc4 = drm->dev_private;
struct vc4_hvs *hvs = NULL;
int ret;
+   u32 dispctrl;
 
hvs = devm_kzalloc(>dev, sizeof(*hvs), GFP_KERNEL);
if (!hvs)
@@ -211,6 +212,19 @@ static int vc4_hvs_bind(struct device *dev, struct device 
*master, void *data)
return ret;
 
vc4->hvs = hvs;
+
+   dispctrl = HVS_READ(SCALER_DISPCTRL);
+
+   dispctrl |= SCALER_DISPCTRL_ENABLE;
+
+   /* Set DSP3 (PV1) to use HVS channel 2, which would otherwise
+* be unused.
+*/
+   dispctrl &= ~SCALER_DISPCTRL_DSP3_MUX_MASK;
+   dispctrl |= VC4_SET_FIELD(2, SCALER_DISPCTRL_DSP3_MUX);
+
+   HVS_WRITE(SCALER_DISPCTRL, dispctrl);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index 39f6886b2410..b3b297fba709 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -244,6 +244,9 @@
 # define SCALER_DISPCTRL_ENABLEBIT(31)
 # define SCALER_DISPCTRL_DSP2EISLURBIT(15)
 # define SCALER_DISPCTRL_DSP1EISLURBIT(14)
+# define SCALER_DISPCTRL_DSP3_MUX_MASK VC4_MASK(19, 18)
+# define SCALER_DISPCTRL_DSP3_MUX_SHIFT18
+
 /* Enables Display 0 short line and underrun contribution to
  * SCALER_DISPSTAT_IRQDISP0.  Note that short frame contributions are
  * always enabled.
-- 
2.11.0



[PATCH 00/11] drm/vc4: DSI panel support + Raspberry Pi touchscreen

2016-12-14 Thread Eric Anholt
After 9 months of development, I finally got the DSI panel to light up
from poweron, and this is the series for enabling it.  I've included
the whole thing Cced to all the subsystems, as there are some
interesting choices we have to make (how the analog PHY clocks are
represented in common clocks¸ and how the touchscreen panel is
represented in DT).

There's one commit left out of the series, which is the fix to allow
bcm2835_dma.c's poll-for-completion from IRQ handlers for the DSI1
register write workaround.  However, with the panel's DSI transaction
workaround, we're not triggering our IRQ handler anyway.

Note that patch #11 is not intended to be pushed, it's just a demo.

Eric Anholt (11):
  clk: bcm2835: Don't rate change PLLs on behalf of DSI PLL dividers.
  clk: bcm2835: Register the DSI0/DSI1 pixel clocks.
  clk: bcm2835: Add leaf clock measurement support, disabled by default
  drm/vc4: Set up SCALER_DISPCTRL at boot.
  drm/vc4: Add support for feeding DSI encoders from the pixel valve.
  dt-bindings: Document the VC4 DSI module nodes.
  drm/vc4: Add DSI driver
  dt-bindings: Document the Raspberry Pi Touchscreen nodes.
  drm/panel: Add support for the Raspberry Pi 7" Touchscreen.
  ARM: bcm2835: dt: Add the DSI module nodes and clocks.
  ARM: bcm2835: Enable the Raspberry Pi touchscreen panel.

 .../bindings/clock/brcm,bcm2835-cprman.txt |   15 +-
 .../devicetree/bindings/display/brcm,bcm-vc4.txt   |   35 +
 .../display/panel/raspberrypi,touchscreen.txt  |   45 +
 arch/arm/boot/dts/bcm2835-rpi.dtsi |8 +
 arch/arm/boot/dts/bcm283x.dtsi |   77 +-
 drivers/clk/bcm/clk-bcm2835.c  |  302 +++-
 drivers/gpu/drm/panel/Kconfig  |8 +
 drivers/gpu/drm/panel/Makefile |1 +
 .../gpu/drm/panel/panel-raspberrypi-touchscreen.c  |  509 ++
 drivers/gpu/drm/vc4/Kconfig|2 +
 drivers/gpu/drm/vc4/Makefile   |1 +
 drivers/gpu/drm/vc4/vc4_crtc.c |   33 +-
 drivers/gpu/drm/vc4/vc4_debugfs.c  |1 +
 drivers/gpu/drm/vc4/vc4_drv.c  |1 +
 drivers/gpu/drm/vc4/vc4_drv.h  |5 +
 drivers/gpu/drm/vc4/vc4_dsi.c  | 1725 
 drivers/gpu/drm/vc4/vc4_hvs.c  |   14 +
 drivers/gpu/drm/vc4/vc4_regs.h |5 +
 include/dt-bindings/clock/bcm2835.h|2 +
 19 files changed, 2722 insertions(+), 67 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/panel/raspberrypi,touchscreen.txt
 create mode 100644 drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c
 create mode 100644 drivers/gpu/drm/vc4/vc4_dsi.c

-- 
2.11.0



[PATCH 04/11] drm/vc4: Set up SCALER_DISPCTRL at boot.

2016-12-14 Thread Eric Anholt
We want the HVS on, obviously, and we also want DSP3 (PV1's source) to
be muxed from HVS channel 2 like we expect in vc4_crtc.c.  The
firmware wasn't setting the DSP3 mux up when both the LCD and HDMI
were disabled.

Signed-off-by: Eric Anholt 
---
 drivers/gpu/drm/vc4/vc4_hvs.c  | 14 ++
 drivers/gpu/drm/vc4/vc4_regs.h |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 6fbab1c82cb1..fc68b1b4da52 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -170,6 +170,7 @@ static int vc4_hvs_bind(struct device *dev, struct device 
*master, void *data)
struct vc4_dev *vc4 = drm->dev_private;
struct vc4_hvs *hvs = NULL;
int ret;
+   u32 dispctrl;
 
hvs = devm_kzalloc(>dev, sizeof(*hvs), GFP_KERNEL);
if (!hvs)
@@ -211,6 +212,19 @@ static int vc4_hvs_bind(struct device *dev, struct device 
*master, void *data)
return ret;
 
vc4->hvs = hvs;
+
+   dispctrl = HVS_READ(SCALER_DISPCTRL);
+
+   dispctrl |= SCALER_DISPCTRL_ENABLE;
+
+   /* Set DSP3 (PV1) to use HVS channel 2, which would otherwise
+* be unused.
+*/
+   dispctrl &= ~SCALER_DISPCTRL_DSP3_MUX_MASK;
+   dispctrl |= VC4_SET_FIELD(2, SCALER_DISPCTRL_DSP3_MUX);
+
+   HVS_WRITE(SCALER_DISPCTRL, dispctrl);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index 39f6886b2410..b3b297fba709 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -244,6 +244,9 @@
 # define SCALER_DISPCTRL_ENABLEBIT(31)
 # define SCALER_DISPCTRL_DSP2EISLURBIT(15)
 # define SCALER_DISPCTRL_DSP1EISLURBIT(14)
+# define SCALER_DISPCTRL_DSP3_MUX_MASK VC4_MASK(19, 18)
+# define SCALER_DISPCTRL_DSP3_MUX_SHIFT18
+
 /* Enables Display 0 short line and underrun contribution to
  * SCALER_DISPSTAT_IRQDISP0.  Note that short frame contributions are
  * always enabled.
-- 
2.11.0



Re: [PATCH 1/1] infiniband: hw: cxgb4: set errno on failure

2016-12-14 Thread Doug Ledford
On 12/3/2016 8:04 AM, Pan Bian wrote:
> From: Pan Bian 
> 
> In function c4iw_rdev_open(), the value of return variable err should be
> negative on errors. However, when the call to __get_free_page() returns
> a NULL pointer, its value is not set to "-ENOMEM" and keeps 0. 0 means
> no error. And thus, the behavior of its caller may be misled. This patch
> fixes the bug.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188821
> 
> Signed-off-by: Pan Bian 
> ---
>  drivers/infiniband/hw/cxgb4/device.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/cxgb4/device.c 
> b/drivers/infiniband/hw/cxgb4/device.c
> index 93e3d27..b99dc9e 100644
> --- a/drivers/infiniband/hw/cxgb4/device.c
> +++ b/drivers/infiniband/hw/cxgb4/device.c
> @@ -828,8 +828,10 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
>   }
>   rdev->status_page = (struct t4_dev_status_page *)
>   __get_free_page(GFP_KERNEL);
> - if (!rdev->status_page)
> + if (!rdev->status_page) {
> + err = -ENOMEM;
>   goto destroy_ocqp_pool;
> + }
>   rdev->status_page->qp_start = rdev->lldi.vr->qp.start;
>   rdev->status_page->qp_size = rdev->lldi.vr->qp.size;
>   rdev->status_page->cq_start = rdev->lldi.vr->cq.start;
> 

This fix was previously submitted by Wei Yongjun.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 1/1] infiniband: hw: cxgb4: set errno on failure

2016-12-14 Thread Doug Ledford
On 12/3/2016 8:04 AM, Pan Bian wrote:
> From: Pan Bian 
> 
> In function c4iw_rdev_open(), the value of return variable err should be
> negative on errors. However, when the call to __get_free_page() returns
> a NULL pointer, its value is not set to "-ENOMEM" and keeps 0. 0 means
> no error. And thus, the behavior of its caller may be misled. This patch
> fixes the bug.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188821
> 
> Signed-off-by: Pan Bian 
> ---
>  drivers/infiniband/hw/cxgb4/device.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/cxgb4/device.c 
> b/drivers/infiniband/hw/cxgb4/device.c
> index 93e3d27..b99dc9e 100644
> --- a/drivers/infiniband/hw/cxgb4/device.c
> +++ b/drivers/infiniband/hw/cxgb4/device.c
> @@ -828,8 +828,10 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
>   }
>   rdev->status_page = (struct t4_dev_status_page *)
>   __get_free_page(GFP_KERNEL);
> - if (!rdev->status_page)
> + if (!rdev->status_page) {
> + err = -ENOMEM;
>   goto destroy_ocqp_pool;
> + }
>   rdev->status_page->qp_start = rdev->lldi.vr->qp.start;
>   rdev->status_page->qp_size = rdev->lldi.vr->qp.size;
>   rdev->status_page->cq_start = rdev->lldi.vr->cq.start;
> 

This fix was previously submitted by Wei Yongjun.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


[PATCH] vfio-mdev: Fix mtty sample driver building

2016-12-14 Thread Alex Williamson
This sample driver was originally under Documentation/ and was moved
to samples, but build support was never adjusted for the new location.

Signed-off-by: Alex Williamson 
---
 samples/Kconfig|7 +++
 samples/Makefile   |3 ++-
 samples/vfio-mdev/Makefile |   14 +-
 3 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/samples/Kconfig b/samples/Kconfig
index a6d2a43..b124f62 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -105,4 +105,11 @@ config SAMPLE_BLACKFIN_GPTIMERS
help
  Build samples of blackfin gptimers sample module.
 
+config SAMPLE_VFIO_MDEV_MTTY
+   tristate "Build VFIO mtty example mediated device sample code -- 
loadable modules only"
+   depends on VFIO_MDEV_DEVICE && m
+   help
+ Build a virtual tty sample driver for use as a VFIO
+ mediated device
+
 endif # SAMPLES
diff --git a/samples/Makefile b/samples/Makefile
index e17d66d..86a137e 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -2,4 +2,5 @@
 
 obj-$(CONFIG_SAMPLES)  += kobject/ kprobes/ trace_events/ livepatch/ \
   hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/ \
-  configfs/ connector/ v4l/ trace_printk/ blackfin/
+  configfs/ connector/ v4l/ trace_printk/ blackfin/ \
+  vfio-mdev/
diff --git a/samples/vfio-mdev/Makefile b/samples/vfio-mdev/Makefile
index a932edb..cbbd868 100644
--- a/samples/vfio-mdev/Makefile
+++ b/samples/vfio-mdev/Makefile
@@ -1,13 +1 @@
-#
-# Makefile for mtty.c file
-#
-KERNEL_DIR:=/lib/modules/$(shell uname -r)/build
-
-obj-m:=mtty.o
-
-modules clean modules_install:
-   $(MAKE) -C $(KERNEL_DIR) SUBDIRS=$(PWD) $@
-
-default: modules
-
-module: modules
+obj-$(CONFIG_SAMPLE_VFIO_MDEV_MTTY) += mtty.o



[PATCH] vfio-mdev: Fix mtty sample driver building

2016-12-14 Thread Alex Williamson
This sample driver was originally under Documentation/ and was moved
to samples, but build support was never adjusted for the new location.

Signed-off-by: Alex Williamson 
---
 samples/Kconfig|7 +++
 samples/Makefile   |3 ++-
 samples/vfio-mdev/Makefile |   14 +-
 3 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/samples/Kconfig b/samples/Kconfig
index a6d2a43..b124f62 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -105,4 +105,11 @@ config SAMPLE_BLACKFIN_GPTIMERS
help
  Build samples of blackfin gptimers sample module.
 
+config SAMPLE_VFIO_MDEV_MTTY
+   tristate "Build VFIO mtty example mediated device sample code -- 
loadable modules only"
+   depends on VFIO_MDEV_DEVICE && m
+   help
+ Build a virtual tty sample driver for use as a VFIO
+ mediated device
+
 endif # SAMPLES
diff --git a/samples/Makefile b/samples/Makefile
index e17d66d..86a137e 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -2,4 +2,5 @@
 
 obj-$(CONFIG_SAMPLES)  += kobject/ kprobes/ trace_events/ livepatch/ \
   hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/ \
-  configfs/ connector/ v4l/ trace_printk/ blackfin/
+  configfs/ connector/ v4l/ trace_printk/ blackfin/ \
+  vfio-mdev/
diff --git a/samples/vfio-mdev/Makefile b/samples/vfio-mdev/Makefile
index a932edb..cbbd868 100644
--- a/samples/vfio-mdev/Makefile
+++ b/samples/vfio-mdev/Makefile
@@ -1,13 +1 @@
-#
-# Makefile for mtty.c file
-#
-KERNEL_DIR:=/lib/modules/$(shell uname -r)/build
-
-obj-m:=mtty.o
-
-modules clean modules_install:
-   $(MAKE) -C $(KERNEL_DIR) SUBDIRS=$(PWD) $@
-
-default: modules
-
-module: modules
+obj-$(CONFIG_SAMPLE_VFIO_MDEV_MTTY) += mtty.o



Re: [PATCH v2 3/4] secure_seq: use siphash24 instead of md5_transform

2016-12-14 Thread Jason A. Donenfeld
Hi Hannes,

On Wed, Dec 14, 2016 at 8:22 PM, Hannes Frederic Sowa
 wrote:
> I don't think this helps. Did you test it? I don't see reason why
> padding could be left out between `d' and `end' because of the flexible
> array member?

Because the type u8 doesn't require any alignment requirements, it can
nestle right up there cozy with the u16:

zx2c4@thinkpad ~ $ cat a.c
#include 
#include 
#include 
int main()
{
   struct {
   uint64_t a;
   uint32_t b;
   uint32_t c;
   uint16_t d;
   char x[];
   } a;
   printf("%zu\n", sizeof(a));
   printf("%zu\n", offsetof(typeof(a), x));
   return 0;
}
zx2c4@thinkpad ~ $ gcc a.c
zx2c4@thinkpad ~ $ ./a.out
24
18

Jason


Re: [PATCH v2 3/4] secure_seq: use siphash24 instead of md5_transform

2016-12-14 Thread Jason A. Donenfeld
Hi Hannes,

On Wed, Dec 14, 2016 at 8:22 PM, Hannes Frederic Sowa
 wrote:
> I don't think this helps. Did you test it? I don't see reason why
> padding could be left out between `d' and `end' because of the flexible
> array member?

Because the type u8 doesn't require any alignment requirements, it can
nestle right up there cozy with the u16:

zx2c4@thinkpad ~ $ cat a.c
#include 
#include 
#include 
int main()
{
   struct {
   uint64_t a;
   uint32_t b;
   uint32_t c;
   uint16_t d;
   char x[];
   } a;
   printf("%zu\n", sizeof(a));
   printf("%zu\n", offsetof(typeof(a), x));
   return 0;
}
zx2c4@thinkpad ~ $ gcc a.c
zx2c4@thinkpad ~ $ ./a.out
24
18

Jason


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
Hi Tom,

On Wed, Dec 14, 2016 at 8:18 PM, Tom Herbert  wrote:
> "super fast" is relative. My quick test shows that this faster than
> Toeplitz (good, but not exactly hard to achieve), but is about 4x
> slower than jhash.

Fast relative to other cryptographically secure PRFs.

>> SipHash isn't just some new trendy hash function. It's been around for a
>> while, and there really isn't anything that comes remotely close to
>> being useful in the way SipHash is. With that said, why do we need this?
> I don't think we need advertising nor a lesson on hashing. It would be
> much more useful if you just point us to the paper on siphash (which I
> assume I http://cr.yp.to/siphash/siphash-20120918.pdf ?).

Ugh. Sorry. It definitely wasn't my intention to give an uninvited
lesson or an annoying advert. For the former, I didn't want to make
any expectations about fields of knowledge, because I honest have no
idea. For the latter, I wrote that sentence to indicate that siphash
isn't just some newfangled hipster function, but something useful and
well established. I didn't mean it as a form of advertising. My
apologies if I've offended your sensibilities.

That cr.yp.to link is fine, or https://131002.net/siphash/siphash.pdf I believe.

> Key rotation is important anyway, without any key rotation even if the
> key is compromised in siphash by some external means we would have an
> insecure hash until the system reboots.

I'm a bit surprised to read this. I've never designed a system to be
secure even in the event of remote arbitrary kernel memory disclosure,
and I wasn't aware this was generally considered an architectural
requirement or Linux.

In any case, if you want this, I suppose you can have it with siphash too.

> Maybe so, but we need to do due diligence before considering adopting
> siphash as the primary hashing in the network stack. Consider that we
> may very well perform a hash over L4 tuples on _every_ packet. We've
> done a good job at limiting this to be at most one hash per packet,
> but nevertheless the performance of the hash function must be take
> into account.

I agree with you. It seems like each case is going to needed to be
measured on a case by case basis. In this series I make the first use
of siphash in the secure sequence generation and get_random_int/long,
where siphash replaces md5, so there's a pretty clear performance in.
But for the jhash replacements indeed things are going to need to be
individually evaluated.

> 1) My quick test shows siphash is about four times more expensive than
> jhash. On my test system, computing a hash over IPv4 tuple (two 32 bit
> addresses and 2 16 bit source ports) is 6.9 nsecs in Jenkins hash, 33
> nsecs with siphash. Given that we have eliminated most of the packet
> header hashes this might be tolerable, but still should be looking at
> ways to optimize.
> 2) I like moving to use u64 (quad words) in the hash, this is an
> improvement over Jenkins which is based on 32 bit words. If we put
> this in the kernel we probably want to have several variants of
> siphash for specific sizes (e.g. siphash1, siphash2, siphash2,
> siphashn for hash over one, two, three, or n sixty four bit words).

I think your suggestion for (2) will contribute to further
optimizations for (1). In v2, I had another patch in there adding
siphash_1word, siphash_2words, etc, like jhash, but I implemented it
by taking u32 variables and then just concatenating these into a
buffer and passing them to the main siphash function. I removed it
from v3 because I thought that these kind of missed the whole point.
In particular:

a) siphash24_1word, siphash24_2words, siphash24_3words, etc should
take u64, not u32, since that's what siphash operates on natively
b) Rather than concatenating them in a buffer, I should write
specializations of the siphash24 function _especially_ for these size
inputs to avoid the copy and to reduce the book keeping.

I'll add these functions to v4 implemented like that.

Thanks for the useful feedback and benchmarks!

Jason


Re: [PATCH 1/1] infiniband: hw: mlx4: fix improper return value

2016-12-14 Thread Doug Ledford
On 12/4/2016 1:45 AM, Pan Bian wrote:
> From: Pan Bian 
> 
> If uhw->inlen is non-zero, the value of variable err is 0 if the copy
> succeeds. Then, if kzalloc() or kmalloc() returns a NULL pointer, it
> will return 0 to the callers. As a result, the callers cannot detect the
> errors. This patch fixes the bug, assign "-ENOMEM" to err before the
> NULL pointer checks, and remove the initialization of err at the
> beginning.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189031
> Signed-off-by: Pan Bian 

Thanks, applied.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
Hi Tom,

On Wed, Dec 14, 2016 at 8:18 PM, Tom Herbert  wrote:
> "super fast" is relative. My quick test shows that this faster than
> Toeplitz (good, but not exactly hard to achieve), but is about 4x
> slower than jhash.

Fast relative to other cryptographically secure PRFs.

>> SipHash isn't just some new trendy hash function. It's been around for a
>> while, and there really isn't anything that comes remotely close to
>> being useful in the way SipHash is. With that said, why do we need this?
> I don't think we need advertising nor a lesson on hashing. It would be
> much more useful if you just point us to the paper on siphash (which I
> assume I http://cr.yp.to/siphash/siphash-20120918.pdf ?).

Ugh. Sorry. It definitely wasn't my intention to give an uninvited
lesson or an annoying advert. For the former, I didn't want to make
any expectations about fields of knowledge, because I honest have no
idea. For the latter, I wrote that sentence to indicate that siphash
isn't just some newfangled hipster function, but something useful and
well established. I didn't mean it as a form of advertising. My
apologies if I've offended your sensibilities.

That cr.yp.to link is fine, or https://131002.net/siphash/siphash.pdf I believe.

> Key rotation is important anyway, without any key rotation even if the
> key is compromised in siphash by some external means we would have an
> insecure hash until the system reboots.

I'm a bit surprised to read this. I've never designed a system to be
secure even in the event of remote arbitrary kernel memory disclosure,
and I wasn't aware this was generally considered an architectural
requirement or Linux.

In any case, if you want this, I suppose you can have it with siphash too.

> Maybe so, but we need to do due diligence before considering adopting
> siphash as the primary hashing in the network stack. Consider that we
> may very well perform a hash over L4 tuples on _every_ packet. We've
> done a good job at limiting this to be at most one hash per packet,
> but nevertheless the performance of the hash function must be take
> into account.

I agree with you. It seems like each case is going to needed to be
measured on a case by case basis. In this series I make the first use
of siphash in the secure sequence generation and get_random_int/long,
where siphash replaces md5, so there's a pretty clear performance in.
But for the jhash replacements indeed things are going to need to be
individually evaluated.

> 1) My quick test shows siphash is about four times more expensive than
> jhash. On my test system, computing a hash over IPv4 tuple (two 32 bit
> addresses and 2 16 bit source ports) is 6.9 nsecs in Jenkins hash, 33
> nsecs with siphash. Given that we have eliminated most of the packet
> header hashes this might be tolerable, but still should be looking at
> ways to optimize.
> 2) I like moving to use u64 (quad words) in the hash, this is an
> improvement over Jenkins which is based on 32 bit words. If we put
> this in the kernel we probably want to have several variants of
> siphash for specific sizes (e.g. siphash1, siphash2, siphash2,
> siphashn for hash over one, two, three, or n sixty four bit words).

I think your suggestion for (2) will contribute to further
optimizations for (1). In v2, I had another patch in there adding
siphash_1word, siphash_2words, etc, like jhash, but I implemented it
by taking u32 variables and then just concatenating these into a
buffer and passing them to the main siphash function. I removed it
from v3 because I thought that these kind of missed the whole point.
In particular:

a) siphash24_1word, siphash24_2words, siphash24_3words, etc should
take u64, not u32, since that's what siphash operates on natively
b) Rather than concatenating them in a buffer, I should write
specializations of the siphash24 function _especially_ for these size
inputs to avoid the copy and to reduce the book keeping.

I'll add these functions to v4 implemented like that.

Thanks for the useful feedback and benchmarks!

Jason


Re: [PATCH 1/1] infiniband: hw: mlx4: fix improper return value

2016-12-14 Thread Doug Ledford
On 12/4/2016 1:45 AM, Pan Bian wrote:
> From: Pan Bian 
> 
> If uhw->inlen is non-zero, the value of variable err is 0 if the copy
> succeeds. Then, if kzalloc() or kmalloc() returns a NULL pointer, it
> will return 0 to the callers. As a result, the callers cannot detect the
> errors. This patch fixes the bug, assign "-ENOMEM" to err before the
> NULL pointer checks, and remove the initialization of err at the
> beginning.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189031
> Signed-off-by: Pan Bian 

Thanks, applied.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 1/1] infiniband: hw: ocrdma: fix bad initialization

2016-12-14 Thread Doug Ledford
On 12/4/2016 1:20 AM, Leon Romanovsky wrote:
> On Sat, Dec 03, 2016 at 09:10:21PM +0800, Pan Bian wrote:
>> From: Pan Bian 
>>
>> In function ocrdma_mbx_create_ah_tbl(), returns the value of status on
>> errors. However, because status is initialized with 0, 0 will be
>> returned even if on error paths. This patch initialize status with
>> "-ENOMEM".
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188831
>>
>> Signed-off-by: Pan Bian 
> 
> Thanks,
> Reviewed-by: Leon Romanovsky 
> 

Thanks, applied.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 1/1] infiniband: hw: ocrdma: fix bad initialization

2016-12-14 Thread Doug Ledford
On 12/4/2016 1:20 AM, Leon Romanovsky wrote:
> On Sat, Dec 03, 2016 at 09:10:21PM +0800, Pan Bian wrote:
>> From: Pan Bian 
>>
>> In function ocrdma_mbx_create_ah_tbl(), returns the value of status on
>> errors. However, because status is initialized with 0, 0 will be
>> returned even if on error paths. This patch initialize status with
>> "-ENOMEM".
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188831
>>
>> Signed-off-by: Pan Bian 
> 
> Thanks,
> Reviewed-by: Leon Romanovsky 
> 

Thanks, applied.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2] crypto: sun4i-ss: support the Security System PRNG

2016-12-14 Thread PrasannaKumar Muralidharan
>> I have found two solutions:
>
> No we already have algif_rng so let's not confuse things even
> further by making hwrng take PRNGs.

Even if both the solutions could not be adopted I think there must be
a way for applications to use similar API to get true rng or prng.
Given the case that no user complained about prng data when using
/dev/hwrng is it safe to assume that the random data generated is
acceptable for users? If so, the drivers can be left without any
modification.

Should there be a mandate that driver will be accepted only when it
passes 'rngtest'. This will make sure that prng drivers won't get
added in future.

Regards,
PrasannaKumar


Re: [PATCH v2] crypto: sun4i-ss: support the Security System PRNG

2016-12-14 Thread PrasannaKumar Muralidharan
>> I have found two solutions:
>
> No we already have algif_rng so let's not confuse things even
> further by making hwrng take PRNGs.

Even if both the solutions could not be adopted I think there must be
a way for applications to use similar API to get true rng or prng.
Given the case that no user complained about prng data when using
/dev/hwrng is it safe to assume that the random data generated is
acceptable for users? If so, the drivers can be left without any
modification.

Should there be a mandate that driver will be accepted only when it
passes 'rngtest'. This will make sure that prng drivers won't get
added in future.

Regards,
PrasannaKumar


Re: [PATCH 1/1] infiniband: nes: return value of skb_linearize should be handled

2016-12-14 Thread Doug Ledford
On 12/7/2016 2:30 AM, Zhouyi Zhou wrote:
> Return value of skb_linearize should be handled in function 
> nes_netdev_start_xmit.
> 
> Compiled in x86_64
> Signed-off-by: Zhouyi Zhou 
> Reviewed-by: Yuval Shaia  
> Reviewed-by: Eric Dumazet 

Applied, thanks.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 1/1] infiniband: nes: return value of skb_linearize should be handled

2016-12-14 Thread Doug Ledford
On 12/7/2016 2:30 AM, Zhouyi Zhou wrote:
> Return value of skb_linearize should be handled in function 
> nes_netdev_start_xmit.
> 
> Compiled in x86_64
> Signed-off-by: Zhouyi Zhou 
> Reviewed-by: Yuval Shaia  
> Reviewed-by: Eric Dumazet 

Applied, thanks.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


[PATCH] usb: musb: debugfs: allow forcing host mode together with speed in testmode

2016-12-14 Thread Pali Rohár
Based on the musb ug, force_host bit is allowed to be set along with
force_hs or force_fs bit.

It could help to implement forced host mode via testmode on Nokia N900.

Signed-off-by: Pali Rohár 
---
 drivers/usb/musb/musb_debugfs.c |   44 +--
 1 file changed, 28 insertions(+), 16 deletions(-)

diff --git a/drivers/usb/musb/musb_debugfs.c b/drivers/usb/musb/musb_debugfs.c
index 9b22d94..62c13a2 100644
--- a/drivers/usb/musb/musb_debugfs.c
+++ b/drivers/usb/musb/musb_debugfs.c
@@ -147,28 +147,34 @@ static int musb_test_mode_show(struct seq_file *s, void 
*unused)
 
test = musb_readb(musb->mregs, MUSB_TESTMODE);
 
-   if (test & MUSB_TEST_FORCE_HOST)
+   if (test & (MUSB_TEST_FORCE_HOST | MUSB_TEST_FORCE_FS))
+   seq_printf(s, "force host full-speed\n");
+
+   else if (test & (MUSB_TEST_FORCE_HOST | MUSB_TEST_FORCE_HS))
+   seq_printf(s, "force host high-speed\n");
+
+   else if (test & MUSB_TEST_FORCE_HOST)
seq_printf(s, "force host\n");
 
-   if (test & MUSB_TEST_FIFO_ACCESS)
+   else if (test & MUSB_TEST_FIFO_ACCESS)
seq_printf(s, "fifo access\n");
 
-   if (test & MUSB_TEST_FORCE_FS)
+   else if (test & MUSB_TEST_FORCE_FS)
seq_printf(s, "force full-speed\n");
 
-   if (test & MUSB_TEST_FORCE_HS)
+   else if (test & MUSB_TEST_FORCE_HS)
seq_printf(s, "force high-speed\n");
 
-   if (test & MUSB_TEST_PACKET)
+   else if (test & MUSB_TEST_PACKET)
seq_printf(s, "test packet\n");
 
-   if (test & MUSB_TEST_K)
+   else if (test & MUSB_TEST_K)
seq_printf(s, "test K\n");
 
-   if (test & MUSB_TEST_J)
+   else if (test & MUSB_TEST_J)
seq_printf(s, "test J\n");
 
-   if (test & MUSB_TEST_SE0_NAK)
+   else if (test & MUSB_TEST_SE0_NAK)
seq_printf(s, "test SE0 NAK\n");
 
return 0;
@@ -206,30 +212,36 @@ static ssize_t musb_test_mode_write(struct file *file,
if (copy_from_user(buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
return -EFAULT;
 
-   if (strstarts(buf, "force host"))
+   if (strstarts(buf, "force host full-speed"))
+   test = MUSB_TEST_FORCE_HOST | MUSB_TEST_FORCE_FS;
+
+   else if (strstarts(buf, "force host high-speed"))
+   test = MUSB_TEST_FORCE_HOST | MUSB_TEST_FORCE_HS;
+
+   else if (strstarts(buf, "force host"))
test = MUSB_TEST_FORCE_HOST;
 
-   if (strstarts(buf, "fifo access"))
+   else if (strstarts(buf, "fifo access"))
test = MUSB_TEST_FIFO_ACCESS;
 
-   if (strstarts(buf, "force full-speed"))
+   else if (strstarts(buf, "force full-speed"))
test = MUSB_TEST_FORCE_FS;
 
-   if (strstarts(buf, "force high-speed"))
+   else if (strstarts(buf, "force high-speed"))
test = MUSB_TEST_FORCE_HS;
 
-   if (strstarts(buf, "test packet")) {
+   else if (strstarts(buf, "test packet")) {
test = MUSB_TEST_PACKET;
musb_load_testpacket(musb);
}
 
-   if (strstarts(buf, "test K"))
+   else if (strstarts(buf, "test K"))
test = MUSB_TEST_K;
 
-   if (strstarts(buf, "test J"))
+   else if (strstarts(buf, "test J"))
test = MUSB_TEST_J;
 
-   if (strstarts(buf, "test SE0 NAK"))
+   else if (strstarts(buf, "test SE0 NAK"))
test = MUSB_TEST_SE0_NAK;
 
musb_writeb(musb->mregs, MUSB_TESTMODE, test);
-- 
1.7.9.5



Re: [tip:x86/urgent] x86/boot/64: Use 'push' instead of 'call' in start_cpu()

2016-12-14 Thread hpa
On December 14, 2016 12:36:58 AM PST, tip-bot for Josh Poimboeuf 
 wrote:
>Commit-ID:  ec2d86a9b646d93f1948569f368e2c6f5449e6c7
>Gitweb:
>http://git.kernel.org/tip/ec2d86a9b646d93f1948569f368e2c6f5449e6c7
>Author: Josh Poimboeuf 
>AuthorDate: Tue, 13 Dec 2016 21:25:35 -0600
>Committer:  Ingo Molnar 
>CommitDate: Wed, 14 Dec 2016 08:48:05 +0100
>
>x86/boot/64: Use 'push' instead of 'call' in start_cpu()
>
>start_cpu() pushes a text address on the stack so that stack traces
>from
>idle tasks will show start_cpu() at the end.  But it uses a call
>instruction to do that, which is rather obtuse.  Use a straightforward
>push instead.
>
>Suggested-by: Borislav Petkov 
>Signed-off-by: Josh Poimboeuf 
>Cc: Andy Lutomirski 
>Cc: Brian Gerst 
>Cc: Denys Vlasenko 
>Cc: H. Peter Anvin 
>Cc: Linus Torvalds 
>Cc: Peter Zijlstra 
>Cc: Thomas Gleixner 
>Link:
>http://lkml.kernel.org/r/4d8a1952759721d42d1e62ba9e4a7e3ac5df8574.1481685203.git.jpoim...@redhat.com
>Signed-off-by: Ingo Molnar 
>---
> arch/x86/kernel/head_64.S | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
>index 90de288..1facaf4 100644
>--- a/arch/x86/kernel/head_64.S
>+++ b/arch/x86/kernel/head_64.S
>@@ -298,7 +298,7 @@ ENTRY(start_cpu)
>*  REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
>*  address given in m16:64.
>*/
>-  call1f  # put return address on stack for unwinder
>+  pushq   $1f # put return address on stack for unwinder
> 1:xorq%rbp, %rbp  # clear frame pointer
>   movqinitial_code(%rip), %rax
>   pushq   $__KERNEL_CS# set correct cs

This adds another relocation to the kernel.  I hope this is safe at this point 
in the code?
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


[PATCH] usb: musb: debugfs: allow forcing host mode together with speed in testmode

2016-12-14 Thread Pali Rohár
Based on the musb ug, force_host bit is allowed to be set along with
force_hs or force_fs bit.

It could help to implement forced host mode via testmode on Nokia N900.

Signed-off-by: Pali Rohár 
---
 drivers/usb/musb/musb_debugfs.c |   44 +--
 1 file changed, 28 insertions(+), 16 deletions(-)

diff --git a/drivers/usb/musb/musb_debugfs.c b/drivers/usb/musb/musb_debugfs.c
index 9b22d94..62c13a2 100644
--- a/drivers/usb/musb/musb_debugfs.c
+++ b/drivers/usb/musb/musb_debugfs.c
@@ -147,28 +147,34 @@ static int musb_test_mode_show(struct seq_file *s, void 
*unused)
 
test = musb_readb(musb->mregs, MUSB_TESTMODE);
 
-   if (test & MUSB_TEST_FORCE_HOST)
+   if (test & (MUSB_TEST_FORCE_HOST | MUSB_TEST_FORCE_FS))
+   seq_printf(s, "force host full-speed\n");
+
+   else if (test & (MUSB_TEST_FORCE_HOST | MUSB_TEST_FORCE_HS))
+   seq_printf(s, "force host high-speed\n");
+
+   else if (test & MUSB_TEST_FORCE_HOST)
seq_printf(s, "force host\n");
 
-   if (test & MUSB_TEST_FIFO_ACCESS)
+   else if (test & MUSB_TEST_FIFO_ACCESS)
seq_printf(s, "fifo access\n");
 
-   if (test & MUSB_TEST_FORCE_FS)
+   else if (test & MUSB_TEST_FORCE_FS)
seq_printf(s, "force full-speed\n");
 
-   if (test & MUSB_TEST_FORCE_HS)
+   else if (test & MUSB_TEST_FORCE_HS)
seq_printf(s, "force high-speed\n");
 
-   if (test & MUSB_TEST_PACKET)
+   else if (test & MUSB_TEST_PACKET)
seq_printf(s, "test packet\n");
 
-   if (test & MUSB_TEST_K)
+   else if (test & MUSB_TEST_K)
seq_printf(s, "test K\n");
 
-   if (test & MUSB_TEST_J)
+   else if (test & MUSB_TEST_J)
seq_printf(s, "test J\n");
 
-   if (test & MUSB_TEST_SE0_NAK)
+   else if (test & MUSB_TEST_SE0_NAK)
seq_printf(s, "test SE0 NAK\n");
 
return 0;
@@ -206,30 +212,36 @@ static ssize_t musb_test_mode_write(struct file *file,
if (copy_from_user(buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
return -EFAULT;
 
-   if (strstarts(buf, "force host"))
+   if (strstarts(buf, "force host full-speed"))
+   test = MUSB_TEST_FORCE_HOST | MUSB_TEST_FORCE_FS;
+
+   else if (strstarts(buf, "force host high-speed"))
+   test = MUSB_TEST_FORCE_HOST | MUSB_TEST_FORCE_HS;
+
+   else if (strstarts(buf, "force host"))
test = MUSB_TEST_FORCE_HOST;
 
-   if (strstarts(buf, "fifo access"))
+   else if (strstarts(buf, "fifo access"))
test = MUSB_TEST_FIFO_ACCESS;
 
-   if (strstarts(buf, "force full-speed"))
+   else if (strstarts(buf, "force full-speed"))
test = MUSB_TEST_FORCE_FS;
 
-   if (strstarts(buf, "force high-speed"))
+   else if (strstarts(buf, "force high-speed"))
test = MUSB_TEST_FORCE_HS;
 
-   if (strstarts(buf, "test packet")) {
+   else if (strstarts(buf, "test packet")) {
test = MUSB_TEST_PACKET;
musb_load_testpacket(musb);
}
 
-   if (strstarts(buf, "test K"))
+   else if (strstarts(buf, "test K"))
test = MUSB_TEST_K;
 
-   if (strstarts(buf, "test J"))
+   else if (strstarts(buf, "test J"))
test = MUSB_TEST_J;
 
-   if (strstarts(buf, "test SE0 NAK"))
+   else if (strstarts(buf, "test SE0 NAK"))
test = MUSB_TEST_SE0_NAK;
 
musb_writeb(musb->mregs, MUSB_TESTMODE, test);
-- 
1.7.9.5



Re: [tip:x86/urgent] x86/boot/64: Use 'push' instead of 'call' in start_cpu()

2016-12-14 Thread hpa
On December 14, 2016 12:36:58 AM PST, tip-bot for Josh Poimboeuf 
 wrote:
>Commit-ID:  ec2d86a9b646d93f1948569f368e2c6f5449e6c7
>Gitweb:
>http://git.kernel.org/tip/ec2d86a9b646d93f1948569f368e2c6f5449e6c7
>Author: Josh Poimboeuf 
>AuthorDate: Tue, 13 Dec 2016 21:25:35 -0600
>Committer:  Ingo Molnar 
>CommitDate: Wed, 14 Dec 2016 08:48:05 +0100
>
>x86/boot/64: Use 'push' instead of 'call' in start_cpu()
>
>start_cpu() pushes a text address on the stack so that stack traces
>from
>idle tasks will show start_cpu() at the end.  But it uses a call
>instruction to do that, which is rather obtuse.  Use a straightforward
>push instead.
>
>Suggested-by: Borislav Petkov 
>Signed-off-by: Josh Poimboeuf 
>Cc: Andy Lutomirski 
>Cc: Brian Gerst 
>Cc: Denys Vlasenko 
>Cc: H. Peter Anvin 
>Cc: Linus Torvalds 
>Cc: Peter Zijlstra 
>Cc: Thomas Gleixner 
>Link:
>http://lkml.kernel.org/r/4d8a1952759721d42d1e62ba9e4a7e3ac5df8574.1481685203.git.jpoim...@redhat.com
>Signed-off-by: Ingo Molnar 
>---
> arch/x86/kernel/head_64.S | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
>index 90de288..1facaf4 100644
>--- a/arch/x86/kernel/head_64.S
>+++ b/arch/x86/kernel/head_64.S
>@@ -298,7 +298,7 @@ ENTRY(start_cpu)
>*  REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
>*  address given in m16:64.
>*/
>-  call1f  # put return address on stack for unwinder
>+  pushq   $1f # put return address on stack for unwinder
> 1:xorq%rbp, %rbp  # clear frame pointer
>   movqinitial_code(%rip), %rax
>   pushq   $__KERNEL_CS# set correct cs

This adds another relocation to the kernel.  I hope this is safe at this point 
in the code?
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [kernel-hardening] [RFC 0/4] make call_usermodehelper a bit more "safe"

2016-12-14 Thread Mark Rutland

Hi,

On Wed, Dec 14, 2016 at 10:50:00AM -0800, Greg KH wrote:
> The issue is that if you end up getting write access to kernel memory,
> if you change the string '/sbin/hotplug' to point to
> '/home/hacked/my_binary', then the next uevent that the system makes
> will call this binary instead of the "trusted" one.
> 
> It does this by moving the location of the binary to be in read-only
> memory.  This works for a number of call_usermodehelper strings, as they
> are specified at build or configuration time.  But, some subsystems have
> the option to let userspace change the value at runtime, so those values
> can't live in read-only memory. 

> So, anyone have any better ideas?  Is this approach worth it?  Or should
> we just go down the "whitelist" path?

As a general note, I believe the write-rarely / mostly-ro [1] stuff is
meant to cater for this case, but I haven't heard anything on that front
recently (and there doesn't appear to be anything on the KSPP TODO
page).

If that does cater for this case, and if we're able to implement that
generically, that might be nicer than locking down the set of binaries
at build time.

Chen, are you still looking at implementing write-rarely support?

Thanks,
Mark.

[1] http://www.openwall.com/lists/kernel-hardening/2016/11/16/3


Re: [kernel-hardening] [RFC 0/4] make call_usermodehelper a bit more "safe"

2016-12-14 Thread Mark Rutland

Hi,

On Wed, Dec 14, 2016 at 10:50:00AM -0800, Greg KH wrote:
> The issue is that if you end up getting write access to kernel memory,
> if you change the string '/sbin/hotplug' to point to
> '/home/hacked/my_binary', then the next uevent that the system makes
> will call this binary instead of the "trusted" one.
> 
> It does this by moving the location of the binary to be in read-only
> memory.  This works for a number of call_usermodehelper strings, as they
> are specified at build or configuration time.  But, some subsystems have
> the option to let userspace change the value at runtime, so those values
> can't live in read-only memory. 

> So, anyone have any better ideas?  Is this approach worth it?  Or should
> we just go down the "whitelist" path?

As a general note, I believe the write-rarely / mostly-ro [1] stuff is
meant to cater for this case, but I haven't heard anything on that front
recently (and there doesn't appear to be anything on the KSPP TODO
page).

If that does cater for this case, and if we're able to implement that
generically, that might be nicer than locking down the set of binaries
at build time.

Chen, are you still looking at implementing write-rarely support?

Thanks,
Mark.

[1] http://www.openwall.com/lists/kernel-hardening/2016/11/16/3


Re: [PATCH v2 01/46] mtdpart: Propagate _get/put_device()

2016-12-14 Thread Karl Beldan
On Wed, Sep 28, 2016 at 8:16 PM, Brian Norris
 wrote:
> On Wed, Sep 21, 2016 at 12:15:31PM +0200, Boris Brezillon wrote:
>> On Wed, 21 Sep 2016 11:43:56 +0200
>> Daniel Walter  wrote:
>>
>> > From: Richard Weinberger 
>> >
>> > If the master device has callbacks for _get/put_device()
>> > and this MTD has slaves a get_mtd_device() call on paritions
>> > will never issue the registered callbacks.
>> > Fix this by propagating _get/put_device() down.
>>
>> Brian, can we have this one queued for 4.9? I can't take it in my tree
>> if you want, but it's probably better if it's in the mtd tree.
>
> Applied this patch to l2-mtd.git
>

I think this should also go into -stable.


Re: [PATCH v2 01/46] mtdpart: Propagate _get/put_device()

2016-12-14 Thread Karl Beldan
On Wed, Sep 28, 2016 at 8:16 PM, Brian Norris
 wrote:
> On Wed, Sep 21, 2016 at 12:15:31PM +0200, Boris Brezillon wrote:
>> On Wed, 21 Sep 2016 11:43:56 +0200
>> Daniel Walter  wrote:
>>
>> > From: Richard Weinberger 
>> >
>> > If the master device has callbacks for _get/put_device()
>> > and this MTD has slaves a get_mtd_device() call on paritions
>> > will never issue the registered callbacks.
>> > Fix this by propagating _get/put_device() down.
>>
>> Brian, can we have this one queued for 4.9? I can't take it in my tree
>> if you want, but it's probably better if it's in the mtd tree.
>
> Applied this patch to l2-mtd.git
>

I think this should also go into -stable.


Re: [PATCH] IB/core: fix unmap_sg argument

2016-12-14 Thread Doug Ledford
On 12/2/2016 8:45 AM, Sebastian Ott wrote:

> __ib_umem_release calls dma_unmap_sg with a different number of
> sg_entries than ib_umem_get uses for dma_map_sg. This might cause
> trouble for implementations that merge sglist entries and results
> in the following dma debug complaint:
> 
> DMA-API: device driver frees DMA sg list with different entry
>  count [map count=2] [unmap count=1]
> 
> Fix it by using the correct value.
> 
> Signed-off-by: Sebastian Ott 
> ---
>  drivers/infiniband/core/umem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 84b4eff..1e62a5f 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -51,7 +51,7 @@ static void __ib_umem_release(struct ib_device *dev, struct 
> ib_umem *umem, int d
>  
>   if (umem->nmap > 0)
>   ib_dma_unmap_sg(dev, umem->sg_head.sgl,
> - umem->nmap,
> + umem->npages,
>   DMA_BIDIRECTIONAL);
>  
>   for_each_sg(umem->sg_head.sgl, sg, umem->npages, i) {
> 

Thanks, applied.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] IB/core: fix unmap_sg argument

2016-12-14 Thread Doug Ledford
On 12/2/2016 8:45 AM, Sebastian Ott wrote:

> __ib_umem_release calls dma_unmap_sg with a different number of
> sg_entries than ib_umem_get uses for dma_map_sg. This might cause
> trouble for implementations that merge sglist entries and results
> in the following dma debug complaint:
> 
> DMA-API: device driver frees DMA sg list with different entry
>  count [map count=2] [unmap count=1]
> 
> Fix it by using the correct value.
> 
> Signed-off-by: Sebastian Ott 
> ---
>  drivers/infiniband/core/umem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 84b4eff..1e62a5f 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -51,7 +51,7 @@ static void __ib_umem_release(struct ib_device *dev, struct 
> ib_umem *umem, int d
>  
>   if (umem->nmap > 0)
>   ib_dma_unmap_sg(dev, umem->sg_head.sgl,
> - umem->nmap,
> + umem->npages,
>   DMA_BIDIRECTIONAL);
>  
>   for_each_sg(umem->sg_head.sgl, sg, umem->npages, i) {
> 

Thanks, applied.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2 3/4] secure_seq: use siphash24 instead of md5_transform

2016-12-14 Thread Hannes Frederic Sowa
On 14.12.2016 19:06, Jason A. Donenfeld wrote:
> Hi David,
> 
> On Wed, Dec 14, 2016 at 6:56 PM, David Miller  wrote:
>> Just marking the structure __packed, whether necessary or not, makes
>> the compiler assume that the members are not aligned and causes
>> byte-by-byte accesses to be performed for words.
>> Never, _ever_, use __packed unless absolutely necessary, it pessimizes
>> the code on cpus that require proper alignment of types.
> 
> Oh, jimminy cricket, I did not realize that it made assignments
> byte-by-byte *always*. So what options am I left with? What
> immediately comes to mind are:
> 
> 1)
> 
> struct {
> u64 a;
> u32 b;
> u32 c;
> u16 d;
> u8 end[];

I don't think this helps. Did you test it? I don't see reason why
padding could be left out between `d' and `end' because of the flexible
array member?

Bye,
Hannes




Re: [PATCH v2 3/4] secure_seq: use siphash24 instead of md5_transform

2016-12-14 Thread Hannes Frederic Sowa
On 14.12.2016 19:06, Jason A. Donenfeld wrote:
> Hi David,
> 
> On Wed, Dec 14, 2016 at 6:56 PM, David Miller  wrote:
>> Just marking the structure __packed, whether necessary or not, makes
>> the compiler assume that the members are not aligned and causes
>> byte-by-byte accesses to be performed for words.
>> Never, _ever_, use __packed unless absolutely necessary, it pessimizes
>> the code on cpus that require proper alignment of types.
> 
> Oh, jimminy cricket, I did not realize that it made assignments
> byte-by-byte *always*. So what options am I left with? What
> immediately comes to mind are:
> 
> 1)
> 
> struct {
> u64 a;
> u32 b;
> u32 c;
> u16 d;
> u8 end[];

I don't think this helps. Did you test it? I don't see reason why
padding could be left out between `d' and `end' because of the flexible
array member?

Bye,
Hannes




RE: [PATCH] perf tools: ignore zombie process for user profile

2016-12-14 Thread Liang, Kan


.
> Em Wed, Dec 14, 2016 at 06:26:02PM +, Liang, Kan escreveu:
> > > On Wed, Dec 14, 2016 at 12:48:05PM -0500, kan.li...@intel.com wrote:
> > > > From: Kan Liang  Zombie process is dead
> > > > process. It is meaningless to profile it.
> > > > It's better to ignore it for user profile.
> 
> > > I recently posted different patch for same issue:
> > >   http://marc.info/?l=linux-kernel=148153895827359=2
> 
> > The change as below make me confuse.
> > +   /* The system wide setup does not work with threads. */
> > +   if (!evsel->system_wide)
> > +   return false;
> > It looks the meaning of the comments is inconsistent with the code.
> 
> > Your original patch doesn't work well with the issue.
> > But if I change the above code as below, the issue is fixed.
> > if (evsel->system_wide)
> > return false;
> 
> yeah, Namhyung noticed that, Jiri sent a fixed patch, I have it already in my
> perf/core branch.
> 

That's great. Then Jiri's patch would work on my case.
Please ignore this patch.

Thanks,
Kan



RE: [PATCH] perf tools: ignore zombie process for user profile

2016-12-14 Thread Liang, Kan


.
> Em Wed, Dec 14, 2016 at 06:26:02PM +, Liang, Kan escreveu:
> > > On Wed, Dec 14, 2016 at 12:48:05PM -0500, kan.li...@intel.com wrote:
> > > > From: Kan Liang  Zombie process is dead
> > > > process. It is meaningless to profile it.
> > > > It's better to ignore it for user profile.
> 
> > > I recently posted different patch for same issue:
> > >   http://marc.info/?l=linux-kernel=148153895827359=2
> 
> > The change as below make me confuse.
> > +   /* The system wide setup does not work with threads. */
> > +   if (!evsel->system_wide)
> > +   return false;
> > It looks the meaning of the comments is inconsistent with the code.
> 
> > Your original patch doesn't work well with the issue.
> > But if I change the above code as below, the issue is fixed.
> > if (evsel->system_wide)
> > return false;
> 
> yeah, Namhyung noticed that, Jiri sent a fixed patch, I have it already in my
> perf/core branch.
> 

That's great. Then Jiri's patch would work on my case.
Please ignore this patch.

Thanks,
Kan



Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Tom Herbert
On Wed, Dec 14, 2016 at 10:46 AM, Jason A. Donenfeld  wrote:
> SipHash is a 64-bit keyed hash function that is actually a
> cryptographically secure PRF, like HMAC. Except SipHash is super fast,
> and is meant to be used as a hashtable keyed lookup function.
>
"super fast" is relative. My quick test shows that this faster than
Toeplitz (good, but not exactly hard to achieve), but is about 4x
slower than jhash.

> SipHash isn't just some new trendy hash function. It's been around for a
> while, and there really isn't anything that comes remotely close to
> being useful in the way SipHash is. With that said, why do we need this?
>
I don't think we need advertising nor a lesson on hashing. It would be
much more useful if you just point us to the paper on siphash (which I
assume I http://cr.yp.to/siphash/siphash-20120918.pdf ?).

> There are a variety of attacks known as "hashtable poisoning" in which an
> attacker forms some data such that the hash of that data will be the
> same, and then preceeds to fill up all entries of a hashbucket. This is
> a realistic and well-known denial-of-service vector.
>
> Linux developers already seem to be aware that this is an issue, and
> various places that use hash tables in, say, a network context, use a
> non-cryptographically secure function (usually jhash) and then try to
> twiddle with the key on a time basis (or in many cases just do nothing
> and hope that nobody notices). While this is an admirable attempt at
> solving the problem, it doesn't actually fix it. SipHash fixes it.
>
Key rotation is important anyway, without any key rotation even if the
key is compromised in siphash by some external means we would have an
insecure hash until the system reboots.

> (It fixes it in such a sound way that you could even build a stream
> cipher out of SipHash that would resist the modern cryptanalysis.)
>
> There are a modicum of places in the kernel that are vulnerable to
> hashtable poisoning attacks, either via userspace vectors or network
> vectors, and there's not a reliable mechanism inside the kernel at the
> moment to fix it. The first step toward fixing these issues is actually
> getting a secure primitive into the kernel for developers to use. Then
> we can, bit by bit, port things over to it as deemed appropriate.
>
> Secondly, a few places are using MD5 for creating secure sequence
> numbers, port numbers, or fast random numbers. Siphash is a faster, more
> fittting, and more secure replacement for MD5 in those situations.
>
> Dozens of languages are already using this internally for their hash
> tables. Some of the BSDs already use this in their kernels. SipHash is
> a widely known high-speed solution to a widely known problem, and it's
> time we catch-up.
>
Maybe so, but we need to do due diligence before considering adopting
siphash as the primary hashing in the network stack. Consider that we
may very well perform a hash over L4 tuples on _every_ packet. We've
done a good job at limiting this to be at most one hash per packet,
but nevertheless the performance of the hash function must be take
into account.

A few points:

1) My quick test shows siphash is about four times more expensive than
jhash. On my test system, computing a hash over IPv4 tuple (two 32 bit
addresses and 2 16 bit source ports) is 6.9 nsecs in Jenkins hash, 33
nsecs with siphash. Given that we have eliminated most of the packet
header hashes this might be tolerable, but still should be looking at
ways to optimize.
2) I like moving to use u64 (quad words) in the hash, this is an
improvement over Jenkins which is based on 32 bit words. If we put
this in the kernel we probably want to have several variants of
siphash for specific sizes (e.g. siphash1, siphash2, siphash2,
siphashn for hash over one, two, three, or n sixty four bit words).
3) I also tested siphash distribution and Avalanche Effect (these
really should have been covered in the paper :-( ). Both of these are
good with siphash, in fact almost have identical characteristics to
Jenkins hash.

Tom

> Signed-off-by: Jason A. Donenfeld 
> Cc: Jean-Philippe Aumasson 
> Cc: Daniel J. Bernstein 
> Cc: Linus Torvalds 
> Cc: Eric Biggers 
> Cc: David Laight 
> ---
> Changes from v2->v3:
>
>   - There is now a fast aligned version of the function and a not-as-fast
> unaligned version. The requirements for each have been documented in
> a docbook-style comment. As well, the header now contains a constant
> for the expected alignment.
>
>   - The test suite has been updated to check both the unaligned and aligned
> version of the function.
>
>  include/linux/siphash.h |  30 ++
>  lib/Kconfig.debug   |   6 +-
>  lib/Makefile|   5 +-
>  lib/siphash.c   | 153 
> 
>  lib/test_siphash.c  |  85 

Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Tom Herbert
On Wed, Dec 14, 2016 at 10:46 AM, Jason A. Donenfeld  wrote:
> SipHash is a 64-bit keyed hash function that is actually a
> cryptographically secure PRF, like HMAC. Except SipHash is super fast,
> and is meant to be used as a hashtable keyed lookup function.
>
"super fast" is relative. My quick test shows that this faster than
Toeplitz (good, but not exactly hard to achieve), but is about 4x
slower than jhash.

> SipHash isn't just some new trendy hash function. It's been around for a
> while, and there really isn't anything that comes remotely close to
> being useful in the way SipHash is. With that said, why do we need this?
>
I don't think we need advertising nor a lesson on hashing. It would be
much more useful if you just point us to the paper on siphash (which I
assume I http://cr.yp.to/siphash/siphash-20120918.pdf ?).

> There are a variety of attacks known as "hashtable poisoning" in which an
> attacker forms some data such that the hash of that data will be the
> same, and then preceeds to fill up all entries of a hashbucket. This is
> a realistic and well-known denial-of-service vector.
>
> Linux developers already seem to be aware that this is an issue, and
> various places that use hash tables in, say, a network context, use a
> non-cryptographically secure function (usually jhash) and then try to
> twiddle with the key on a time basis (or in many cases just do nothing
> and hope that nobody notices). While this is an admirable attempt at
> solving the problem, it doesn't actually fix it. SipHash fixes it.
>
Key rotation is important anyway, without any key rotation even if the
key is compromised in siphash by some external means we would have an
insecure hash until the system reboots.

> (It fixes it in such a sound way that you could even build a stream
> cipher out of SipHash that would resist the modern cryptanalysis.)
>
> There are a modicum of places in the kernel that are vulnerable to
> hashtable poisoning attacks, either via userspace vectors or network
> vectors, and there's not a reliable mechanism inside the kernel at the
> moment to fix it. The first step toward fixing these issues is actually
> getting a secure primitive into the kernel for developers to use. Then
> we can, bit by bit, port things over to it as deemed appropriate.
>
> Secondly, a few places are using MD5 for creating secure sequence
> numbers, port numbers, or fast random numbers. Siphash is a faster, more
> fittting, and more secure replacement for MD5 in those situations.
>
> Dozens of languages are already using this internally for their hash
> tables. Some of the BSDs already use this in their kernels. SipHash is
> a widely known high-speed solution to a widely known problem, and it's
> time we catch-up.
>
Maybe so, but we need to do due diligence before considering adopting
siphash as the primary hashing in the network stack. Consider that we
may very well perform a hash over L4 tuples on _every_ packet. We've
done a good job at limiting this to be at most one hash per packet,
but nevertheless the performance of the hash function must be take
into account.

A few points:

1) My quick test shows siphash is about four times more expensive than
jhash. On my test system, computing a hash over IPv4 tuple (two 32 bit
addresses and 2 16 bit source ports) is 6.9 nsecs in Jenkins hash, 33
nsecs with siphash. Given that we have eliminated most of the packet
header hashes this might be tolerable, but still should be looking at
ways to optimize.
2) I like moving to use u64 (quad words) in the hash, this is an
improvement over Jenkins which is based on 32 bit words. If we put
this in the kernel we probably want to have several variants of
siphash for specific sizes (e.g. siphash1, siphash2, siphash2,
siphashn for hash over one, two, three, or n sixty four bit words).
3) I also tested siphash distribution and Avalanche Effect (these
really should have been covered in the paper :-( ). Both of these are
good with siphash, in fact almost have identical characteristics to
Jenkins hash.

Tom

> Signed-off-by: Jason A. Donenfeld 
> Cc: Jean-Philippe Aumasson 
> Cc: Daniel J. Bernstein 
> Cc: Linus Torvalds 
> Cc: Eric Biggers 
> Cc: David Laight 
> ---
> Changes from v2->v3:
>
>   - There is now a fast aligned version of the function and a not-as-fast
> unaligned version. The requirements for each have been documented in
> a docbook-style comment. As well, the header now contains a constant
> for the expected alignment.
>
>   - The test suite has been updated to check both the unaligned and aligned
> version of the function.
>
>  include/linux/siphash.h |  30 ++
>  lib/Kconfig.debug   |   6 +-
>  lib/Makefile|   5 +-
>  lib/siphash.c   | 153 
> 
>  lib/test_siphash.c  |  85 +++
>  5 files changed, 274 insertions(+), 5 deletions(-)
>  create mode 100644 include/linux/siphash.h
>  create mode 100644 

Re: [f2fs-dev] [GIT PULL] f2fs update for 4.10

2016-12-14 Thread Jaegeuk Kim
On 12/14, Jaegeuk Kim wrote:
> On 12/14, Linus Torvalds wrote:
> > On Mon, Dec 12, 2016 at 2:15 PM, Jaegeuk Kim  wrote:
> > >
> > > Could you please consider this pull request?
> > 
> > Pulled. Mind double-checking my resolution wrt commit 70fd76140a6c
> > ("block,fs: use REQ_* flags directly")?
> 
> Thank you, and the resolution looks good to me as well.

BTW, I just downloaded mainline, and tried to build a debian package but
failed due to missing Documentation/Changes.

I've found out that it was renamed by commit 186128f753
 ("docs-rst: add documents to development-process")

And when taking a look at its description, it was supposed to do symlink for
document files, but did rename all of them.

Afterwards, I could see some files were re-added by commit 08a9a8d44c1
  ("doc: re-add CodingStyle and SubmittingPatches")

In order to build a package for now, I just added a symlink like below.

---
 Documentation/Changes | 1 +
 1 file changed, 1 insertion(+)
 create mode 12 Documentation/Changes

diff --git a/Documentation/Changes b/Documentation/Changes
new file mode 12
index ..7564ae1682ba
--- /dev/null
+++ b/Documentation/Changes
@@ -0,0 +1 @@
+process/changes.rst
\ No newline at end of file
-- 
2.11.0

Thanks,

> 
> Thanks,
> 
> > 
> >  Linus
> 
> --
> Check out the vibrant tech community on one of the world's most 
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> ___
> Linux-f2fs-devel mailing list
> linux-f2fs-de...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [GIT PULL] f2fs update for 4.10

2016-12-14 Thread Jaegeuk Kim
On 12/14, Jaegeuk Kim wrote:
> On 12/14, Linus Torvalds wrote:
> > On Mon, Dec 12, 2016 at 2:15 PM, Jaegeuk Kim  wrote:
> > >
> > > Could you please consider this pull request?
> > 
> > Pulled. Mind double-checking my resolution wrt commit 70fd76140a6c
> > ("block,fs: use REQ_* flags directly")?
> 
> Thank you, and the resolution looks good to me as well.

BTW, I just downloaded mainline, and tried to build a debian package but
failed due to missing Documentation/Changes.

I've found out that it was renamed by commit 186128f753
 ("docs-rst: add documents to development-process")

And when taking a look at its description, it was supposed to do symlink for
document files, but did rename all of them.

Afterwards, I could see some files were re-added by commit 08a9a8d44c1
  ("doc: re-add CodingStyle and SubmittingPatches")

In order to build a package for now, I just added a symlink like below.

---
 Documentation/Changes | 1 +
 1 file changed, 1 insertion(+)
 create mode 12 Documentation/Changes

diff --git a/Documentation/Changes b/Documentation/Changes
new file mode 12
index ..7564ae1682ba
--- /dev/null
+++ b/Documentation/Changes
@@ -0,0 +1 @@
+process/changes.rst
\ No newline at end of file
-- 
2.11.0

Thanks,

> 
> Thanks,
> 
> > 
> >  Linus
> 
> --
> Check out the vibrant tech community on one of the world's most 
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> ___
> Linux-f2fs-devel mailing list
> linux-f2fs-de...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


RE: [PATCH v3 2/2] FPGA: Add TS-7300 FPGA manager

2016-12-14 Thread Hartley Sweeten
On Wednesday, December 14, 2016 11:55 AM, Florian Fainelli wrote:
> My understanding is that, yes, this triggers the final write. You are
> right that ts73xx_fpga_write() can be called multiple times. It sounds
> like what my write_complete function does right now is just return that
> we successfully completed the bistream write, but this snippet that you
> are quoting should actually be moved into write_complete.

Florian,

I'm in the process of getting a TS-7300 board so I can help test this. Hopefully
I will have it by next week.

Hartley



RE: [PATCH v3 2/2] FPGA: Add TS-7300 FPGA manager

2016-12-14 Thread Hartley Sweeten
On Wednesday, December 14, 2016 11:55 AM, Florian Fainelli wrote:
> My understanding is that, yes, this triggers the final write. You are
> right that ts73xx_fpga_write() can be called multiple times. It sounds
> like what my write_complete function does right now is just return that
> we successfully completed the bistream write, but this snippet that you
> are quoting should actually be moved into write_complete.

Florian,

I'm in the process of getting a TS-7300 board so I can help test this. Hopefully
I will have it by next week.

Hartley



Re: [PATCH] perf tools: ignore zombie process for user profile

2016-12-14 Thread a...@kernel.org
Em Wed, Dec 14, 2016 at 06:26:02PM +, Liang, Kan escreveu:
> > On Wed, Dec 14, 2016 at 12:48:05PM -0500, kan.li...@intel.com wrote:
> > > From: Kan Liang 
> > > Zombie process is dead process. It is meaningless to profile it.
> > > It's better to ignore it for user profile.

> > I recently posted different patch for same issue:
> >   http://marc.info/?l=linux-kernel=148153895827359=2
 
> The change as below make me confuse.
> + /* The system wide setup does not work with threads. */
> + if (!evsel->system_wide)
> + return false;
> It looks the meaning of the comments is inconsistent with the code.
 
> Your original patch doesn't work well with the issue.
> But if I change the above code as below, the issue is fixed.
>   if (evsel->system_wide)
>   return false;

yeah, Namhyung noticed that, Jiri sent a fixed patch, I have it already
in my perf/core branch.

- Arnaldo


Re: [PATCH] perf tools: ignore zombie process for user profile

2016-12-14 Thread a...@kernel.org
Em Wed, Dec 14, 2016 at 06:26:02PM +, Liang, Kan escreveu:
> > On Wed, Dec 14, 2016 at 12:48:05PM -0500, kan.li...@intel.com wrote:
> > > From: Kan Liang 
> > > Zombie process is dead process. It is meaningless to profile it.
> > > It's better to ignore it for user profile.

> > I recently posted different patch for same issue:
> >   http://marc.info/?l=linux-kernel=148153895827359=2
 
> The change as below make me confuse.
> + /* The system wide setup does not work with threads. */
> + if (!evsel->system_wide)
> + return false;
> It looks the meaning of the comments is inconsistent with the code.
 
> Your original patch doesn't work well with the issue.
> But if I change the above code as below, the issue is fixed.
>   if (evsel->system_wide)
>   return false;

yeah, Namhyung noticed that, Jiri sent a fixed patch, I have it already
in my perf/core branch.

- Arnaldo


Re: [kernel-hardening] Re: [PATCH 4/3] random: use siphash24 instead of md5 for get_random_int/long

2016-12-14 Thread Jason A. Donenfeld
Hi again,

On Wed, Dec 14, 2016 at 5:37 PM, Theodore Ts'o  wrote:
> [3.606139] random benchmark!!
> [3.606276] get_random_int # cycles: 326578
> [3.606317] get_random_int_new # cycles: 95438
> [3.607423] get_random_bytes # cycles: 2653388

Looks to me like my siphash implementation is much faster for
get_random_long, and more or less tied for get_random_int:

[1.729370] random benchmark!!
[1.729710] get_random_long # cycles: 349771
[1.730128] get_random_long_chacha # cycles: 359660
[1.730457] get_random_long_siphash # cycles: 94255
[1.731307] get_random_bytes # cycles: 1354894
[1.731707] get_random_int # cycles: 305640
[1.732095] get_random_int_chacha # cycles: 80726
[1.732425] get_random_int_siphash # cycles: 94265
[1.733278] get_random_bytes # cycles: 1315873

Given the increasing usage of get_random_long for ASLR and related, I
think this makes the siphash approach worth pursuing. The chacha
approach is also not significantly different from the md5 approach in
terms of speed for get_rand_long. Additionally, since siphash is a
PRF, I think this opens up a big window for optimizing it even
further.

Benchmark here:
https://git.zx2c4.com/linux-dev/commit/?h=rng-bench

Jason


Re: [kernel-hardening] Re: [PATCH 4/3] random: use siphash24 instead of md5 for get_random_int/long

2016-12-14 Thread Jason A. Donenfeld
Hi again,

On Wed, Dec 14, 2016 at 5:37 PM, Theodore Ts'o  wrote:
> [3.606139] random benchmark!!
> [3.606276] get_random_int # cycles: 326578
> [3.606317] get_random_int_new # cycles: 95438
> [3.607423] get_random_bytes # cycles: 2653388

Looks to me like my siphash implementation is much faster for
get_random_long, and more or less tied for get_random_int:

[1.729370] random benchmark!!
[1.729710] get_random_long # cycles: 349771
[1.730128] get_random_long_chacha # cycles: 359660
[1.730457] get_random_long_siphash # cycles: 94255
[1.731307] get_random_bytes # cycles: 1354894
[1.731707] get_random_int # cycles: 305640
[1.732095] get_random_int_chacha # cycles: 80726
[1.732425] get_random_int_siphash # cycles: 94265
[1.733278] get_random_bytes # cycles: 1315873

Given the increasing usage of get_random_long for ASLR and related, I
think this makes the siphash approach worth pursuing. The chacha
approach is also not significantly different from the md5 approach in
terms of speed for get_rand_long. Additionally, since siphash is a
PRF, I think this opens up a big window for optimizing it even
further.

Benchmark here:
https://git.zx2c4.com/linux-dev/commit/?h=rng-bench

Jason


Re: [kernel-hardening] [PATCH 3/4] Make static usermode helper binaries constant

2016-12-14 Thread Greg KH
On Wed, Dec 14, 2016 at 10:50:52AM -0800, Greg KH wrote:
> 
> There are a number of usermode helper binaries that are "hard coded" in
> the kernel today, so mark them as "const" to make it harder for someone
> to change where the variables point to.
> 
> Signed-off-by: Greg Kroah-Hartman 
> ---
>  drivers/macintosh/windfarm_core.c  | 2 +-
>  drivers/net/hamradio/baycom_epp.c  | 2 +-
>  drivers/pnp/pnpbios/core.c | 5 +++--
>  drivers/staging/greybus/svc_watchdog.c | 4 ++--
>  drivers/staging/rtl8192e/rtl8192e/rtl_dm.c | 6 +++---
>  fs/nfsd/nfs4layouts.c  | 6 --
>  security/keys/request_key.c| 7 ---
>  7 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/macintosh/windfarm_core.c 
> b/drivers/macintosh/windfarm_core.c
> index 465d770ab0bb..1b317cbb73cf 100644
> --- a/drivers/macintosh/windfarm_core.c
> +++ b/drivers/macintosh/windfarm_core.c
> @@ -74,7 +74,7 @@ static inline void wf_notify(int event, void *param)
>  
>  static int wf_critical_overtemp(void)
>  {
> - static char * critical_overtemp_path = "/sbin/critical_overtemp";
> + static const char * critical_overtemp_path = "/sbin/critical_overtemp";
>   char *argv[] = { critical_overtemp_path, NULL };
>   static char *envp[] = { "HOME=/",
>   "TERM=linux",

Doh, minor compiler warnings for this one, and others in this patch,
I'll fix them up...



Re: [kernel-hardening] [PATCH 3/4] Make static usermode helper binaries constant

2016-12-14 Thread Greg KH
On Wed, Dec 14, 2016 at 10:50:52AM -0800, Greg KH wrote:
> 
> There are a number of usermode helper binaries that are "hard coded" in
> the kernel today, so mark them as "const" to make it harder for someone
> to change where the variables point to.
> 
> Signed-off-by: Greg Kroah-Hartman 
> ---
>  drivers/macintosh/windfarm_core.c  | 2 +-
>  drivers/net/hamradio/baycom_epp.c  | 2 +-
>  drivers/pnp/pnpbios/core.c | 5 +++--
>  drivers/staging/greybus/svc_watchdog.c | 4 ++--
>  drivers/staging/rtl8192e/rtl8192e/rtl_dm.c | 6 +++---
>  fs/nfsd/nfs4layouts.c  | 6 --
>  security/keys/request_key.c| 7 ---
>  7 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/macintosh/windfarm_core.c 
> b/drivers/macintosh/windfarm_core.c
> index 465d770ab0bb..1b317cbb73cf 100644
> --- a/drivers/macintosh/windfarm_core.c
> +++ b/drivers/macintosh/windfarm_core.c
> @@ -74,7 +74,7 @@ static inline void wf_notify(int event, void *param)
>  
>  static int wf_critical_overtemp(void)
>  {
> - static char * critical_overtemp_path = "/sbin/critical_overtemp";
> + static const char * critical_overtemp_path = "/sbin/critical_overtemp";
>   char *argv[] = { critical_overtemp_path, NULL };
>   static char *envp[] = { "HOME=/",
>   "TERM=linux",

Doh, minor compiler warnings for this one, and others in this patch,
I'll fix them up...



Applied "ASoC: dwc: Fix PIO mode initialization" to the asoc tree

2016-12-14 Thread Mark Brown
The patch

   ASoC: dwc: Fix PIO mode initialization

has been applied to the asoc tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 6fce983f9b3ef51d47e647b2cff15049ef803781 Mon Sep 17 00:00:00 2001
From: Jose Abreu 
Date: Tue, 13 Dec 2016 11:03:49 +
Subject: [PATCH] ASoC: dwc: Fix PIO mode initialization

We can no longer rely on the return value of
devm_snd_dmaengine_pcm_register(...) to check if the DMA
handle is declared in the DT.

Previously this check activated PIO mode but currently
dma_request_chan returns either a valid channel or -EPROBE_DEFER.

In order to activate PIO mode check instead if the interrupt
line is declared. This reflects better what is documented in
the DT bindings (see Documentation/devicetree/bindings/sound/
designware-i2s.txt).

Also, initialize use_pio variable which was never being set
causing PIO mode to never work.

Signed-off-by: Jose Abreu 
Signed-off-by: Mark Brown 
---
 sound/soc/dwc/designware_i2s.c | 25 +++--
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/sound/soc/dwc/designware_i2s.c b/sound/soc/dwc/designware_i2s.c
index 2998954a1c74..bdf8398cbc81 100644
--- a/sound/soc/dwc/designware_i2s.c
+++ b/sound/soc/dwc/designware_i2s.c
@@ -681,22 +681,19 @@ static int dw_i2s_probe(struct platform_device *pdev)
}
 
if (!pdata) {
-   ret = devm_snd_dmaengine_pcm_register(>dev, NULL, 0);
-   if (ret == -EPROBE_DEFER) {
-   dev_err(>dev,
-   "failed to register PCM, deferring probe\n");
-   return ret;
-   } else if (ret) {
-   dev_err(>dev,
-   "Could not register DMA PCM: %d\n"
-   "falling back to PIO mode\n", ret);
+   if (irq >= 0) {
ret = dw_pcm_register(pdev);
-   if (ret) {
-   dev_err(>dev,
-   "Could not register PIO PCM: %d\n",
+   dev->use_pio = true;
+   } else {
+   ret = devm_snd_dmaengine_pcm_register(>dev, NULL,
+   0);
+   dev->use_pio = false;
+   }
+
+   if (ret) {
+   dev_err(>dev, "could not register pcm: %d\n",
ret);
-   goto err_clk_disable;
-   }
+   goto err_clk_disable;
}
}
 
-- 
2.11.0



Applied "ASoC: dwc: Fix PIO mode initialization" to the asoc tree

2016-12-14 Thread Mark Brown
The patch

   ASoC: dwc: Fix PIO mode initialization

has been applied to the asoc tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 6fce983f9b3ef51d47e647b2cff15049ef803781 Mon Sep 17 00:00:00 2001
From: Jose Abreu 
Date: Tue, 13 Dec 2016 11:03:49 +
Subject: [PATCH] ASoC: dwc: Fix PIO mode initialization

We can no longer rely on the return value of
devm_snd_dmaengine_pcm_register(...) to check if the DMA
handle is declared in the DT.

Previously this check activated PIO mode but currently
dma_request_chan returns either a valid channel or -EPROBE_DEFER.

In order to activate PIO mode check instead if the interrupt
line is declared. This reflects better what is documented in
the DT bindings (see Documentation/devicetree/bindings/sound/
designware-i2s.txt).

Also, initialize use_pio variable which was never being set
causing PIO mode to never work.

Signed-off-by: Jose Abreu 
Signed-off-by: Mark Brown 
---
 sound/soc/dwc/designware_i2s.c | 25 +++--
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/sound/soc/dwc/designware_i2s.c b/sound/soc/dwc/designware_i2s.c
index 2998954a1c74..bdf8398cbc81 100644
--- a/sound/soc/dwc/designware_i2s.c
+++ b/sound/soc/dwc/designware_i2s.c
@@ -681,22 +681,19 @@ static int dw_i2s_probe(struct platform_device *pdev)
}
 
if (!pdata) {
-   ret = devm_snd_dmaengine_pcm_register(>dev, NULL, 0);
-   if (ret == -EPROBE_DEFER) {
-   dev_err(>dev,
-   "failed to register PCM, deferring probe\n");
-   return ret;
-   } else if (ret) {
-   dev_err(>dev,
-   "Could not register DMA PCM: %d\n"
-   "falling back to PIO mode\n", ret);
+   if (irq >= 0) {
ret = dw_pcm_register(pdev);
-   if (ret) {
-   dev_err(>dev,
-   "Could not register PIO PCM: %d\n",
+   dev->use_pio = true;
+   } else {
+   ret = devm_snd_dmaengine_pcm_register(>dev, NULL,
+   0);
+   dev->use_pio = false;
+   }
+
+   if (ret) {
+   dev_err(>dev, "could not register pcm: %d\n",
ret);
-   goto err_clk_disable;
-   }
+   goto err_clk_disable;
}
}
 
-- 
2.11.0



Re: [PATCH v3 2/2] FPGA: Add TS-7300 FPGA manager

2016-12-14 Thread Florian Fainelli
On 12/14/2016 10:58 AM, Hartley Sweeten wrote:
> On Wednesday, December 14, 2016 11:55 AM, Florian Fainelli wrote:
>> My understanding is that, yes, this triggers the final write. You are
>> right that ts73xx_fpga_write() can be called multiple times. It sounds
>> like what my write_complete function does right now is just return that
>> we successfully completed the bistream write, but this snippet that you
>> are quoting should actually be moved into write_complete.
> 
> Florian,
> 
> I'm in the process of getting a TS-7300 board so I can help test this. 
> Hopefully
> I will have it by next week.

Great! I got a few things on my list that have not been submitted yet:

- tmp124 support through drivers/hwmon/lm70.c
- specific memcpy_{from,to}io accessors for ethoc from the FPGA
- serial port support for the UARTs from the FPGA

And some other things that are giving me issues at the moment, like
SPI_3WIRE support for spi-ep93xx so I can configure the tmp124 to send
alarms/have temperature thresholds.

My branch is here:

https://github.com/ffainelli/linux/tree/ts72xx

Cheers
-- 
Florian


Re: [PATCH v3 2/2] FPGA: Add TS-7300 FPGA manager

2016-12-14 Thread Florian Fainelli
On 12/14/2016 10:58 AM, Hartley Sweeten wrote:
> On Wednesday, December 14, 2016 11:55 AM, Florian Fainelli wrote:
>> My understanding is that, yes, this triggers the final write. You are
>> right that ts73xx_fpga_write() can be called multiple times. It sounds
>> like what my write_complete function does right now is just return that
>> we successfully completed the bistream write, but this snippet that you
>> are quoting should actually be moved into write_complete.
> 
> Florian,
> 
> I'm in the process of getting a TS-7300 board so I can help test this. 
> Hopefully
> I will have it by next week.

Great! I got a few things on my list that have not been submitted yet:

- tmp124 support through drivers/hwmon/lm70.c
- specific memcpy_{from,to}io accessors for ethoc from the FPGA
- serial port support for the UARTs from the FPGA

And some other things that are giving me issues at the moment, like
SPI_3WIRE support for spi-ep93xx so I can configure the tmp124 to send
alarms/have temperature thresholds.

My branch is here:

https://github.com/ffainelli/linux/tree/ts72xx

Cheers
-- 
Florian


[v3] net: ethernet: cavium: octeon: octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-14 Thread Arvind Yadav
Here, If devm_ioremap will fail. It will return NULL.
Kernel can run into a NULL-pointer dereference.
This error check will avoid NULL pointer dereference.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c 
b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 4ab404f..33c2fec 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1479,6 +1479,12 @@ static int octeon_mgmt_probe(struct platform_device 
*pdev)
p->agl = (u64)devm_ioremap(>dev, p->agl_phys, p->agl_size);
p->agl_prt_ctl = (u64)devm_ioremap(>dev, p->agl_prt_ctl_phys,
   p->agl_prt_ctl_size);
+   if (!p->mix || !p->agl || !p->agl_prt_ctl) {
+   dev_err(>dev, "failed to map I/O memory\n");
+   result = -ENOMEM;
+   goto err;
+   }
+
spin_lock_init(>lock);
 
skb_queue_head_init(>tx_list);
-- 
2.7.4



Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-14 Thread arvind Yadav

Hi,

As per your suggestion, I have change the subject.

Thanks

On Thursday 15 December 2016 12:24 AM, Florian Fainelli wrote:

On 12/14/2016 10:39 AM, arvind Yadav wrote:

Hi David,

I have gave my comment.

Thanks
Arvind

On Wednesday 14 December 2016 11:44 PM, David Daney wrote:

On 12/14/2016 10:06 AM, arvind Yadav wrote:

Yes, I have seen this error. We have a device with very less memory.
Basically it's OMAP2 board. We have to port Android L on this.
It's has 3.10 kernel version. In this device, we were getting Page
allocation failure.

This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a
ton of memory where ioremap can never fail, and it doesn't run
Android, and you are talking about OMAP2.

   -I just gave as example where i have seen ioremap issue.
Please don't relate. I know, Now it will not fail.  ioremap will through
NULL on failure. We should catch this error. Even other driver of MIPS
soc is having same check. It's just check which will not impact any
functionality or performance of this driver. It will avoid NULL pointer
error. We know, if  function is returning any error. we should catch.

Your patch subject should also be changed to insert spaces between
semicolon, so this would be:

net: ethernet: cavium: octeon: octeon_mgmt:




[v3] net: ethernet: cavium: octeon: octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-14 Thread Arvind Yadav
Here, If devm_ioremap will fail. It will return NULL.
Kernel can run into a NULL-pointer dereference.
This error check will avoid NULL pointer dereference.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c 
b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 4ab404f..33c2fec 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1479,6 +1479,12 @@ static int octeon_mgmt_probe(struct platform_device 
*pdev)
p->agl = (u64)devm_ioremap(>dev, p->agl_phys, p->agl_size);
p->agl_prt_ctl = (u64)devm_ioremap(>dev, p->agl_prt_ctl_phys,
   p->agl_prt_ctl_size);
+   if (!p->mix || !p->agl || !p->agl_prt_ctl) {
+   dev_err(>dev, "failed to map I/O memory\n");
+   result = -ENOMEM;
+   goto err;
+   }
+
spin_lock_init(>lock);
 
skb_queue_head_init(>tx_list);
-- 
2.7.4



Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-14 Thread arvind Yadav

Hi,

As per your suggestion, I have change the subject.

Thanks

On Thursday 15 December 2016 12:24 AM, Florian Fainelli wrote:

On 12/14/2016 10:39 AM, arvind Yadav wrote:

Hi David,

I have gave my comment.

Thanks
Arvind

On Wednesday 14 December 2016 11:44 PM, David Daney wrote:

On 12/14/2016 10:06 AM, arvind Yadav wrote:

Yes, I have seen this error. We have a device with very less memory.
Basically it's OMAP2 board. We have to port Android L on this.
It's has 3.10 kernel version. In this device, we were getting Page
allocation failure.

This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a
ton of memory where ioremap can never fail, and it doesn't run
Android, and you are talking about OMAP2.

   -I just gave as example where i have seen ioremap issue.
Please don't relate. I know, Now it will not fail.  ioremap will through
NULL on failure. We should catch this error. Even other driver of MIPS
soc is having same check. It's just check which will not impact any
functionality or performance of this driver. It will avoid NULL pointer
error. We know, if  function is returning any error. we should catch.

Your patch subject should also be changed to insert spaces between
semicolon, so this would be:

net: ethernet: cavium: octeon: octeon_mgmt:




Re: [PATCH 1/3] mm, trace: extract COMPACTION_STATUS and ZONE_TYPE to a common header

2016-12-14 Thread Michal Hocko
On Thu 15-12-16 01:32:06, kbuild test robot wrote:
> Hi Michal,
> 
> [auto build test ERROR on tip/perf/core]
> [also build test ERROR on v4.9 next-20161214]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-oom-add-oom-detection-tracepoints/20161214-231225
> config: x86_64-randconfig-s2-12142134 (attached as .config)
> compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64 
> 
> All error/warnings (new ones prefixed by >>):
> 
>In file included from include/trace/trace_events.h:361,
> from include/trace/define_trace.h:95,
> from include/trace/events/compaction.h:356,
> from mm/compaction.c:43:
>include/trace/events/compaction.h: In function 
> 'trace_raw_output_mm_compaction_end':
> >> include/trace/events/compaction.h:134: error: expected expression before 
> >> ',' token
>include/trace/events/compaction.h: In function 
> 'trace_raw_output_mm_compaction_suitable_template':
>include/trace/events/compaction.h:195: error: expected expression before 
> ',' token
> >> include/trace/events/compaction.h:195: warning: missing braces around 
> >> initializer
>include/trace/events/compaction.h:195: warning: (near initialization for 
> 'symbols[0]')
> >> include/trace/events/compaction.h:195: error: initializer element is not 
> >> constant
>include/trace/events/compaction.h:195: error: (near initialization for 
> 'symbols[0].mask')

Interesting. I am pretty sure that my config battery has
CONFIG_COMPACTION=n. Not sure which part of your config made a change.
Anyway, I've added to my collection. And with the below diff it passes
all my configs.
---
>From 921bf07b8684ded5f076904cc6baa875b52c3a1e Mon Sep 17 00:00:00 2001
From: Michal Hocko <mho...@suse.com>
Date: Wed, 14 Dec 2016 18:56:44 +0100
Subject: [PATCH] fold me "mm, trace: extract COMPACTION_STATUS and ZONE_TYPE
 to a common header"

0-day has reported:
   In file included from include/trace/trace_events.h:361,
from include/trace/define_trace.h:95,
from include/trace/events/compaction.h:356,
from mm/compaction.c:43:
   include/trace/events/compaction.h: In function 
'trace_raw_output_mm_compaction_end':
>> include/trace/events/compaction.h:134: error: expected expression before ',' 
>> token
   include/trace/events/compaction.h: In function 
'trace_raw_output_mm_compaction_suitable_template':
   include/trace/events/compaction.h:195: error: expected expression before ',' 
token
>> include/trace/events/compaction.h:195: warning: missing braces around 
>> initializer
   include/trace/events/compaction.h:195: warning: (near initialization for 
'symbols[0]')
>> include/trace/events/compaction.h:195: error: initializer element is not 
>> constant
   include/trace/events/compaction.h:195: error: (near initialization for 
'symbols[0].mask')

CONFIG_COMPACTION=n so COMPACTION_STATUS is not defined properly.

Signed-off-by: Michal Hocko <mho...@suse.com>
---
 include/trace/events/compaction.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/compaction.h 
b/include/trace/events/compaction.h
index 2334faa56323..0a18ab6483ff 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -131,6 +131,7 @@ TRACE_EVENT(mm_compaction_begin,
__entry->sync ? "sync" : "async")
 );
 
+#ifdef CONFIG_COMPACTION
 TRACE_EVENT(mm_compaction_end,
TP_PROTO(unsigned long zone_start, unsigned long migrate_pfn,
unsigned long free_pfn, unsigned long zone_end, bool sync,
@@ -164,6 +165,7 @@ TRACE_EVENT(mm_compaction_end,
__entry->sync ? "sync" : "async",
__print_symbolic(__entry->status, COMPACTION_STATUS))
 );
+#endif
 
 TRACE_EVENT(mm_compaction_try_to_compact_pages,
 
@@ -192,6 +194,7 @@ TRACE_EVENT(mm_compaction_try_to_compact_pages,
__entry->prio)
 );
 
+#ifdef CONFIG_COMPACTION
 DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
 
TP_PROTO(struct zone *zone,
@@ -239,7 +242,6 @@ DEFINE_EVENT(mm_compaction_suitable_template, 
mm_compaction_suitable,
TP_ARGS(zone, order, ret)
 );
 
-#ifdef CONFIG_COMPACTION
 DECLARE_EVENT_CLASS(mm_compaction_defer_template,
 
TP_PROTO(struct zone *zone, int order),
-- 
2.10.2

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/3] mm, trace: extract COMPACTION_STATUS and ZONE_TYPE to a common header

2016-12-14 Thread Michal Hocko
On Thu 15-12-16 01:32:06, kbuild test robot wrote:
> Hi Michal,
> 
> [auto build test ERROR on tip/perf/core]
> [also build test ERROR on v4.9 next-20161214]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-oom-add-oom-detection-tracepoints/20161214-231225
> config: x86_64-randconfig-s2-12142134 (attached as .config)
> compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64 
> 
> All error/warnings (new ones prefixed by >>):
> 
>In file included from include/trace/trace_events.h:361,
> from include/trace/define_trace.h:95,
> from include/trace/events/compaction.h:356,
> from mm/compaction.c:43:
>include/trace/events/compaction.h: In function 
> 'trace_raw_output_mm_compaction_end':
> >> include/trace/events/compaction.h:134: error: expected expression before 
> >> ',' token
>include/trace/events/compaction.h: In function 
> 'trace_raw_output_mm_compaction_suitable_template':
>include/trace/events/compaction.h:195: error: expected expression before 
> ',' token
> >> include/trace/events/compaction.h:195: warning: missing braces around 
> >> initializer
>include/trace/events/compaction.h:195: warning: (near initialization for 
> 'symbols[0]')
> >> include/trace/events/compaction.h:195: error: initializer element is not 
> >> constant
>include/trace/events/compaction.h:195: error: (near initialization for 
> 'symbols[0].mask')

Interesting. I am pretty sure that my config battery has
CONFIG_COMPACTION=n. Not sure which part of your config made a change.
Anyway, I've added to my collection. And with the below diff it passes
all my configs.
---
>From 921bf07b8684ded5f076904cc6baa875b52c3a1e Mon Sep 17 00:00:00 2001
From: Michal Hocko 
Date: Wed, 14 Dec 2016 18:56:44 +0100
Subject: [PATCH] fold me "mm, trace: extract COMPACTION_STATUS and ZONE_TYPE
 to a common header"

0-day has reported:
   In file included from include/trace/trace_events.h:361,
from include/trace/define_trace.h:95,
from include/trace/events/compaction.h:356,
from mm/compaction.c:43:
   include/trace/events/compaction.h: In function 
'trace_raw_output_mm_compaction_end':
>> include/trace/events/compaction.h:134: error: expected expression before ',' 
>> token
   include/trace/events/compaction.h: In function 
'trace_raw_output_mm_compaction_suitable_template':
   include/trace/events/compaction.h:195: error: expected expression before ',' 
token
>> include/trace/events/compaction.h:195: warning: missing braces around 
>> initializer
   include/trace/events/compaction.h:195: warning: (near initialization for 
'symbols[0]')
>> include/trace/events/compaction.h:195: error: initializer element is not 
>> constant
   include/trace/events/compaction.h:195: error: (near initialization for 
'symbols[0].mask')

CONFIG_COMPACTION=n so COMPACTION_STATUS is not defined properly.

Signed-off-by: Michal Hocko 
---
 include/trace/events/compaction.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/compaction.h 
b/include/trace/events/compaction.h
index 2334faa56323..0a18ab6483ff 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -131,6 +131,7 @@ TRACE_EVENT(mm_compaction_begin,
__entry->sync ? "sync" : "async")
 );
 
+#ifdef CONFIG_COMPACTION
 TRACE_EVENT(mm_compaction_end,
TP_PROTO(unsigned long zone_start, unsigned long migrate_pfn,
unsigned long free_pfn, unsigned long zone_end, bool sync,
@@ -164,6 +165,7 @@ TRACE_EVENT(mm_compaction_end,
__entry->sync ? "sync" : "async",
__print_symbolic(__entry->status, COMPACTION_STATUS))
 );
+#endif
 
 TRACE_EVENT(mm_compaction_try_to_compact_pages,
 
@@ -192,6 +194,7 @@ TRACE_EVENT(mm_compaction_try_to_compact_pages,
__entry->prio)
 );
 
+#ifdef CONFIG_COMPACTION
 DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
 
TP_PROTO(struct zone *zone,
@@ -239,7 +242,6 @@ DEFINE_EVENT(mm_compaction_suitable_template, 
mm_compaction_suitable,
TP_ARGS(zone, order, ret)
 );
 
-#ifdef CONFIG_COMPACTION
 DECLARE_EVENT_CLASS(mm_compaction_defer_template,
 
TP_PROTO(struct zone *zone, int order),
-- 
2.10.2

-- 
Michal Hocko
SUSE Labs


Re: Question about regulator API

2016-12-14 Thread Mark Brown
On Wed, Dec 14, 2016 at 03:52:54PM +0100, Harald Geyer wrote:

> Thus the following constraints should be met:
> * When user space asks the driver to read a device, it needs to "claim"
>   the supply and ensure that it has been up for at least 2 seconds
>   before proceeding to read the HW. The supply would be "locked" enabled.
>   (I think this is standard regulator API.)
> * When HW failure is detected, the driver needs to tell the supply to
>   turn off for at least 2 seconds, wait for all other devices to release
>   the supply so it can actually be turned off, then wait for the off period,
>   then start over again.
>   (This is easy only with getting the supply exclusively.)

You need to use a notification to figure out when the supply is actually
off.

> * It should be possible to read multiple devices quickly when everything
>   is okay and working. (Having 6 2-second delays accumulate would be quite
>   annoying.)
>   (This won't work with exclusive supply usage.)

Similarly using a notification to discover when the supply is on would
help here.

> * Optionally: If all devices are idle the supply would be enabled if short
>   response time is desired, but disabled if power saving is desired.
> 
> Can this somehow be solved with the existing API?
> If not, do you think it would be reasonable/possible to extend the API
> to cover situations like the one described above?

This doesn't feel like a regulator API problem exactly, a lot of what
you're talking about here seems like you really need the devices to
coopereate with each other and know what they're doing in order to work
well together.  From a regulator API point of view my first thought is
to use notifications to hook into the actual power on/off transitions
and then expressing the bit where the devices all work together at a
higher level.  It may be that a lot of that higher level coordination
just falls out of normal usage patterns do doesn't need explicitly
implementing, assuming the devices are only powered up during reads
anyway.  Using regulator_disable_deferred() in the driver may help
grease the wheels in terms of avoiding needless power bounces and
delays - just wait a little while before powering down in case you need
to power up again very soon after.

You'd end up with the devices all ignoring each other but keeping track
of when the supply was last enabled and disabled and individually
keeping timers to make sure that the needed delays are taken care of.
Userspace would then turn up and read all the devices, they'd then do
the enables and disables as though they were working alone but
coordinate through the notifications.  If device A powered things up
then device B would know the power was already on via the notifications
and when it came on so could take account of that when doing its own
delays.


signature.asc
Description: PGP signature


Re: Question about regulator API

2016-12-14 Thread Mark Brown
On Wed, Dec 14, 2016 at 03:52:54PM +0100, Harald Geyer wrote:

> Thus the following constraints should be met:
> * When user space asks the driver to read a device, it needs to "claim"
>   the supply and ensure that it has been up for at least 2 seconds
>   before proceeding to read the HW. The supply would be "locked" enabled.
>   (I think this is standard regulator API.)
> * When HW failure is detected, the driver needs to tell the supply to
>   turn off for at least 2 seconds, wait for all other devices to release
>   the supply so it can actually be turned off, then wait for the off period,
>   then start over again.
>   (This is easy only with getting the supply exclusively.)

You need to use a notification to figure out when the supply is actually
off.

> * It should be possible to read multiple devices quickly when everything
>   is okay and working. (Having 6 2-second delays accumulate would be quite
>   annoying.)
>   (This won't work with exclusive supply usage.)

Similarly using a notification to discover when the supply is on would
help here.

> * Optionally: If all devices are idle the supply would be enabled if short
>   response time is desired, but disabled if power saving is desired.
> 
> Can this somehow be solved with the existing API?
> If not, do you think it would be reasonable/possible to extend the API
> to cover situations like the one described above?

This doesn't feel like a regulator API problem exactly, a lot of what
you're talking about here seems like you really need the devices to
coopereate with each other and know what they're doing in order to work
well together.  From a regulator API point of view my first thought is
to use notifications to hook into the actual power on/off transitions
and then expressing the bit where the devices all work together at a
higher level.  It may be that a lot of that higher level coordination
just falls out of normal usage patterns do doesn't need explicitly
implementing, assuming the devices are only powered up during reads
anyway.  Using regulator_disable_deferred() in the driver may help
grease the wheels in terms of avoiding needless power bounces and
delays - just wait a little while before powering down in case you need
to power up again very soon after.

You'd end up with the devices all ignoring each other but keeping track
of when the supply was last enabled and disabled and individually
keeping timers to make sure that the needed delays are taken care of.
Userspace would then turn up and read all the devices, they'd then do
the enables and disables as though they were working alone but
coordinate through the notifications.  If device A powered things up
then device B would know the power was already on via the notifications
and when it came on so could take account of that when doing its own
delays.


signature.asc
Description: PGP signature


Re: [PATCH v2 3/3] kvm: svm: Use the hardware provided GPA instead of page walk

2016-12-14 Thread Brijesh Singh


On 12/14/2016 11:23 AM, Paolo Bonzini wrote:



On 14/12/2016 18:07, Brijesh Singh wrote:




Since now we are going to perform multiple conditional checks before
concluding that its safe to use HW provided GPA. How about if we add two
functions "emulator_is_rep_string_op" and "emulator_is_two_mem_op" into
emulator.c and  use these functions inside the x86.c to determine if its
safe to use HW provided gpa?


Why not export only emulator_can_use_gpa from emulate.c?  (So in the end
leaving emulator_is_string_op in emulate.c was the right thing to do, it
was just the test that was wrong :)).



Actually, I was not sure if putting emulator_can_use_gpa() in emulate.c 
was right thing - mainly because emulator.c does not deal with GPA. I 
will go with your advice and put it in emulator.c, it makes easy :)




The patch below is still missing the check for cross-page MMIO.  Your
reference to the BKDG only covers MMCONFIG (sometimes referred to as
ECAM), not MMIO in general.  Doing AND or OR into video memory for
example is perfectly legal, and I'm fairly sure that some obscure legacy
software does PUSH/POP into vram as well!




I used your below code snippet to detect cross-page MMIO access. After 
applying these changes cross-page MMIO read/write unit test is passing 
just fine. I will include it in patch.


> Actually there is a nice trick you can do to support cross-page
> MMIO access detection:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 37cd31645d45..754d251dc611 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4549,6 +4549,7 @@ static int emulator_read_write_onepage(unsigned 
long addr, void *val,

 */
if (vcpu->arch.gpa_available &&
!emulator_can_use_hw_gpa(ctxt) &&
+   (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK) &&
vcpu_is_mmio_gpa(vcpu, addr, exception->address, write)) {
gpa = exception->address;
goto mmio;



Re: [PATCH v2 3/3] kvm: svm: Use the hardware provided GPA instead of page walk

2016-12-14 Thread Brijesh Singh


On 12/14/2016 11:23 AM, Paolo Bonzini wrote:



On 14/12/2016 18:07, Brijesh Singh wrote:




Since now we are going to perform multiple conditional checks before
concluding that its safe to use HW provided GPA. How about if we add two
functions "emulator_is_rep_string_op" and "emulator_is_two_mem_op" into
emulator.c and  use these functions inside the x86.c to determine if its
safe to use HW provided gpa?


Why not export only emulator_can_use_gpa from emulate.c?  (So in the end
leaving emulator_is_string_op in emulate.c was the right thing to do, it
was just the test that was wrong :)).



Actually, I was not sure if putting emulator_can_use_gpa() in emulate.c 
was right thing - mainly because emulator.c does not deal with GPA. I 
will go with your advice and put it in emulator.c, it makes easy :)




The patch below is still missing the check for cross-page MMIO.  Your
reference to the BKDG only covers MMCONFIG (sometimes referred to as
ECAM), not MMIO in general.  Doing AND or OR into video memory for
example is perfectly legal, and I'm fairly sure that some obscure legacy
software does PUSH/POP into vram as well!




I used your below code snippet to detect cross-page MMIO access. After 
applying these changes cross-page MMIO read/write unit test is passing 
just fine. I will include it in patch.


> Actually there is a nice trick you can do to support cross-page
> MMIO access detection:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 37cd31645d45..754d251dc611 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4549,6 +4549,7 @@ static int emulator_read_write_onepage(unsigned 
long addr, void *val,

 */
if (vcpu->arch.gpa_available &&
!emulator_can_use_hw_gpa(ctxt) &&
+   (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK) &&
vcpu_is_mmio_gpa(vcpu, addr, exception->address, write)) {
gpa = exception->address;
goto mmio;



Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-14 Thread Florian Fainelli
On 12/14/2016 10:39 AM, arvind Yadav wrote:
> Hi David,
> 
> I have gave my comment.
> 
> Thanks
> Arvind
> 
> On Wednesday 14 December 2016 11:44 PM, David Daney wrote:
>> On 12/14/2016 10:06 AM, arvind Yadav wrote:
>>> Yes, I have seen this error. We have a device with very less memory.
>>> Basically it's OMAP2 board. We have to port Android L on this.
>>> It's has 3.10 kernel version. In this device, we were getting Page
>>> allocation failure.
>>
>> This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a
>> ton of memory where ioremap can never fail, and it doesn't run
>> Android, and you are talking about OMAP2.
>   -I just gave as example where i have seen ioremap issue.
> Please don't relate. I know, Now it will not fail.  ioremap will through
> NULL on failure. We should catch this error. Even other driver of MIPS
> soc is having same check. It's just check which will not impact any
> functionality or performance of this driver. It will avoid NULL pointer
> error. We know, if  function is returning any error. we should catch.

Your patch subject should also be changed to insert spaces between
semicolon, so this would be:

net: ethernet: cavium: octeon: octeon_mgmt:
-- 
Florian


Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-14 Thread Florian Fainelli
On 12/14/2016 10:39 AM, arvind Yadav wrote:
> Hi David,
> 
> I have gave my comment.
> 
> Thanks
> Arvind
> 
> On Wednesday 14 December 2016 11:44 PM, David Daney wrote:
>> On 12/14/2016 10:06 AM, arvind Yadav wrote:
>>> Yes, I have seen this error. We have a device with very less memory.
>>> Basically it's OMAP2 board. We have to port Android L on this.
>>> It's has 3.10 kernel version. In this device, we were getting Page
>>> allocation failure.
>>
>> This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a
>> ton of memory where ioremap can never fail, and it doesn't run
>> Android, and you are talking about OMAP2.
>   -I just gave as example where i have seen ioremap issue.
> Please don't relate. I know, Now it will not fail.  ioremap will through
> NULL on failure. We should catch this error. Even other driver of MIPS
> soc is having same check. It's just check which will not impact any
> functionality or performance of this driver. It will avoid NULL pointer
> error. We know, if  function is returning any error. we should catch.

Your patch subject should also be changed to insert spaces between
semicolon, so this would be:

net: ethernet: cavium: octeon: octeon_mgmt:
-- 
Florian


Re: [GIT PULL] ext4 updates for 4.10

2016-12-14 Thread Johannes Weiner
On Wed, Dec 14, 2016 at 09:20:43AM -0800, Linus Torvalds wrote:
> [ Johannes added to participants due to radix tree changes ]
> 
> On Tue, Dec 13, 2016 at 12:00 PM, Theodore Ts'o  wrote:
> >
> > This merge request includes the dax-4.0-iomap-pmd branch which is
> > needed for both ext4 and xfs dax changes to use iomap for DAX.  It
> > also includes the fscrypt branch which is needed for ubifs encryption
> > work as well as ext4 encryption and fscrypt cleanups.
> 
> Can you double-check my merge resolution with the radix tree changes
> wrt the DAX changes, please? It looked straightforward, but it's an
> area where I'd really like people to actually look again and test..

The radix tree bits look correct to me.


Re: [GIT PULL] ext4 updates for 4.10

2016-12-14 Thread Johannes Weiner
On Wed, Dec 14, 2016 at 09:20:43AM -0800, Linus Torvalds wrote:
> [ Johannes added to participants due to radix tree changes ]
> 
> On Tue, Dec 13, 2016 at 12:00 PM, Theodore Ts'o  wrote:
> >
> > This merge request includes the dax-4.0-iomap-pmd branch which is
> > needed for both ext4 and xfs dax changes to use iomap for DAX.  It
> > also includes the fscrypt branch which is needed for ubifs encryption
> > work as well as ext4 encryption and fscrypt cleanups.
> 
> Can you double-check my merge resolution with the radix tree changes
> wrt the DAX changes, please? It looked straightforward, but it's an
> area where I'd really like people to actually look again and test..

The radix tree bits look correct to me.


[PATCH 2/4] drbd: rename "usermode_helper" to "drbd_usermode_helper"

2016-12-14 Thread Greg KH

Nothing like having a very generic global variable in a tiny driver
subsystem to make a mess of the global namespace...

Anyway, clean it up in anticipation of making drbd_usermode_helper
read-only in a future patch.

Note, there are many other "generic" named global variables in the drbd
subsystem, someone should fix those up one day before they hit a linking
error.

Signed-off-by: Greg Kroah-Hartman 
---
 drivers/block/drbd/drbd_int.h  |  2 +-
 drivers/block/drbd/drbd_main.c |  4 ++--
 drivers/block/drbd/drbd_nl.c   | 20 ++--
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 4cb8f21ff4ef..a139a34f1f1e 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -75,7 +75,7 @@ extern int fault_rate;
 extern int fault_devs;
 #endif
 
-extern char usermode_helper[];
+extern char drbd_usermode_helper[];
 
 
 /* This is used to stop/restart our threads.
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 83482721bc01..8f51eccc8de7 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -108,9 +108,9 @@ int proc_details;   /* Detail level in proc drbd*/
 
 /* Module parameter for setting the user mode helper program
  * to run. Default is /sbin/drbdadm */
-char usermode_helper[80] = "/sbin/drbdadm";
+char drbd_usermode_helper[80] = "/sbin/drbdadm";
 
-module_param_string(usermode_helper, usermode_helper, sizeof(usermode_helper), 
0644);
+module_param_string(usermode_helper, drbd_usermode_helper, 
sizeof(drbd_usermode_helper), 0644);
 
 /* in 2.6.x, our device mapping and config info contains our virtual gendisks
  * as member "struct gendisk *vdisk;"
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index f35db29cac76..9edc6fb95f19 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -344,7 +344,7 @@ int drbd_khelper(struct drbd_device *device, char *cmd)
 (char[60]) { }, /* address */
NULL };
char mb[14];
-   char *argv[] = {usermode_helper, cmd, mb, NULL };
+   char *argv[] = {drbd_usermode_helper, cmd, mb, NULL };
struct drbd_connection *connection = 
first_peer_device(device)->connection;
struct sib_info sib;
int ret;
@@ -359,19 +359,19 @@ int drbd_khelper(struct drbd_device *device, char *cmd)
 * write out any unsynced meta data changes now */
drbd_md_sync(device);
 
-   drbd_info(device, "helper command: %s %s %s\n", usermode_helper, cmd, 
mb);
+   drbd_info(device, "helper command: %s %s %s\n", drbd_usermode_helper, 
cmd, mb);
sib.sib_reason = SIB_HELPER_PRE;
sib.helper_name = cmd;
drbd_bcast_event(device, );
notify_helper(NOTIFY_CALL, device, connection, cmd, 0);
-   ret = call_usermodehelper(usermode_helper, argv, envp, UMH_WAIT_PROC);
+   ret = call_usermodehelper(drbd_usermode_helper, argv, envp, 
UMH_WAIT_PROC);
if (ret)
drbd_warn(device, "helper command: %s %s %s exit code %u 
(0x%x)\n",
-   usermode_helper, cmd, mb,
+   drbd_usermode_helper, cmd, mb,
(ret >> 8) & 0xff, ret);
else
drbd_info(device, "helper command: %s %s %s exit code %u 
(0x%x)\n",
-   usermode_helper, cmd, mb,
+   drbd_usermode_helper, cmd, mb,
(ret >> 8) & 0xff, ret);
sib.sib_reason = SIB_HELPER_POST;
sib.helper_exit_code = ret;
@@ -396,24 +396,24 @@ enum drbd_peer_state conn_khelper(struct drbd_connection 
*connection, char *cmd)
 (char[60]) { }, /* address */
NULL };
char *resource_name = connection->resource->name;
-   char *argv[] = {usermode_helper, cmd, resource_name, NULL };
+   char *argv[] = {drbd_usermode_helper, cmd, resource_name, NULL };
int ret;
 
setup_khelper_env(connection, envp);
conn_md_sync(connection);
 
-   drbd_info(connection, "helper command: %s %s %s\n", usermode_helper, 
cmd, resource_name);
+   drbd_info(connection, "helper command: %s %s %s\n", 
drbd_usermode_helper, cmd, resource_name);
/* TODO: conn_bcast_event() ?? */
notify_helper(NOTIFY_CALL, NULL, connection, cmd, 0);
 
-   ret = call_usermodehelper(usermode_helper, argv, envp, UMH_WAIT_PROC);
+   ret = call_usermodehelper(drbd_usermode_helper, argv, envp, 
UMH_WAIT_PROC);
if (ret)
drbd_warn(connection, "helper command: %s %s %s exit code %u 
(0x%x)\n",
- usermode_helper, cmd, resource_name,
+ drbd_usermode_helper, cmd, resource_name,
  (ret >> 8) & 0xff, ret);
else

[PATCH 2/4] drbd: rename "usermode_helper" to "drbd_usermode_helper"

2016-12-14 Thread Greg KH

Nothing like having a very generic global variable in a tiny driver
subsystem to make a mess of the global namespace...

Anyway, clean it up in anticipation of making drbd_usermode_helper
read-only in a future patch.

Note, there are many other "generic" named global variables in the drbd
subsystem, someone should fix those up one day before they hit a linking
error.

Signed-off-by: Greg Kroah-Hartman 
---
 drivers/block/drbd/drbd_int.h  |  2 +-
 drivers/block/drbd/drbd_main.c |  4 ++--
 drivers/block/drbd/drbd_nl.c   | 20 ++--
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 4cb8f21ff4ef..a139a34f1f1e 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -75,7 +75,7 @@ extern int fault_rate;
 extern int fault_devs;
 #endif
 
-extern char usermode_helper[];
+extern char drbd_usermode_helper[];
 
 
 /* This is used to stop/restart our threads.
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 83482721bc01..8f51eccc8de7 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -108,9 +108,9 @@ int proc_details;   /* Detail level in proc drbd*/
 
 /* Module parameter for setting the user mode helper program
  * to run. Default is /sbin/drbdadm */
-char usermode_helper[80] = "/sbin/drbdadm";
+char drbd_usermode_helper[80] = "/sbin/drbdadm";
 
-module_param_string(usermode_helper, usermode_helper, sizeof(usermode_helper), 
0644);
+module_param_string(usermode_helper, drbd_usermode_helper, 
sizeof(drbd_usermode_helper), 0644);
 
 /* in 2.6.x, our device mapping and config info contains our virtual gendisks
  * as member "struct gendisk *vdisk;"
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index f35db29cac76..9edc6fb95f19 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -344,7 +344,7 @@ int drbd_khelper(struct drbd_device *device, char *cmd)
 (char[60]) { }, /* address */
NULL };
char mb[14];
-   char *argv[] = {usermode_helper, cmd, mb, NULL };
+   char *argv[] = {drbd_usermode_helper, cmd, mb, NULL };
struct drbd_connection *connection = 
first_peer_device(device)->connection;
struct sib_info sib;
int ret;
@@ -359,19 +359,19 @@ int drbd_khelper(struct drbd_device *device, char *cmd)
 * write out any unsynced meta data changes now */
drbd_md_sync(device);
 
-   drbd_info(device, "helper command: %s %s %s\n", usermode_helper, cmd, 
mb);
+   drbd_info(device, "helper command: %s %s %s\n", drbd_usermode_helper, 
cmd, mb);
sib.sib_reason = SIB_HELPER_PRE;
sib.helper_name = cmd;
drbd_bcast_event(device, );
notify_helper(NOTIFY_CALL, device, connection, cmd, 0);
-   ret = call_usermodehelper(usermode_helper, argv, envp, UMH_WAIT_PROC);
+   ret = call_usermodehelper(drbd_usermode_helper, argv, envp, 
UMH_WAIT_PROC);
if (ret)
drbd_warn(device, "helper command: %s %s %s exit code %u 
(0x%x)\n",
-   usermode_helper, cmd, mb,
+   drbd_usermode_helper, cmd, mb,
(ret >> 8) & 0xff, ret);
else
drbd_info(device, "helper command: %s %s %s exit code %u 
(0x%x)\n",
-   usermode_helper, cmd, mb,
+   drbd_usermode_helper, cmd, mb,
(ret >> 8) & 0xff, ret);
sib.sib_reason = SIB_HELPER_POST;
sib.helper_exit_code = ret;
@@ -396,24 +396,24 @@ enum drbd_peer_state conn_khelper(struct drbd_connection 
*connection, char *cmd)
 (char[60]) { }, /* address */
NULL };
char *resource_name = connection->resource->name;
-   char *argv[] = {usermode_helper, cmd, resource_name, NULL };
+   char *argv[] = {drbd_usermode_helper, cmd, resource_name, NULL };
int ret;
 
setup_khelper_env(connection, envp);
conn_md_sync(connection);
 
-   drbd_info(connection, "helper command: %s %s %s\n", usermode_helper, 
cmd, resource_name);
+   drbd_info(connection, "helper command: %s %s %s\n", 
drbd_usermode_helper, cmd, resource_name);
/* TODO: conn_bcast_event() ?? */
notify_helper(NOTIFY_CALL, NULL, connection, cmd, 0);
 
-   ret = call_usermodehelper(usermode_helper, argv, envp, UMH_WAIT_PROC);
+   ret = call_usermodehelper(drbd_usermode_helper, argv, envp, 
UMH_WAIT_PROC);
if (ret)
drbd_warn(connection, "helper command: %s %s %s exit code %u 
(0x%x)\n",
- usermode_helper, cmd, resource_name,
+ drbd_usermode_helper, cmd, resource_name,
  (ret >> 8) & 0xff, ret);
else
drbd_info(connection, 

[RFC 4/4] Introduce CONFIG_READONLY_USERMODEHELPER

2016-12-14 Thread Greg KH
If you can write to kernel memory, an "easy" way to get the kernel to
run any application is to change the pointer of one of the usermode
helper program names.  To try to mitigate this, create a new config
option, CONFIG_READONLY_USERMODEHELPER.

This option only allows "predefined" binaries to be called.  A number of
drivers and subsystems allow for the name of the binary to be changed,
and this config option disables that capability, so be aware of that.

Note:  Still a proof-of-concept at this point in time, doesn't cover all
of the call_usermodehelper() calls just yet, including the "fun" of
coredumps, it's still a work in progress.

Not-Signed-off-by: Greg Kroah-Hartman 
---
 arch/x86/kernel/cpu/mcheck/mce.c | 12 
 drivers/block/drbd/drbd_int.h|  6 +-
 drivers/block/drbd/drbd_main.c   |  5 +
 drivers/video/fbdev/uvesafb.c| 19 ++-
 fs/nfs/cache_lib.c   | 12 ++--
 include/linux/reboot.h   |  2 ++
 kernel/ksysfs.c  |  6 +-
 kernel/reboot.c  |  3 +++
 kernel/sysctl.c  |  4 
 lib/kobject_uevent.c |  3 +++
 security/Kconfig | 17 +
 11 files changed, 76 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 00ef43233e03..92a2ef8ffe3e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2337,15 +2337,16 @@ static ssize_t set_bank(struct device *s, struct 
device_attribute *attr,
 }
 
 static ssize_t
-show_trigger(struct device *s, struct device_attribute *attr, char *buf)
+trigger_show(struct device *s, struct device_attribute *attr, char *buf)
 {
strcpy(buf, mce_helper);
strcat(buf, "\n");
return strlen(mce_helper) + 1;
 }
 
-static ssize_t set_trigger(struct device *s, struct device_attribute *attr,
-   const char *buf, size_t siz)
+#ifndef CONFIG_READONLY_USERMODEHELPER
+static ssize_t trigger_store(struct device *s, struct device_attribute *attr,
+const char *buf, size_t siz)
 {
char *p;
 
@@ -2358,6 +2359,10 @@ static ssize_t set_trigger(struct device *s, struct 
device_attribute *attr,
 
return strlen(mce_helper) + !!p;
 }
+static DEVICE_ATTR_RW(trigger);
+#else
+static DEVICE_ATTR_RO(trigger);
+#endif
 
 static ssize_t set_ignore_ce(struct device *s,
 struct device_attribute *attr,
@@ -2415,7 +2420,6 @@ static ssize_t store_int_with_restart(struct device *s,
return ret;
 }
 
-static DEVICE_ATTR(trigger, 0644, show_trigger, set_trigger);
 static DEVICE_INT_ATTR(tolerant, 0644, mca_cfg.tolerant);
 static DEVICE_INT_ATTR(monarch_timeout, 0644, mca_cfg.monarch_timeout);
 static DEVICE_BOOL_ATTR(dont_log_ce, 0644, mca_cfg.dont_log_ce);
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index a139a34f1f1e..e21ab2bcc482 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -75,7 +75,11 @@ extern int fault_rate;
 extern int fault_devs;
 #endif
 
-extern char drbd_usermode_helper[];
+extern
+#ifdef CONFIG_READONLY_USERMODEHELPER
+   const
+#endif
+ char drbd_usermode_helper[];
 
 
 /* This is used to stop/restart our threads.
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 8f51eccc8de7..41c988e9cdf2 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -108,9 +108,14 @@ int proc_details;   /* Detail level in proc drbd*/
 
 /* Module parameter for setting the user mode helper program
  * to run. Default is /sbin/drbdadm */
+#ifdef CONFIG_READONLY_USERMODEHELPER
+const
+#endif
 char drbd_usermode_helper[80] = "/sbin/drbdadm";
 
+#ifndef CONFIG_READONLY_USERMODEHELPER
 module_param_string(usermode_helper, drbd_usermode_helper, 
sizeof(drbd_usermode_helper), 0644);
+#endif
 
 /* in 2.6.x, our device mapping and config info contains our virtual gendisks
  * as member "struct gendisk *vdisk;"
diff --git a/drivers/video/fbdev/uvesafb.c b/drivers/video/fbdev/uvesafb.c
index 98af9e02959b..0328d70a4afb 100644
--- a/drivers/video/fbdev/uvesafb.c
+++ b/drivers/video/fbdev/uvesafb.c
@@ -30,7 +30,11 @@ static struct cb_id uvesafb_cn_id = {
.idx = CN_IDX_V86D,
.val = CN_VAL_V86D_UVESAFB
 };
+#ifdef CONFIG_READONLY_USERMODEHELPER
+static const char v86d_path[PATH_MAX] = "/sbin/v86d";
+#else
 static char v86d_path[PATH_MAX] = "/sbin/v86d";
+#endif
 static char v86d_started;  /* has v86d been started by uvesafb? */
 
 static const struct fb_fix_screeninfo uvesafb_fix = {
@@ -114,7 +118,7 @@ static int uvesafb_helper_start(void)
};
 
char *argv[] = {
-   v86d_path,
+   (char *)v86d_path,
NULL,
};
 
@@ -1883,19 +1887,22 @@ static int uvesafb_setup(char *options)
 }
 #endif /* !MODULE */
 

[PATCH 3/4] Make static usermode helper binaries constant

2016-12-14 Thread Greg KH

There are a number of usermode helper binaries that are "hard coded" in
the kernel today, so mark them as "const" to make it harder for someone
to change where the variables point to.

Signed-off-by: Greg Kroah-Hartman 
---
 drivers/macintosh/windfarm_core.c  | 2 +-
 drivers/net/hamradio/baycom_epp.c  | 2 +-
 drivers/pnp/pnpbios/core.c | 5 +++--
 drivers/staging/greybus/svc_watchdog.c | 4 ++--
 drivers/staging/rtl8192e/rtl8192e/rtl_dm.c | 6 +++---
 fs/nfsd/nfs4layouts.c  | 6 --
 security/keys/request_key.c| 7 ---
 7 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/macintosh/windfarm_core.c 
b/drivers/macintosh/windfarm_core.c
index 465d770ab0bb..1b317cbb73cf 100644
--- a/drivers/macintosh/windfarm_core.c
+++ b/drivers/macintosh/windfarm_core.c
@@ -74,7 +74,7 @@ static inline void wf_notify(int event, void *param)
 
 static int wf_critical_overtemp(void)
 {
-   static char * critical_overtemp_path = "/sbin/critical_overtemp";
+   static const char * critical_overtemp_path = "/sbin/critical_overtemp";
char *argv[] = { critical_overtemp_path, NULL };
static char *envp[] = { "HOME=/",
"TERM=linux",
diff --git a/drivers/net/hamradio/baycom_epp.c 
b/drivers/net/hamradio/baycom_epp.c
index 78dbc44540f6..321cffa8dbe4 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -299,7 +299,7 @@ static inline void baycom_int_freq(struct baycom_state *bc)
  *eppconfig_path should be setable  via /proc/sys.
  */
 
-static char eppconfig_path[256] = "/usr/sbin/eppfpga";
+static const char eppconfig_path[256] = "/usr/sbin/eppfpga";
 
 static char *envp[] = { "HOME=/", "TERM=linux", "PATH=/usr/bin:/bin", NULL };
 
diff --git a/drivers/pnp/pnpbios/core.c b/drivers/pnp/pnpbios/core.c
index c38a5b9733c8..614aae6fcc0f 100644
--- a/drivers/pnp/pnpbios/core.c
+++ b/drivers/pnp/pnpbios/core.c
@@ -98,6 +98,7 @@ static struct completion unload_sem;
  */
 static int pnp_dock_event(int dock, struct pnp_docking_station_info *info)
 {
+   static const char *sbin_pnpbios = "/sbin/pnpbios";
char *argv[3], **envp, *buf, *scratch;
int i = 0, value;
 
@@ -112,7 +113,7 @@ static int pnp_dock_event(int dock, struct 
pnp_docking_station_info *info)
 * integrated into the driver core and use the usual infrastructure
 * like sysfs and uevents
 */
-   argv[0] = "/sbin/pnpbios";
+   argv[0] = sbin_pnpbios;
argv[1] = "dock";
argv[2] = NULL;
 
@@ -139,7 +140,7 @@ static int pnp_dock_event(int dock, struct 
pnp_docking_station_info *info)
   info->location_id, info->serial, info->capabilities);
envp[i] = NULL;
 
-   value = call_usermodehelper(argv [0], argv, envp, UMH_WAIT_EXEC);
+   value = call_usermodehelper(sbin_pnpbios, argv, envp, UMH_WAIT_EXEC);
kfree(buf);
kfree(envp);
return 0;
diff --git a/drivers/staging/greybus/svc_watchdog.c 
b/drivers/staging/greybus/svc_watchdog.c
index 3729460fb954..db32ec0f0e80 100644
--- a/drivers/staging/greybus/svc_watchdog.c
+++ b/drivers/staging/greybus/svc_watchdog.c
@@ -44,14 +44,14 @@ static int svc_watchdog_pm_notifier(struct notifier_block 
*notifier,
 
 static void greybus_reset(struct work_struct *work)
 {
-   static char start_path[256] = "/system/bin/start";
+   static const char start_path[256] = "/system/bin/start";
static char *envp[] = {
"HOME=/",
"PATH=/sbin:/vendor/bin:/system/sbin:/system/bin:/system/xbin",
NULL,
};
static char *argv[] = {
-   start_path,
+   (char *)start_path,
"unipro_reset",
NULL,
};
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
index 9bc284812c30..5f0c2cdf32d1 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
@@ -268,7 +268,7 @@ void rtl92e_dm_watchdog(struct net_device *dev)
 static void _rtl92e_dm_check_ac_dc_power(struct net_device *dev)
 {
struct r8192_priv *priv = rtllib_priv(dev);
-   static char *ac_dc_script = "/etc/acpi/wireless-rtl-ac-dc-power.sh";
+   static const char *ac_dc_script = 
"/etc/acpi/wireless-rtl-ac-dc-power.sh";
char *argv[] = {ac_dc_script, DRV_NAME, NULL};
static char *envp[] = {"HOME=/",
"TERM=linux",
@@ -1823,7 +1823,7 @@ static void _rtl92e_dm_check_rf_ctrl_gpio(void *data)
enum rt_rf_power_state eRfPowerStateToSet;
bool bActuallySet = false;
char *argv[3];
-   static char *RadioPowerPath = "/etc/acpi/events/RadioPower.sh";
+   static const char *RadioPowerPath = "/etc/acpi/events/RadioPower.sh";
static char *envp[] = {"HOME=/", "TERM=linux", 

[RFC 4/4] Introduce CONFIG_READONLY_USERMODEHELPER

2016-12-14 Thread Greg KH
If you can write to kernel memory, an "easy" way to get the kernel to
run any application is to change the pointer of one of the usermode
helper program names.  To try to mitigate this, create a new config
option, CONFIG_READONLY_USERMODEHELPER.

This option only allows "predefined" binaries to be called.  A number of
drivers and subsystems allow for the name of the binary to be changed,
and this config option disables that capability, so be aware of that.

Note:  Still a proof-of-concept at this point in time, doesn't cover all
of the call_usermodehelper() calls just yet, including the "fun" of
coredumps, it's still a work in progress.

Not-Signed-off-by: Greg Kroah-Hartman 
---
 arch/x86/kernel/cpu/mcheck/mce.c | 12 
 drivers/block/drbd/drbd_int.h|  6 +-
 drivers/block/drbd/drbd_main.c   |  5 +
 drivers/video/fbdev/uvesafb.c| 19 ++-
 fs/nfs/cache_lib.c   | 12 ++--
 include/linux/reboot.h   |  2 ++
 kernel/ksysfs.c  |  6 +-
 kernel/reboot.c  |  3 +++
 kernel/sysctl.c  |  4 
 lib/kobject_uevent.c |  3 +++
 security/Kconfig | 17 +
 11 files changed, 76 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 00ef43233e03..92a2ef8ffe3e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2337,15 +2337,16 @@ static ssize_t set_bank(struct device *s, struct 
device_attribute *attr,
 }
 
 static ssize_t
-show_trigger(struct device *s, struct device_attribute *attr, char *buf)
+trigger_show(struct device *s, struct device_attribute *attr, char *buf)
 {
strcpy(buf, mce_helper);
strcat(buf, "\n");
return strlen(mce_helper) + 1;
 }
 
-static ssize_t set_trigger(struct device *s, struct device_attribute *attr,
-   const char *buf, size_t siz)
+#ifndef CONFIG_READONLY_USERMODEHELPER
+static ssize_t trigger_store(struct device *s, struct device_attribute *attr,
+const char *buf, size_t siz)
 {
char *p;
 
@@ -2358,6 +2359,10 @@ static ssize_t set_trigger(struct device *s, struct 
device_attribute *attr,
 
return strlen(mce_helper) + !!p;
 }
+static DEVICE_ATTR_RW(trigger);
+#else
+static DEVICE_ATTR_RO(trigger);
+#endif
 
 static ssize_t set_ignore_ce(struct device *s,
 struct device_attribute *attr,
@@ -2415,7 +2420,6 @@ static ssize_t store_int_with_restart(struct device *s,
return ret;
 }
 
-static DEVICE_ATTR(trigger, 0644, show_trigger, set_trigger);
 static DEVICE_INT_ATTR(tolerant, 0644, mca_cfg.tolerant);
 static DEVICE_INT_ATTR(monarch_timeout, 0644, mca_cfg.monarch_timeout);
 static DEVICE_BOOL_ATTR(dont_log_ce, 0644, mca_cfg.dont_log_ce);
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index a139a34f1f1e..e21ab2bcc482 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -75,7 +75,11 @@ extern int fault_rate;
 extern int fault_devs;
 #endif
 
-extern char drbd_usermode_helper[];
+extern
+#ifdef CONFIG_READONLY_USERMODEHELPER
+   const
+#endif
+ char drbd_usermode_helper[];
 
 
 /* This is used to stop/restart our threads.
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 8f51eccc8de7..41c988e9cdf2 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -108,9 +108,14 @@ int proc_details;   /* Detail level in proc drbd*/
 
 /* Module parameter for setting the user mode helper program
  * to run. Default is /sbin/drbdadm */
+#ifdef CONFIG_READONLY_USERMODEHELPER
+const
+#endif
 char drbd_usermode_helper[80] = "/sbin/drbdadm";
 
+#ifndef CONFIG_READONLY_USERMODEHELPER
 module_param_string(usermode_helper, drbd_usermode_helper, 
sizeof(drbd_usermode_helper), 0644);
+#endif
 
 /* in 2.6.x, our device mapping and config info contains our virtual gendisks
  * as member "struct gendisk *vdisk;"
diff --git a/drivers/video/fbdev/uvesafb.c b/drivers/video/fbdev/uvesafb.c
index 98af9e02959b..0328d70a4afb 100644
--- a/drivers/video/fbdev/uvesafb.c
+++ b/drivers/video/fbdev/uvesafb.c
@@ -30,7 +30,11 @@ static struct cb_id uvesafb_cn_id = {
.idx = CN_IDX_V86D,
.val = CN_VAL_V86D_UVESAFB
 };
+#ifdef CONFIG_READONLY_USERMODEHELPER
+static const char v86d_path[PATH_MAX] = "/sbin/v86d";
+#else
 static char v86d_path[PATH_MAX] = "/sbin/v86d";
+#endif
 static char v86d_started;  /* has v86d been started by uvesafb? */
 
 static const struct fb_fix_screeninfo uvesafb_fix = {
@@ -114,7 +118,7 @@ static int uvesafb_helper_start(void)
};
 
char *argv[] = {
-   v86d_path,
+   (char *)v86d_path,
NULL,
};
 
@@ -1883,19 +1887,22 @@ static int uvesafb_setup(char *options)
 }
 #endif /* !MODULE */
 
-static ssize_t 

[PATCH 3/4] Make static usermode helper binaries constant

2016-12-14 Thread Greg KH

There are a number of usermode helper binaries that are "hard coded" in
the kernel today, so mark them as "const" to make it harder for someone
to change where the variables point to.

Signed-off-by: Greg Kroah-Hartman 
---
 drivers/macintosh/windfarm_core.c  | 2 +-
 drivers/net/hamradio/baycom_epp.c  | 2 +-
 drivers/pnp/pnpbios/core.c | 5 +++--
 drivers/staging/greybus/svc_watchdog.c | 4 ++--
 drivers/staging/rtl8192e/rtl8192e/rtl_dm.c | 6 +++---
 fs/nfsd/nfs4layouts.c  | 6 --
 security/keys/request_key.c| 7 ---
 7 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/macintosh/windfarm_core.c 
b/drivers/macintosh/windfarm_core.c
index 465d770ab0bb..1b317cbb73cf 100644
--- a/drivers/macintosh/windfarm_core.c
+++ b/drivers/macintosh/windfarm_core.c
@@ -74,7 +74,7 @@ static inline void wf_notify(int event, void *param)
 
 static int wf_critical_overtemp(void)
 {
-   static char * critical_overtemp_path = "/sbin/critical_overtemp";
+   static const char * critical_overtemp_path = "/sbin/critical_overtemp";
char *argv[] = { critical_overtemp_path, NULL };
static char *envp[] = { "HOME=/",
"TERM=linux",
diff --git a/drivers/net/hamradio/baycom_epp.c 
b/drivers/net/hamradio/baycom_epp.c
index 78dbc44540f6..321cffa8dbe4 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -299,7 +299,7 @@ static inline void baycom_int_freq(struct baycom_state *bc)
  *eppconfig_path should be setable  via /proc/sys.
  */
 
-static char eppconfig_path[256] = "/usr/sbin/eppfpga";
+static const char eppconfig_path[256] = "/usr/sbin/eppfpga";
 
 static char *envp[] = { "HOME=/", "TERM=linux", "PATH=/usr/bin:/bin", NULL };
 
diff --git a/drivers/pnp/pnpbios/core.c b/drivers/pnp/pnpbios/core.c
index c38a5b9733c8..614aae6fcc0f 100644
--- a/drivers/pnp/pnpbios/core.c
+++ b/drivers/pnp/pnpbios/core.c
@@ -98,6 +98,7 @@ static struct completion unload_sem;
  */
 static int pnp_dock_event(int dock, struct pnp_docking_station_info *info)
 {
+   static const char *sbin_pnpbios = "/sbin/pnpbios";
char *argv[3], **envp, *buf, *scratch;
int i = 0, value;
 
@@ -112,7 +113,7 @@ static int pnp_dock_event(int dock, struct 
pnp_docking_station_info *info)
 * integrated into the driver core and use the usual infrastructure
 * like sysfs and uevents
 */
-   argv[0] = "/sbin/pnpbios";
+   argv[0] = sbin_pnpbios;
argv[1] = "dock";
argv[2] = NULL;
 
@@ -139,7 +140,7 @@ static int pnp_dock_event(int dock, struct 
pnp_docking_station_info *info)
   info->location_id, info->serial, info->capabilities);
envp[i] = NULL;
 
-   value = call_usermodehelper(argv [0], argv, envp, UMH_WAIT_EXEC);
+   value = call_usermodehelper(sbin_pnpbios, argv, envp, UMH_WAIT_EXEC);
kfree(buf);
kfree(envp);
return 0;
diff --git a/drivers/staging/greybus/svc_watchdog.c 
b/drivers/staging/greybus/svc_watchdog.c
index 3729460fb954..db32ec0f0e80 100644
--- a/drivers/staging/greybus/svc_watchdog.c
+++ b/drivers/staging/greybus/svc_watchdog.c
@@ -44,14 +44,14 @@ static int svc_watchdog_pm_notifier(struct notifier_block 
*notifier,
 
 static void greybus_reset(struct work_struct *work)
 {
-   static char start_path[256] = "/system/bin/start";
+   static const char start_path[256] = "/system/bin/start";
static char *envp[] = {
"HOME=/",
"PATH=/sbin:/vendor/bin:/system/sbin:/system/bin:/system/xbin",
NULL,
};
static char *argv[] = {
-   start_path,
+   (char *)start_path,
"unipro_reset",
NULL,
};
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
index 9bc284812c30..5f0c2cdf32d1 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
@@ -268,7 +268,7 @@ void rtl92e_dm_watchdog(struct net_device *dev)
 static void _rtl92e_dm_check_ac_dc_power(struct net_device *dev)
 {
struct r8192_priv *priv = rtllib_priv(dev);
-   static char *ac_dc_script = "/etc/acpi/wireless-rtl-ac-dc-power.sh";
+   static const char *ac_dc_script = 
"/etc/acpi/wireless-rtl-ac-dc-power.sh";
char *argv[] = {ac_dc_script, DRV_NAME, NULL};
static char *envp[] = {"HOME=/",
"TERM=linux",
@@ -1823,7 +1823,7 @@ static void _rtl92e_dm_check_rf_ctrl_gpio(void *data)
enum rt_rf_power_state eRfPowerStateToSet;
bool bActuallySet = false;
char *argv[3];
-   static char *RadioPowerPath = "/etc/acpi/events/RadioPower.sh";
+   static const char *RadioPowerPath = "/etc/acpi/events/RadioPower.sh";
static char *envp[] = {"HOME=/", "TERM=linux", "PATH=/usr/bin:/bin",
 

[RFC 0/4] make call_usermodehelper a bit more "safe"

2016-12-14 Thread Greg KH
Hi all,

Here's a proof-of-concept patch series that tries to work to address the
issue of call_usermodehelper being abused to have the kernel call any
userspace binary with full root permissions.

The issue is that if you end up getting write access to kernel memory,
if you change the string '/sbin/hotplug' to point to
'/home/hacked/my_binary', then the next uevent that the system makes
will call this binary instead of the "trusted" one.

It does this by moving the location of the binary to be in read-only
memory.  This works for a number of call_usermodehelper strings, as they
are specified at build or configuration time.  But, some subsystems have
the option to let userspace change the value at runtime, so those values
can't live in read-only memory.  To resolve this I've created a new
configuration option, CONFIG_READONLY_USERMODEHELPER to make those
options not able to be changed.

Yes, this changes existing functionality, but I'm willing to bet that
almost no one ever changes these binary locations, or if they do, they
can set them to the "correct" location at built time.

This all happens in the last patch of the series.  Note, I haven't
caught all places in the kernel that has these options, the messiest
being coredumps, which I haven't addressed yet, and is going to be a
pain.

This last patch is hacky, and I'm not really happy about it, so I'm
posting it here as an RFC to see what others think.

As a contrast, grsec does try to mitigate this same problem, but it does
so by looking at the location of the binary that is about to be run, and
only allowing a small whitelist of directories that are "allowed" to be
used.  That's a much simpler solution, but also feels hacky to me in a
way given that it's a whitelist and encompasses whole system directories
(i.e. /sbin/).  My patchset requires that each caller of
call_usermodehelper be audited, which is a pain, and will be needed to
be watched out for for new users, which also isn't any good.

So, anyone have any better ideas?  Is this approach worth it?  Or should
we just go down the "whitelist" path?

Note, the first 3 patches in this series will be submitted for inclusion
either way, as they are good cleanups, and change no functionality at
all, and resolve this issue automatically for some subsystems with no
downside.

thanks,

greg k-h


[RFC 0/4] make call_usermodehelper a bit more "safe"

2016-12-14 Thread Greg KH
Hi all,

Here's a proof-of-concept patch series that tries to work to address the
issue of call_usermodehelper being abused to have the kernel call any
userspace binary with full root permissions.

The issue is that if you end up getting write access to kernel memory,
if you change the string '/sbin/hotplug' to point to
'/home/hacked/my_binary', then the next uevent that the system makes
will call this binary instead of the "trusted" one.

It does this by moving the location of the binary to be in read-only
memory.  This works for a number of call_usermodehelper strings, as they
are specified at build or configuration time.  But, some subsystems have
the option to let userspace change the value at runtime, so those values
can't live in read-only memory.  To resolve this I've created a new
configuration option, CONFIG_READONLY_USERMODEHELPER to make those
options not able to be changed.

Yes, this changes existing functionality, but I'm willing to bet that
almost no one ever changes these binary locations, or if they do, they
can set them to the "correct" location at built time.

This all happens in the last patch of the series.  Note, I haven't
caught all places in the kernel that has these options, the messiest
being coredumps, which I haven't addressed yet, and is going to be a
pain.

This last patch is hacky, and I'm not really happy about it, so I'm
posting it here as an RFC to see what others think.

As a contrast, grsec does try to mitigate this same problem, but it does
so by looking at the location of the binary that is about to be run, and
only allowing a small whitelist of directories that are "allowed" to be
used.  That's a much simpler solution, but also feels hacky to me in a
way given that it's a whitelist and encompasses whole system directories
(i.e. /sbin/).  My patchset requires that each caller of
call_usermodehelper be audited, which is a pain, and will be needed to
be watched out for for new users, which also isn't any good.

So, anyone have any better ideas?  Is this approach worth it?  Or should
we just go down the "whitelist" path?

Note, the first 3 patches in this series will be submitted for inclusion
either way, as they are good cleanups, and change no functionality at
all, and resolve this issue automatically for some subsystems with no
downside.

thanks,

greg k-h


[PATCH 1/4] kmod: make usermodehelper path a const string

2016-12-14 Thread Greg KH
This is in preparation for making it so that usermode helper programs
can't be changed, if desired, by userspace.  We will tackle the mess of
cleaning up the write-ability of argv and env later, that's going to
take more work, for much less gain...

Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/kmod.h | 7 ---
 kernel/kmod.c| 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index fcfd2bf14d3f..c4e441e00db5 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -56,7 +56,7 @@ struct file;
 struct subprocess_info {
struct work_struct work;
struct completion *complete;
-   char *path;
+   const char *path;
char **argv;
char **envp;
int wait;
@@ -67,10 +67,11 @@ struct subprocess_info {
 };
 
 extern int
-call_usermodehelper(char *path, char **argv, char **envp, int wait);
+call_usermodehelper(const char *path, char **argv, char **envp, int wait);
 
 extern struct subprocess_info *
-call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
+call_usermodehelper_setup(const char *path, char **argv, char **envp,
+ gfp_t gfp_mask,
  int (*init)(struct subprocess_info *info, struct cred 
*new),
  void (*cleanup)(struct subprocess_info *), void 
*data);
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 0277d1216f80..0c216b76afca 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -516,7 +516,7 @@ static void helper_unlock(void)
  * Function must be runnable in either a process context or the
  * context in which call_usermodehelper_exec is called.
  */
-struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
+struct subprocess_info *call_usermodehelper_setup(const char *path, char 
**argv,
char **envp, gfp_t gfp_mask,
int (*init)(struct subprocess_info *info, struct cred *new),
void (*cleanup)(struct subprocess_info *info),
@@ -613,7 +613,7 @@ EXPORT_SYMBOL(call_usermodehelper_exec);
  * This function is the equivalent to use call_usermodehelper_setup() and
  * call_usermodehelper_exec().
  */
-int call_usermodehelper(char *path, char **argv, char **envp, int wait)
+int call_usermodehelper(const char *path, char **argv, char **envp, int wait)
 {
struct subprocess_info *info;
gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
-- 
2.10.2



[PATCH 1/4] kmod: make usermodehelper path a const string

2016-12-14 Thread Greg KH
This is in preparation for making it so that usermode helper programs
can't be changed, if desired, by userspace.  We will tackle the mess of
cleaning up the write-ability of argv and env later, that's going to
take more work, for much less gain...

Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/kmod.h | 7 ---
 kernel/kmod.c| 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index fcfd2bf14d3f..c4e441e00db5 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -56,7 +56,7 @@ struct file;
 struct subprocess_info {
struct work_struct work;
struct completion *complete;
-   char *path;
+   const char *path;
char **argv;
char **envp;
int wait;
@@ -67,10 +67,11 @@ struct subprocess_info {
 };
 
 extern int
-call_usermodehelper(char *path, char **argv, char **envp, int wait);
+call_usermodehelper(const char *path, char **argv, char **envp, int wait);
 
 extern struct subprocess_info *
-call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
+call_usermodehelper_setup(const char *path, char **argv, char **envp,
+ gfp_t gfp_mask,
  int (*init)(struct subprocess_info *info, struct cred 
*new),
  void (*cleanup)(struct subprocess_info *), void 
*data);
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 0277d1216f80..0c216b76afca 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -516,7 +516,7 @@ static void helper_unlock(void)
  * Function must be runnable in either a process context or the
  * context in which call_usermodehelper_exec is called.
  */
-struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
+struct subprocess_info *call_usermodehelper_setup(const char *path, char 
**argv,
char **envp, gfp_t gfp_mask,
int (*init)(struct subprocess_info *info, struct cred *new),
void (*cleanup)(struct subprocess_info *info),
@@ -613,7 +613,7 @@ EXPORT_SYMBOL(call_usermodehelper_exec);
  * This function is the equivalent to use call_usermodehelper_setup() and
  * call_usermodehelper_exec().
  */
-int call_usermodehelper(char *path, char **argv, char **envp, int wait)
+int call_usermodehelper(const char *path, char **argv, char **envp, int wait)
 {
struct subprocess_info *info;
gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
-- 
2.10.2



[PATCH v3 2/3] secure_seq: use siphash24 instead of md5_transform

2016-12-14 Thread Jason A. Donenfeld
This gives a clear speed and security improvement. Siphash is both
faster and is more solid crypto than the aging MD5.

Rather than manually filling MD5 buffers, we simply create
a layout by a simple anonymous struct, for which gcc generates
rather efficient code.

Signed-off-by: Jason A. Donenfeld 
Cc: Andi Kleen 
Cc: David Miller 
Cc: David Laight 
---
Changes from v2->v3:

  - Structs are no longer packed, to mitigate slow byte-by-byte assignment.
  - A typo has been fixed in the port number assignment.

 net/core/secure_seq.c | 166 ++
 1 file changed, 85 insertions(+), 81 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 88a8e429fc3e..00eb141c981b 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -1,3 +1,5 @@
+/* Copyright (C) 2016 Jason A. Donenfeld . All Rights 
Reserved. */
+
 #include 
 #include 
 #include 
@@ -8,14 +10,14 @@
 #include 
 #include 
 #include 
-
+#include 
 #include 
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
+#include 
 #include 
-#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4)
 
-static u32 net_secret[NET_SECRET_SIZE] cacheline_aligned;
+static u8 net_secret[SIPHASH24_KEY_LEN] __aligned(SIPHASH24_ALIGNMENT);
 
 static __always_inline void net_secret_init(void)
 {
@@ -44,44 +46,41 @@ static u32 seq_scale(u32 seq)
 u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 __be16 sport, __be16 dport, u32 *tsoff)
 {
-   u32 secret[MD5_MESSAGE_BYTES / 4];
-   u32 hash[MD5_DIGEST_WORDS];
-   u32 i;
-
+   const struct {
+   struct in6_addr saddr;
+   struct in6_addr daddr;
+   __be16 sport;
+   __be16 dport;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined = {
+   .saddr = *(struct in6_addr *)saddr,
+   .daddr = *(struct in6_addr *)daddr,
+   .sport = sport,
+   .dport = dport
+   };
+   u64 hash;
net_secret_init();
-   memcpy(hash, saddr, 16);
-   for (i = 0; i < 4; i++)
-   secret[i] = net_secret[i] + (__force u32)daddr[i];
-   secret[4] = net_secret[4] +
-   (((__force u16)sport << 16) + (__force u16)dport);
-   for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-   secret[i] = net_secret[i];
-
-   md5_transform(hash, secret);
-
-   *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-   return seq_scale(hash[0]);
+   hash = siphash24((const u8 *), offsetof(typeof(combined), 
end), net_secret);
+   *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+   return seq_scale(hash);
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
   __be16 dport)
 {
-   u32 secret[MD5_MESSAGE_BYTES / 4];
-   u32 hash[MD5_DIGEST_WORDS];
-   u32 i;
-
+   const struct {
+   struct in6_addr saddr;
+   struct in6_addr daddr;
+   __be16 dport;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined = {
+   .saddr = *(struct in6_addr *)saddr,
+   .daddr = *(struct in6_addr *)daddr,
+   .dport = dport
+   };
net_secret_init();
-   memcpy(hash, saddr, 16);
-   for (i = 0; i < 4; i++)
-   secret[i] = net_secret[i] + (__force u32) daddr[i];
-   secret[4] = net_secret[4] + (__force u32)dport;
-   for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-   secret[i] = net_secret[i];
-
-   md5_transform(hash, secret);
-
-   return hash[0];
+   return siphash24((const u8 *), offsetof(typeof(combined), 
end), net_secret);
 }
 EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 #endif
@@ -91,33 +90,39 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
   __be16 sport, __be16 dport, u32 *tsoff)
 {
-   u32 hash[MD5_DIGEST_WORDS];
-
+   const struct {
+   __be32 saddr;
+   __be32 daddr;
+   __be16 sport;
+   __be16 dport;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined = {
+   .saddr = saddr,
+   .daddr = daddr,
+   .sport = sport,
+   .dport = dport
+   };
+   u64 hash;
net_secret_init();
-   hash[0] = (__force u32)saddr;
-   hash[1] = (__force u32)daddr;
-   hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-   hash[3] = net_secret[15];
-
-   md5_transform(hash, net_secret);
-
-   *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-   return seq_scale(hash[0]);
+   hash = siphash24((const u8 *), 

[PATCH v3 2/3] secure_seq: use siphash24 instead of md5_transform

2016-12-14 Thread Jason A. Donenfeld
This gives a clear speed and security improvement. Siphash is both
faster and is more solid crypto than the aging MD5.

Rather than manually filling MD5 buffers, we simply create
a layout by a simple anonymous struct, for which gcc generates
rather efficient code.

Signed-off-by: Jason A. Donenfeld 
Cc: Andi Kleen 
Cc: David Miller 
Cc: David Laight 
---
Changes from v2->v3:

  - Structs are no longer packed, to mitigate slow byte-by-byte assignment.
  - A typo has been fixed in the port number assignment.

 net/core/secure_seq.c | 166 ++
 1 file changed, 85 insertions(+), 81 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 88a8e429fc3e..00eb141c981b 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -1,3 +1,5 @@
+/* Copyright (C) 2016 Jason A. Donenfeld . All Rights 
Reserved. */
+
 #include 
 #include 
 #include 
@@ -8,14 +10,14 @@
 #include 
 #include 
 #include 
-
+#include 
 #include 
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
+#include 
 #include 
-#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4)
 
-static u32 net_secret[NET_SECRET_SIZE] cacheline_aligned;
+static u8 net_secret[SIPHASH24_KEY_LEN] __aligned(SIPHASH24_ALIGNMENT);
 
 static __always_inline void net_secret_init(void)
 {
@@ -44,44 +46,41 @@ static u32 seq_scale(u32 seq)
 u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 __be16 sport, __be16 dport, u32 *tsoff)
 {
-   u32 secret[MD5_MESSAGE_BYTES / 4];
-   u32 hash[MD5_DIGEST_WORDS];
-   u32 i;
-
+   const struct {
+   struct in6_addr saddr;
+   struct in6_addr daddr;
+   __be16 sport;
+   __be16 dport;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined = {
+   .saddr = *(struct in6_addr *)saddr,
+   .daddr = *(struct in6_addr *)daddr,
+   .sport = sport,
+   .dport = dport
+   };
+   u64 hash;
net_secret_init();
-   memcpy(hash, saddr, 16);
-   for (i = 0; i < 4; i++)
-   secret[i] = net_secret[i] + (__force u32)daddr[i];
-   secret[4] = net_secret[4] +
-   (((__force u16)sport << 16) + (__force u16)dport);
-   for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-   secret[i] = net_secret[i];
-
-   md5_transform(hash, secret);
-
-   *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-   return seq_scale(hash[0]);
+   hash = siphash24((const u8 *), offsetof(typeof(combined), 
end), net_secret);
+   *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+   return seq_scale(hash);
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
   __be16 dport)
 {
-   u32 secret[MD5_MESSAGE_BYTES / 4];
-   u32 hash[MD5_DIGEST_WORDS];
-   u32 i;
-
+   const struct {
+   struct in6_addr saddr;
+   struct in6_addr daddr;
+   __be16 dport;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined = {
+   .saddr = *(struct in6_addr *)saddr,
+   .daddr = *(struct in6_addr *)daddr,
+   .dport = dport
+   };
net_secret_init();
-   memcpy(hash, saddr, 16);
-   for (i = 0; i < 4; i++)
-   secret[i] = net_secret[i] + (__force u32) daddr[i];
-   secret[4] = net_secret[4] + (__force u32)dport;
-   for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-   secret[i] = net_secret[i];
-
-   md5_transform(hash, secret);
-
-   return hash[0];
+   return siphash24((const u8 *), offsetof(typeof(combined), 
end), net_secret);
 }
 EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 #endif
@@ -91,33 +90,39 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
   __be16 sport, __be16 dport, u32 *tsoff)
 {
-   u32 hash[MD5_DIGEST_WORDS];
-
+   const struct {
+   __be32 saddr;
+   __be32 daddr;
+   __be16 sport;
+   __be16 dport;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined = {
+   .saddr = saddr,
+   .daddr = daddr,
+   .sport = sport,
+   .dport = dport
+   };
+   u64 hash;
net_secret_init();
-   hash[0] = (__force u32)saddr;
-   hash[1] = (__force u32)daddr;
-   hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-   hash[3] = net_secret[15];
-
-   md5_transform(hash, net_secret);
-
-   *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-   return seq_scale(hash[0]);
+   hash = siphash24((const u8 *), offsetof(typeof(combined), 
end), net_secret);
+   *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+   return 

Re: [PATCH v2 3/3] kvm: svm: Use the hardware provided GPA instead of page walk

2016-12-14 Thread Paolo Bonzini


On 14/12/2016 19:39, Brijesh Singh wrote:
> 
> On 12/14/2016 11:23 AM, Paolo Bonzini wrote:
>>
>>
>> On 14/12/2016 18:07, Brijesh Singh wrote:

>>>
>>> Since now we are going to perform multiple conditional checks before
>>> concluding that its safe to use HW provided GPA. How about if we add two
>>> functions "emulator_is_rep_string_op" and "emulator_is_two_mem_op" into
>>> emulator.c and  use these functions inside the x86.c to determine if its
>>> safe to use HW provided gpa?
>>
>> Why not export only emulator_can_use_gpa from emulate.c?  (So in the end
>> leaving emulator_is_string_op in emulate.c was the right thing to do, it
>> was just the test that was wrong :)).
>>
> 
> Actually, I was not sure if putting emulator_can_use_gpa() in emulate.c
> was right thing - mainly because emulator.c does not deal with GPA. I
> will go with your advice and put it in emulator.c, it makes easy :)
> 
> 
>> The patch below is still missing the check for cross-page MMIO.  Your
>> reference to the BKDG only covers MMCONFIG (sometimes referred to as
>> ECAM), not MMIO in general.  Doing AND or OR into video memory for
>> example is perfectly legal, and I'm fairly sure that some obscure legacy
>> software does PUSH/POP into vram as well!
>>
>>
> 
> I used your below code snippet to detect cross-page MMIO access. After
> applying these changes cross-page MMIO read/write unit test is passing
> just fine. I will include it in patch.

Great, thanks.  I hope we can include it in 4.10.

Paolo


Re: [PATCH v3 1/2] ARM: ep93xx: Register ts73xx-fpga manager driver for TS-7300

2016-12-14 Thread Florian Fainelli
On 12/13/2016 10:14 PM, Moritz Fischer wrote:
> Hi Florian,
> 
> On Tue, Dec 13, 2016 at 6:35 PM, Florian Fainelli  
> wrote:
>> Register the TS-7300 FPGA manager device drivers which allows us to load
>> bitstreams into the on-board Altera Cyclone II FPGA.
>>
>> Signed-off-by: Florian Fainelli 
>> ---
>>  arch/arm/mach-ep93xx/ts72xx.c | 26 ++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/arch/arm/mach-ep93xx/ts72xx.c b/arch/arm/mach-ep93xx/ts72xx.c
>> index 3b39ea353d30..acf72ea670ef 100644
>> --- a/arch/arm/mach-ep93xx/ts72xx.c
>> +++ b/arch/arm/mach-ep93xx/ts72xx.c
>> @@ -230,6 +230,28 @@ static struct ep93xx_eth_data __initdata 
>> ts72xx_eth_data = {
>> .phy_id = 1,
>>  };
>>
>> +#if IS_ENABLED(CONFIG_FPGA_MGR_TS73XX)
>> +
>> +/* Relative to EP93XX_CS1_PHYS_BASE */
>> +#define TS73XX_FPGA_LOADER_BASE0x03c0
>> +
>> +static struct resource ts73xx_fpga_resources[] = {
>> +   {
>> +   .start  = EP93XX_CS1_PHYS_BASE + TS73XX_FPGA_LOADER_BASE,
>> +   .end= EP93XX_CS1_PHYS_BASE + TS73XX_FPGA_LOADER_BASE + 1,
>> +   .flags  = IORESOURCE_MEM,
>> +   },
>> +};
>> +
>> +static struct platform_device ts73xx_fpga_device = {
>> +   .name   = "ts73xx-fpga-mgr",
>> +   .id = -1,
>> +   .resource = ts73xx_fpga_resources,
>> +   .num_resources = ARRAY_SIZE(ts73xx_fpga_resources),
>> +};
>> +
>> +#endif
>> +
>>  static void __init ts72xx_init_machine(void)
>>  {
>> ep93xx_init_devices();
>> @@ -238,6 +260,10 @@ static void __init ts72xx_init_machine(void)
>> platform_device_register(_wdt_device);
>>
>> ep93xx_register_eth(_eth_data, 1);
>> +#if IS_ENABLED(CONFIG_FPGA_MGR_TS73XX)
>> +   if (board_is_ts7300())
>> +   platform_device_register(_fpga_device);
>> +#endif
>>  }
>>
>>  MACHINE_START(TS72XX, "Technologic Systems TS-72xx SBC")
>> --
>> 2.9.3
>>
> 
> I think this is backwards, shouldn't this be your [PATCH 2/2]?
> Otherwise you're using
> the driver before you added it.

I can definitively re-order the patches, although I don't think this
really makes a difference, a driver without device does nothing, and
vice versa.
-- 
Florian


Re: [PATCH v2 3/3] kvm: svm: Use the hardware provided GPA instead of page walk

2016-12-14 Thread Paolo Bonzini


On 14/12/2016 19:39, Brijesh Singh wrote:
> 
> On 12/14/2016 11:23 AM, Paolo Bonzini wrote:
>>
>>
>> On 14/12/2016 18:07, Brijesh Singh wrote:

>>>
>>> Since now we are going to perform multiple conditional checks before
>>> concluding that its safe to use HW provided GPA. How about if we add two
>>> functions "emulator_is_rep_string_op" and "emulator_is_two_mem_op" into
>>> emulator.c and  use these functions inside the x86.c to determine if its
>>> safe to use HW provided gpa?
>>
>> Why not export only emulator_can_use_gpa from emulate.c?  (So in the end
>> leaving emulator_is_string_op in emulate.c was the right thing to do, it
>> was just the test that was wrong :)).
>>
> 
> Actually, I was not sure if putting emulator_can_use_gpa() in emulate.c
> was right thing - mainly because emulator.c does not deal with GPA. I
> will go with your advice and put it in emulator.c, it makes easy :)
> 
> 
>> The patch below is still missing the check for cross-page MMIO.  Your
>> reference to the BKDG only covers MMCONFIG (sometimes referred to as
>> ECAM), not MMIO in general.  Doing AND or OR into video memory for
>> example is perfectly legal, and I'm fairly sure that some obscure legacy
>> software does PUSH/POP into vram as well!
>>
>>
> 
> I used your below code snippet to detect cross-page MMIO access. After
> applying these changes cross-page MMIO read/write unit test is passing
> just fine. I will include it in patch.

Great, thanks.  I hope we can include it in 4.10.

Paolo


Re: [PATCH v3 1/2] ARM: ep93xx: Register ts73xx-fpga manager driver for TS-7300

2016-12-14 Thread Florian Fainelli
On 12/13/2016 10:14 PM, Moritz Fischer wrote:
> Hi Florian,
> 
> On Tue, Dec 13, 2016 at 6:35 PM, Florian Fainelli  
> wrote:
>> Register the TS-7300 FPGA manager device drivers which allows us to load
>> bitstreams into the on-board Altera Cyclone II FPGA.
>>
>> Signed-off-by: Florian Fainelli 
>> ---
>>  arch/arm/mach-ep93xx/ts72xx.c | 26 ++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/arch/arm/mach-ep93xx/ts72xx.c b/arch/arm/mach-ep93xx/ts72xx.c
>> index 3b39ea353d30..acf72ea670ef 100644
>> --- a/arch/arm/mach-ep93xx/ts72xx.c
>> +++ b/arch/arm/mach-ep93xx/ts72xx.c
>> @@ -230,6 +230,28 @@ static struct ep93xx_eth_data __initdata 
>> ts72xx_eth_data = {
>> .phy_id = 1,
>>  };
>>
>> +#if IS_ENABLED(CONFIG_FPGA_MGR_TS73XX)
>> +
>> +/* Relative to EP93XX_CS1_PHYS_BASE */
>> +#define TS73XX_FPGA_LOADER_BASE0x03c0
>> +
>> +static struct resource ts73xx_fpga_resources[] = {
>> +   {
>> +   .start  = EP93XX_CS1_PHYS_BASE + TS73XX_FPGA_LOADER_BASE,
>> +   .end= EP93XX_CS1_PHYS_BASE + TS73XX_FPGA_LOADER_BASE + 1,
>> +   .flags  = IORESOURCE_MEM,
>> +   },
>> +};
>> +
>> +static struct platform_device ts73xx_fpga_device = {
>> +   .name   = "ts73xx-fpga-mgr",
>> +   .id = -1,
>> +   .resource = ts73xx_fpga_resources,
>> +   .num_resources = ARRAY_SIZE(ts73xx_fpga_resources),
>> +};
>> +
>> +#endif
>> +
>>  static void __init ts72xx_init_machine(void)
>>  {
>> ep93xx_init_devices();
>> @@ -238,6 +260,10 @@ static void __init ts72xx_init_machine(void)
>> platform_device_register(_wdt_device);
>>
>> ep93xx_register_eth(_eth_data, 1);
>> +#if IS_ENABLED(CONFIG_FPGA_MGR_TS73XX)
>> +   if (board_is_ts7300())
>> +   platform_device_register(_fpga_device);
>> +#endif
>>  }
>>
>>  MACHINE_START(TS72XX, "Technologic Systems TS-72xx SBC")
>> --
>> 2.9.3
>>
> 
> I think this is backwards, shouldn't this be your [PATCH 2/2]?
> Otherwise you're using
> the driver before you added it.

I can definitively re-order the patches, although I don't think this
really makes a difference, a driver without device does nothing, and
vice versa.
-- 
Florian


[PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function.

SipHash isn't just some new trendy hash function. It's been around for a
while, and there really isn't anything that comes remotely close to
being useful in the way SipHash is. With that said, why do we need this?

There are a variety of attacks known as "hashtable poisoning" in which an
attacker forms some data such that the hash of that data will be the
same, and then preceeds to fill up all entries of a hashbucket. This is
a realistic and well-known denial-of-service vector.

Linux developers already seem to be aware that this is an issue, and
various places that use hash tables in, say, a network context, use a
non-cryptographically secure function (usually jhash) and then try to
twiddle with the key on a time basis (or in many cases just do nothing
and hope that nobody notices). While this is an admirable attempt at
solving the problem, it doesn't actually fix it. SipHash fixes it.

(It fixes it in such a sound way that you could even build a stream
cipher out of SipHash that would resist the modern cryptanalysis.)

There are a modicum of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.

Secondly, a few places are using MD5 for creating secure sequence
numbers, port numbers, or fast random numbers. Siphash is a faster, more
fittting, and more secure replacement for MD5 in those situations.

Dozens of languages are already using this internally for their hash
tables. Some of the BSDs already use this in their kernels. SipHash is
a widely known high-speed solution to a widely known problem, and it's
time we catch-up.

Signed-off-by: Jason A. Donenfeld 
Cc: Jean-Philippe Aumasson 
Cc: Daniel J. Bernstein 
Cc: Linus Torvalds 
Cc: Eric Biggers 
Cc: David Laight 
---
Changes from v2->v3:

  - There is now a fast aligned version of the function and a not-as-fast
unaligned version. The requirements for each have been documented in
a docbook-style comment. As well, the header now contains a constant
for the expected alignment.

  - The test suite has been updated to check both the unaligned and aligned
version of the function.

 include/linux/siphash.h |  30 ++
 lib/Kconfig.debug   |   6 +-
 lib/Makefile|   5 +-
 lib/siphash.c   | 153 
 lib/test_siphash.c  |  85 +++
 5 files changed, 274 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/siphash.h
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index ..82dc1a911a2e
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,30 @@
+/* Copyright (C) 2016 Jason A. Donenfeld 
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include 
+
+enum siphash_lengths {
+   SIPHASH24_KEY_LEN = 16,
+   SIPHASH24_ALIGNMENT = 8
+};
+
+u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]);
+
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+static inline u64 siphash24_unaligned(const u8 *data, size_t len, const u8 
key[SIPHASH24_KEY_LEN])
+{
+   return siphash24(data, len, key);
+}
+#else
+u64 siphash24_unaligned(const u8 *data, size_t len, const u8 
key[SIPHASH24_KEY_LEN]);
+#endif
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e6327d102184..32bbf689fc46 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1843,9 +1843,9 @@ config TEST_HASH
tristate "Perform selftest on hash functions"
default n
help
- Enable this option to test the kernel's integer ()
- and string () hash functions on boot
- (or module load).
+ Enable this option to test the kernel's integer (),
+ string (), and siphash ()
+ hash functions on boot (or module load).
 
  This is intended to help people writing architecture-specific
  optimized versions.  If unsure, say N.
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..71d398b04a74 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 sha1.o 

[PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long

2016-12-14 Thread Jason A. Donenfeld
This duplicates the current algorithm for get_random_int/long, but uses
siphash24 instead. This comes with several benefits. It's certainly
faster and more cryptographically secure than MD5. This patch also
hashes the pid, entropy, and timestamp as fixed width fields, in order
to increase diffusion.

The previous md5 algorithm used a per-cpu md5 state, which caused
successive calls to the function to chain upon each other. While it's
not entirely clear that this kind of chaining is absolutely necessary
when using a secure PRF like siphash24, it can't hurt, and the timing of
the call chain does add a degree of natural entropy. So, in keeping with
this design, instead of the massive per-cpu 64-byte md5 state, there is
instead a per-cpu previously returned value for chaining.

Signed-off-by: Jason A. Donenfeld 
Cc: Jean-Philippe Aumasson 
Cc: Ted Tso 
---
Changes from v2->v3:

  - Structs are no longer packed, to mitigate slow byte-by-byte assignment.

 drivers/char/random.c | 52 ---
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d6876d506220..b1c2e3b26430 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -262,6 +262,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = {
 };
 #endif /* CONFIG_SYSCTL */
 
-static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] cacheline_aligned;
+static u8 random_int_secret[SIPHASH24_KEY_LEN] __aligned(SIPHASH24_ALIGNMENT);
 
 int random_int_secret_init(void)
 {
@@ -2050,8 +2051,7 @@ int random_int_secret_init(void)
return 0;
 }
 
-static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
-   __aligned(sizeof(unsigned long));
+static DEFINE_PER_CPU(u64, get_random_int_chaining);
 
 /*
  * Get a random word for internal kernel use only. Similar to urandom but
@@ -2061,19 +2061,26 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], 
get_random_int_hash)
  */
 unsigned int get_random_int(void)
 {
-   __u32 *hash;
unsigned int ret;
+   struct {
+   u64 chaining;
+   unsigned long ts;
+   unsigned long entropy;
+   pid_t pid;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined;
+   u64 *chaining;
 
if (arch_get_random_int())
return ret;
 
-   hash = get_cpu_var(get_random_int_hash);
-
-   hash[0] += current->pid + jiffies + random_get_entropy();
-   md5_transform(hash, random_int_secret);
-   ret = hash[0];
-   put_cpu_var(get_random_int_hash);
-
+   chaining = get_cpu_ptr(_random_int_chaining);
+   combined.chaining = *chaining;
+   combined.ts = jiffies;
+   combined.entropy = random_get_entropy();
+   combined.pid = current->pid;
+   ret = *chaining = siphash24((u8 *), offsetof(typeof(combined), 
end), random_int_secret);
+   put_cpu_ptr(chaining);
return ret;
 }
 EXPORT_SYMBOL(get_random_int);
@@ -2083,19 +2090,26 @@ EXPORT_SYMBOL(get_random_int);
  */
 unsigned long get_random_long(void)
 {
-   __u32 *hash;
unsigned long ret;
+   struct {
+   u64 chaining;
+   unsigned long ts;
+   unsigned long entropy;
+   pid_t pid;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined;
+   u64 *chaining;
 
if (arch_get_random_long())
return ret;
 
-   hash = get_cpu_var(get_random_int_hash);
-
-   hash[0] += current->pid + jiffies + random_get_entropy();
-   md5_transform(hash, random_int_secret);
-   ret = *(unsigned long *)hash;
-   put_cpu_var(get_random_int_hash);
-
+   chaining = get_cpu_ptr(_random_int_chaining);
+   combined.chaining = *chaining;
+   combined.ts = jiffies;
+   combined.entropy = random_get_entropy();
+   combined.pid = current->pid;
+   ret = *chaining = siphash24((u8 *), offsetof(typeof(combined), 
end), random_int_secret);
+   put_cpu_ptr(chaining);
return ret;
 }
 EXPORT_SYMBOL(get_random_long);
-- 
2.11.0



[PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long

2016-12-14 Thread Jason A. Donenfeld
This duplicates the current algorithm for get_random_int/long, but uses
siphash24 instead. This comes with several benefits. It's certainly
faster and more cryptographically secure than MD5. This patch also
hashes the pid, entropy, and timestamp as fixed width fields, in order
to increase diffusion.

The previous md5 algorithm used a per-cpu md5 state, which caused
successive calls to the function to chain upon each other. While it's
not entirely clear that this kind of chaining is absolutely necessary
when using a secure PRF like siphash24, it can't hurt, and the timing of
the call chain does add a degree of natural entropy. So, in keeping with
this design, instead of the massive per-cpu 64-byte md5 state, there is
instead a per-cpu previously returned value for chaining.

Signed-off-by: Jason A. Donenfeld 
Cc: Jean-Philippe Aumasson 
Cc: Ted Tso 
---
Changes from v2->v3:

  - Structs are no longer packed, to mitigate slow byte-by-byte assignment.

 drivers/char/random.c | 52 ---
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d6876d506220..b1c2e3b26430 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -262,6 +262,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = {
 };
 #endif /* CONFIG_SYSCTL */
 
-static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] cacheline_aligned;
+static u8 random_int_secret[SIPHASH24_KEY_LEN] __aligned(SIPHASH24_ALIGNMENT);
 
 int random_int_secret_init(void)
 {
@@ -2050,8 +2051,7 @@ int random_int_secret_init(void)
return 0;
 }
 
-static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
-   __aligned(sizeof(unsigned long));
+static DEFINE_PER_CPU(u64, get_random_int_chaining);
 
 /*
  * Get a random word for internal kernel use only. Similar to urandom but
@@ -2061,19 +2061,26 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], 
get_random_int_hash)
  */
 unsigned int get_random_int(void)
 {
-   __u32 *hash;
unsigned int ret;
+   struct {
+   u64 chaining;
+   unsigned long ts;
+   unsigned long entropy;
+   pid_t pid;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined;
+   u64 *chaining;
 
if (arch_get_random_int())
return ret;
 
-   hash = get_cpu_var(get_random_int_hash);
-
-   hash[0] += current->pid + jiffies + random_get_entropy();
-   md5_transform(hash, random_int_secret);
-   ret = hash[0];
-   put_cpu_var(get_random_int_hash);
-
+   chaining = get_cpu_ptr(_random_int_chaining);
+   combined.chaining = *chaining;
+   combined.ts = jiffies;
+   combined.entropy = random_get_entropy();
+   combined.pid = current->pid;
+   ret = *chaining = siphash24((u8 *), offsetof(typeof(combined), 
end), random_int_secret);
+   put_cpu_ptr(chaining);
return ret;
 }
 EXPORT_SYMBOL(get_random_int);
@@ -2083,19 +2090,26 @@ EXPORT_SYMBOL(get_random_int);
  */
 unsigned long get_random_long(void)
 {
-   __u32 *hash;
unsigned long ret;
+   struct {
+   u64 chaining;
+   unsigned long ts;
+   unsigned long entropy;
+   pid_t pid;
+   char end[];
+   } __aligned(SIPHASH24_ALIGNMENT) combined;
+   u64 *chaining;
 
if (arch_get_random_long())
return ret;
 
-   hash = get_cpu_var(get_random_int_hash);
-
-   hash[0] += current->pid + jiffies + random_get_entropy();
-   md5_transform(hash, random_int_secret);
-   ret = *(unsigned long *)hash;
-   put_cpu_var(get_random_int_hash);
-
+   chaining = get_cpu_ptr(_random_int_chaining);
+   combined.chaining = *chaining;
+   combined.ts = jiffies;
+   combined.entropy = random_get_entropy();
+   combined.pid = current->pid;
+   ret = *chaining = siphash24((u8 *), offsetof(typeof(combined), 
end), random_int_secret);
+   put_cpu_ptr(chaining);
return ret;
 }
 EXPORT_SYMBOL(get_random_long);
-- 
2.11.0



[PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function.

SipHash isn't just some new trendy hash function. It's been around for a
while, and there really isn't anything that comes remotely close to
being useful in the way SipHash is. With that said, why do we need this?

There are a variety of attacks known as "hashtable poisoning" in which an
attacker forms some data such that the hash of that data will be the
same, and then preceeds to fill up all entries of a hashbucket. This is
a realistic and well-known denial-of-service vector.

Linux developers already seem to be aware that this is an issue, and
various places that use hash tables in, say, a network context, use a
non-cryptographically secure function (usually jhash) and then try to
twiddle with the key on a time basis (or in many cases just do nothing
and hope that nobody notices). While this is an admirable attempt at
solving the problem, it doesn't actually fix it. SipHash fixes it.

(It fixes it in such a sound way that you could even build a stream
cipher out of SipHash that would resist the modern cryptanalysis.)

There are a modicum of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.

Secondly, a few places are using MD5 for creating secure sequence
numbers, port numbers, or fast random numbers. Siphash is a faster, more
fittting, and more secure replacement for MD5 in those situations.

Dozens of languages are already using this internally for their hash
tables. Some of the BSDs already use this in their kernels. SipHash is
a widely known high-speed solution to a widely known problem, and it's
time we catch-up.

Signed-off-by: Jason A. Donenfeld 
Cc: Jean-Philippe Aumasson 
Cc: Daniel J. Bernstein 
Cc: Linus Torvalds 
Cc: Eric Biggers 
Cc: David Laight 
---
Changes from v2->v3:

  - There is now a fast aligned version of the function and a not-as-fast
unaligned version. The requirements for each have been documented in
a docbook-style comment. As well, the header now contains a constant
for the expected alignment.

  - The test suite has been updated to check both the unaligned and aligned
version of the function.

 include/linux/siphash.h |  30 ++
 lib/Kconfig.debug   |   6 +-
 lib/Makefile|   5 +-
 lib/siphash.c   | 153 
 lib/test_siphash.c  |  85 +++
 5 files changed, 274 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/siphash.h
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index ..82dc1a911a2e
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,30 @@
+/* Copyright (C) 2016 Jason A. Donenfeld 
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include 
+
+enum siphash_lengths {
+   SIPHASH24_KEY_LEN = 16,
+   SIPHASH24_ALIGNMENT = 8
+};
+
+u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]);
+
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+static inline u64 siphash24_unaligned(const u8 *data, size_t len, const u8 
key[SIPHASH24_KEY_LEN])
+{
+   return siphash24(data, len, key);
+}
+#else
+u64 siphash24_unaligned(const u8 *data, size_t len, const u8 
key[SIPHASH24_KEY_LEN]);
+#endif
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e6327d102184..32bbf689fc46 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1843,9 +1843,9 @@ config TEST_HASH
tristate "Perform selftest on hash functions"
default n
help
- Enable this option to test the kernel's integer ()
- and string () hash functions on boot
- (or module load).
+ Enable this option to test the kernel's integer (),
+ string (), and siphash ()
+ hash functions on boot (or module load).
 
  This is intended to help people writing architecture-specific
  optimized versions.  If unsure, say N.
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..71d398b04a74 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
 flex_proportions.o ratelimit.o show_mem.o \
 is_single_threaded.o plist.o decompress.o 

Re: [GIT PULL] f2fs update for 4.10

2016-12-14 Thread Jaegeuk Kim
On 12/14, Linus Torvalds wrote:
> On Mon, Dec 12, 2016 at 2:15 PM, Jaegeuk Kim  wrote:
> >
> > Could you please consider this pull request?
> 
> Pulled. Mind double-checking my resolution wrt commit 70fd76140a6c
> ("block,fs: use REQ_* flags directly")?

Thank you, and the resolution looks good to me as well.

Thanks,

> 
>  Linus


Re: [GIT PULL] f2fs update for 4.10

2016-12-14 Thread Jaegeuk Kim
On 12/14, Linus Torvalds wrote:
> On Mon, Dec 12, 2016 at 2:15 PM, Jaegeuk Kim  wrote:
> >
> > Could you please consider this pull request?
> 
> Pulled. Mind double-checking my resolution wrt commit 70fd76140a6c
> ("block,fs: use REQ_* flags directly")?

Thank you, and the resolution looks good to me as well.

Thanks,

> 
>  Linus


Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-14 Thread arvind Yadav

Hi David,

I have gave my comment.

Thanks
Arvind

On Wednesday 14 December 2016 11:44 PM, David Daney wrote:

On 12/14/2016 10:06 AM, arvind Yadav wrote:

Yes, I have seen this error. We have a device with very less memory.
Basically it's OMAP2 board. We have to port Android L on this.
It's has 3.10 kernel version. In this device, we were getting Page
allocation failure.


This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a 
ton of memory where ioremap can never fail, and it doesn't run 
Android, and you are talking about OMAP2.
  -I just gave as example where i have seen ioremap issue. 
Please don't relate. I know, Now it will not fail.  ioremap will through 
NULL on failure. We should catch this error. Even other driver of MIPS 
soc is having same check. It's just check which will not impact any 
functionality or performance of this driver. It will avoid NULL pointer 
error. We know, if  function is returning any error. we should catch.


Q1: Have you observed a failure on the device for which you are 
modifying the driver?

 -No, I did not observe this error.


Q2: Have you tested the patch on hardware that uses the driver you are 
modifying by running network traffic through the Ethernet interface 
this driver controls?

-Right Now we can not tested these kind of failure,


If you cannot answer yes to both of those questions, then you should 
probably note in the changelog that the patch is untested.





David.



Vmalloc size was not enough to run all application. So we have decide to
increase vmalloc reserve space. once we increases Vmalloc space.
We start getting ioremap falilure. Kernel is getting NULL-pointer
dereference error.

Here, It's just check to avoid any kernel crash because of ioremap 
failure.

We can keep this check to avoid this kind of scenario.

Thanks
-Arvind


On Wednesday 14 December 2016 11:02 PM, David Daney wrote:

On 12/14/2016 08:25 AM, Arvind Yadav wrote:

Here, If devm_ioremap will fail. It will return NULL.
Kernel can run into a NULL-pointer dereference.
This error check will avoid NULL pointer dereference.


i
Have you ever seen this failure in the wild?

How was the patch tested?

Thanks,
David Daney



Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 4ab404f..33c2fec 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1479,6 +1479,12 @@ static int octeon_mgmt_probe(struct
platform_device *pdev)
 p->agl = (u64)devm_ioremap(>dev, p->agl_phys, p->agl_size);
 p->agl_prt_ctl = (u64)devm_ioremap(>dev, 
p->agl_prt_ctl_phys,

p->agl_prt_ctl_size);
+if (!p->mix || !p->agl || !p->agl_prt_ctl) {
+dev_err(>dev, "failed to map I/O memory\n");
+result = -ENOMEM;
+goto err;
+}
+
 spin_lock_init(>lock);

 skb_queue_head_init(>tx_list);









Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-14 Thread arvind Yadav

Hi David,

I have gave my comment.

Thanks
Arvind

On Wednesday 14 December 2016 11:44 PM, David Daney wrote:

On 12/14/2016 10:06 AM, arvind Yadav wrote:

Yes, I have seen this error. We have a device with very less memory.
Basically it's OMAP2 board. We have to port Android L on this.
It's has 3.10 kernel version. In this device, we were getting Page
allocation failure.


This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a 
ton of memory where ioremap can never fail, and it doesn't run 
Android, and you are talking about OMAP2.
  -I just gave as example where i have seen ioremap issue. 
Please don't relate. I know, Now it will not fail.  ioremap will through 
NULL on failure. We should catch this error. Even other driver of MIPS 
soc is having same check. It's just check which will not impact any 
functionality or performance of this driver. It will avoid NULL pointer 
error. We know, if  function is returning any error. we should catch.


Q1: Have you observed a failure on the device for which you are 
modifying the driver?

 -No, I did not observe this error.


Q2: Have you tested the patch on hardware that uses the driver you are 
modifying by running network traffic through the Ethernet interface 
this driver controls?

-Right Now we can not tested these kind of failure,


If you cannot answer yes to both of those questions, then you should 
probably note in the changelog that the patch is untested.





David.



Vmalloc size was not enough to run all application. So we have decide to
increase vmalloc reserve space. once we increases Vmalloc space.
We start getting ioremap falilure. Kernel is getting NULL-pointer
dereference error.

Here, It's just check to avoid any kernel crash because of ioremap 
failure.

We can keep this check to avoid this kind of scenario.

Thanks
-Arvind


On Wednesday 14 December 2016 11:02 PM, David Daney wrote:

On 12/14/2016 08:25 AM, Arvind Yadav wrote:

Here, If devm_ioremap will fail. It will return NULL.
Kernel can run into a NULL-pointer dereference.
This error check will avoid NULL pointer dereference.


i
Have you ever seen this failure in the wild?

How was the patch tested?

Thanks,
David Daney



Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 4ab404f..33c2fec 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1479,6 +1479,12 @@ static int octeon_mgmt_probe(struct
platform_device *pdev)
 p->agl = (u64)devm_ioremap(>dev, p->agl_phys, p->agl_size);
 p->agl_prt_ctl = (u64)devm_ioremap(>dev, 
p->agl_prt_ctl_phys,

p->agl_prt_ctl_size);
+if (!p->mix || !p->agl || !p->agl_prt_ctl) {
+dev_err(>dev, "failed to map I/O memory\n");
+result = -ENOMEM;
+goto err;
+}
+
 spin_lock_init(>lock);

 skb_queue_head_init(>tx_list);









Re: [RFC] perf/x86/intel: Account interrupts for PEBS errors

2016-12-14 Thread Peter Zijlstra
On Wed, Dec 14, 2016 at 05:50:36PM +0100, Jiri Olsa wrote:
> 
> I also fail to reproduce on other than snb_x (model 45) server

reproduces on my ivb-ep as well model 62.

> thoughts?

cute find :-)

> +++ b/arch/x86/events/intel/ds.c
> @@ -1389,9 +1389,13 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs 
> *iregs)
>   continue;
>  
>   /* log dropped samples number */
> - if (error[bit])
> + if (error[bit]) {
>   perf_log_lost_samples(event, error[bit]);
>  
> + if (perf_event_account_interrupt(event, 1))

Seems a bit daft to expose the .throttle argument, since that would be
the only point of calling this.




> +static int __perf_event_overflow(struct perf_event *event,
> +int throttle, struct perf_sample_data *data,
> +struct pt_regs *regs)
> +{
> + int events = atomic_read(>event_limit);
> + struct hw_perf_event *hwc = >hw;
> + int ret = 0;
> +
> + /*
> +  * Non-sampling counters might still use the PMI to fold short
> +  * hardware counters, ignore those.
> +  */
> + if (unlikely(!is_sampling_event(event)))
> + return 0;
> +
> + ret = perf_event_account_interrupt(event, throttle);
> +
>   if (event->attr.freq) {
>   u64 now = perf_clock();
>   s64 delta = now - hwc->freq_time_stamp;

Arguably, everything in __perf_event_overflow() except for calling of
->overflow_handler() should be done I think.




Re: [RFC] perf/x86/intel: Account interrupts for PEBS errors

2016-12-14 Thread Peter Zijlstra
On Wed, Dec 14, 2016 at 05:50:36PM +0100, Jiri Olsa wrote:
> 
> I also fail to reproduce on other than snb_x (model 45) server

reproduces on my ivb-ep as well model 62.

> thoughts?

cute find :-)

> +++ b/arch/x86/events/intel/ds.c
> @@ -1389,9 +1389,13 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs 
> *iregs)
>   continue;
>  
>   /* log dropped samples number */
> - if (error[bit])
> + if (error[bit]) {
>   perf_log_lost_samples(event, error[bit]);
>  
> + if (perf_event_account_interrupt(event, 1))

Seems a bit daft to expose the .throttle argument, since that would be
the only point of calling this.




> +static int __perf_event_overflow(struct perf_event *event,
> +int throttle, struct perf_sample_data *data,
> +struct pt_regs *regs)
> +{
> + int events = atomic_read(>event_limit);
> + struct hw_perf_event *hwc = >hw;
> + int ret = 0;
> +
> + /*
> +  * Non-sampling counters might still use the PMI to fold short
> +  * hardware counters, ignore those.
> +  */
> + if (unlikely(!is_sampling_event(event)))
> + return 0;
> +
> + ret = perf_event_account_interrupt(event, throttle);
> +
>   if (event->attr.freq) {
>   u64 now = perf_clock();
>   s64 delta = now - hwc->freq_time_stamp;

Arguably, everything in __perf_event_overflow() except for calling of
->overflow_handler() should be done I think.




RE: [PATCH] perf tools: ignore zombie process for user profile

2016-12-14 Thread Liang, Kan


> 
> On Wed, Dec 14, 2016 at 12:48:05PM -0500, kan.li...@intel.com wrote:
> > From: Kan Liang 
> >
> > If user has zombie process, the perf record -u will error out.
> > Here is an example.
> >  $ ./testd &
> >  [1] 23796
> >  $ sudo perf record -e cycles -u kan
> >  Error:
> >  The sys_perf_event_open() syscall returned with 3 (No such process)
> > for  event (cycles).
> >  /bin/dmesg may provide additional information.
> >  No CONFIG_PERF_EVENTS=y kernel support configured?
> >
> > The source code of testd is as below.
> >  int main() {
> >
> > if (fork())
> > {
> > while (1);
> > }
> > return 0;
> >  }
> >
> > Zombie process is dead process. It is meaningless to profile it.
> > It's better to ignore it for user profile.
> 
> I recently posted different patch for same issue:
>   http://marc.info/?l=linux-kernel=148153895827359=2

The change as below make me confuse.
+   /* The system wide setup does not work with threads. */
+   if (!evsel->system_wide)
+   return false;
It looks the meaning of the comments is inconsistent with the code.


Your original patch doesn't work well with the issue.
But if I change the above code as below, the issue is fixed.
if (evsel->system_wide)
return false;

Thanks,
Kan



RE: [PATCH] perf tools: ignore zombie process for user profile

2016-12-14 Thread Liang, Kan


> 
> On Wed, Dec 14, 2016 at 12:48:05PM -0500, kan.li...@intel.com wrote:
> > From: Kan Liang 
> >
> > If user has zombie process, the perf record -u will error out.
> > Here is an example.
> >  $ ./testd &
> >  [1] 23796
> >  $ sudo perf record -e cycles -u kan
> >  Error:
> >  The sys_perf_event_open() syscall returned with 3 (No such process)
> > for  event (cycles).
> >  /bin/dmesg may provide additional information.
> >  No CONFIG_PERF_EVENTS=y kernel support configured?
> >
> > The source code of testd is as below.
> >  int main() {
> >
> > if (fork())
> > {
> > while (1);
> > }
> > return 0;
> >  }
> >
> > Zombie process is dead process. It is meaningless to profile it.
> > It's better to ignore it for user profile.
> 
> I recently posted different patch for same issue:
>   http://marc.info/?l=linux-kernel=148153895827359=2

The change as below make me confuse.
+   /* The system wide setup does not work with threads. */
+   if (!evsel->system_wide)
+   return false;
It looks the meaning of the comments is inconsistent with the code.


Your original patch doesn't work well with the issue.
But if I change the above code as below, the issue is fixed.
if (evsel->system_wide)
return false;

Thanks,
Kan



[GIT PULL] Audit patches for v4.10

2016-12-14 Thread Paul Moore
Hi Linus,

After the small number of patches for v4.9, we've got a much bigger pile for 
v4.10.

The bulk of these patches involve a rework of the audit backlog queue to 
enable us to move the netlink multicasting out of the task/thread that 
generates the audit record and into the kernel thread that emits the record 
(just like we do for the audit unicast to auditd).  While we were playing 
with the backlog queue(s) we fixed a number of other little problems with 
the code, and from all the testing so far things look to be in much better 
shape now.  Doing this also allowed us to re-enable disabling IRQs for some 
netns operations ("netns: avoid disabling irq for netns id").  The remaining 
patches fix some small problems that are well documented in the commit 
descriptions, as well as adding session ID filtering support.

You will likely hit two merge conflicts, one in net/core/net_namespace.c and 
one in include/uapi/linux/audit.h, both are easily resolved so I won't 
bother you with that here.  If you have questions, you know how to find me.

Thanks,
-Paul

---
The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:

  Linux 4.8 (2016-10-02 16:24:33 -0700)

are available in the git repository at:

  git://git.infradead.org/users/pcmoore/audit stable-4.10

for you to fetch changes up to 533c7b69c764ad5febb3e716899f43a75564fcab:

  audit: use proper refcount locking on audit_sock
 (2016-12-14 13:06:04 -0500)


Alexey Dobriyan (1):
  audit: less stack usage for /proc/*/loginuid

Paul Moore (9):
  audit: fixup audit_init()
  audit: queue netlink multicast sends just like we do for unicast sends
  audit: rename the queues and kauditd related functions
  audit: rework the audit queue handling
  audit: rework audit_log_start()
  audit: wake up kauditd_thread after auditd registers
  audit: handle a clean auditd shutdown with grace
  audit: don't ever sleep on a command record/message
  netns: avoid disabling irq for netns id

Richard Guy Briggs (5):
  audit: tame initialization warning len_abuf in audit_log_execve_info
  audit: skip sessionid sentinel value when auto-incrementing
  audit: add support for session ID user filter
  audit: move kaudit thread start from auditd registration to
 kaudit init (#2)
  audit: use proper refcount locking on audit_sock

Steve Grubb (1):
  audit: fix formatting of AUDIT_CONFIG_CHANGE events

 fs/proc/base.c |   2 +-
 include/uapi/linux/audit.h |   5 +-
 kernel/audit.c | 532 ---
 kernel/audit_fsnotify.c|   5 +-
 kernel/audit_tree.c|   3 +-
 kernel/audit_watch.c   |   5 +-
 kernel/auditfilter.c   |   5 +-
 kernel/auditsc.c   |  12 +-
 net/core/net_namespace.c   |  35 ++-
 9 files changed, 361 insertions(+), 243 deletions(-)

-- 
paul moore
security @ redhat



[GIT PULL] Audit patches for v4.10

2016-12-14 Thread Paul Moore
Hi Linus,

After the small number of patches for v4.9, we've got a much bigger pile for 
v4.10.

The bulk of these patches involve a rework of the audit backlog queue to 
enable us to move the netlink multicasting out of the task/thread that 
generates the audit record and into the kernel thread that emits the record 
(just like we do for the audit unicast to auditd).  While we were playing 
with the backlog queue(s) we fixed a number of other little problems with 
the code, and from all the testing so far things look to be in much better 
shape now.  Doing this also allowed us to re-enable disabling IRQs for some 
netns operations ("netns: avoid disabling irq for netns id").  The remaining 
patches fix some small problems that are well documented in the commit 
descriptions, as well as adding session ID filtering support.

You will likely hit two merge conflicts, one in net/core/net_namespace.c and 
one in include/uapi/linux/audit.h, both are easily resolved so I won't 
bother you with that here.  If you have questions, you know how to find me.

Thanks,
-Paul

---
The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:

  Linux 4.8 (2016-10-02 16:24:33 -0700)

are available in the git repository at:

  git://git.infradead.org/users/pcmoore/audit stable-4.10

for you to fetch changes up to 533c7b69c764ad5febb3e716899f43a75564fcab:

  audit: use proper refcount locking on audit_sock
 (2016-12-14 13:06:04 -0500)


Alexey Dobriyan (1):
  audit: less stack usage for /proc/*/loginuid

Paul Moore (9):
  audit: fixup audit_init()
  audit: queue netlink multicast sends just like we do for unicast sends
  audit: rename the queues and kauditd related functions
  audit: rework the audit queue handling
  audit: rework audit_log_start()
  audit: wake up kauditd_thread after auditd registers
  audit: handle a clean auditd shutdown with grace
  audit: don't ever sleep on a command record/message
  netns: avoid disabling irq for netns id

Richard Guy Briggs (5):
  audit: tame initialization warning len_abuf in audit_log_execve_info
  audit: skip sessionid sentinel value when auto-incrementing
  audit: add support for session ID user filter
  audit: move kaudit thread start from auditd registration to
 kaudit init (#2)
  audit: use proper refcount locking on audit_sock

Steve Grubb (1):
  audit: fix formatting of AUDIT_CONFIG_CHANGE events

 fs/proc/base.c |   2 +-
 include/uapi/linux/audit.h |   5 +-
 kernel/audit.c | 532 ---
 kernel/audit_fsnotify.c|   5 +-
 kernel/audit_tree.c|   3 +-
 kernel/audit_watch.c   |   5 +-
 kernel/auditfilter.c   |   5 +-
 kernel/auditsc.c   |  12 +-
 net/core/net_namespace.c   |  35 ++-
 9 files changed, 361 insertions(+), 243 deletions(-)

-- 
paul moore
security @ redhat



Re: [PATCH v2] infiniband: remove WARN that is not kernel bug

2016-12-14 Thread Leon Romanovsky
On Wed, Dec 14, 2016 at 01:16:45PM -0500, Doug Ledford wrote:
> On 11/21/2016 12:38 PM, Leon Romanovsky wrote:
> > On Mon, Nov 21, 2016 at 09:52:53AM -0700, Jason Gunthorpe wrote:
> >> On Mon, Nov 21, 2016 at 02:14:08PM +0200, Leon Romanovsky wrote:
> 
>  In ib_ucm_write function there is a wrong prefix:
> 
>  + pr_err_once("ucm_write: process %d (%s) tried to do something hinky\n",
> >>>
> >>> I did it intentionally to have the same errors for all flows.
> >>
> >> Lets actually use a good message too please?
> >>
> >>  pr_err_once("ucm_write: process %d (%s) changed security contexts after 
> >> opening FD, this is not allowed.\n",
> >>
> >> Jason
>
> I applied Leon's reworked version of this patch, thanks.

Thanks Doug,
I already forgot about it :)

>
> --
> Doug Ledford 
> GPG Key ID: 0E572FDD
>





signature.asc
Description: PGP signature


Re: [PATCH v2] infiniband: remove WARN that is not kernel bug

2016-12-14 Thread Leon Romanovsky
On Wed, Dec 14, 2016 at 01:16:45PM -0500, Doug Ledford wrote:
> On 11/21/2016 12:38 PM, Leon Romanovsky wrote:
> > On Mon, Nov 21, 2016 at 09:52:53AM -0700, Jason Gunthorpe wrote:
> >> On Mon, Nov 21, 2016 at 02:14:08PM +0200, Leon Romanovsky wrote:
> 
>  In ib_ucm_write function there is a wrong prefix:
> 
>  + pr_err_once("ucm_write: process %d (%s) tried to do something hinky\n",
> >>>
> >>> I did it intentionally to have the same errors for all flows.
> >>
> >> Lets actually use a good message too please?
> >>
> >>  pr_err_once("ucm_write: process %d (%s) changed security contexts after 
> >> opening FD, this is not allowed.\n",
> >>
> >> Jason
>
> I applied Leon's reworked version of this patch, thanks.

Thanks Doug,
I already forgot about it :)

>
> --
> Doug Ledford 
> GPG Key ID: 0E572FDD
>





signature.asc
Description: PGP signature


Re: [PATCH] Input: evdev: fix queueing of SYN_DROPPED event for EVIOCG[type] IOCTL case

2016-12-14 Thread Aniroop Mathur
Hello Mr. Torokhov,

Would you kindly update about this patch?
Thanks!

Best Regards,
Aniroop Mathur


On Fri, Nov 25, 2016 at 1:41 AM, Aniroop Mathur  wrote:
> Currently, when EVIOCG[type] ioctl call is issued and bits_to_user fails,
> then SYN_DROPPED event is inserted in the event queue always.
>
> However, it is not compulsory that some events are flushed out on every
> EVIOCG[type] ioctl call like in case of empty event queue and in case when
> EVIOCG[type] ioctl is issued for say A type of events but event queue does
> not have any A type of events but some other type of events.
>
> Therefore, insert SYN_DROPPED event only when some events have been flushed
> out from event queue plus bits_to_user fails.
>
> Signed-off-by: Aniroop Mathur 
> ---
>  drivers/input/evdev.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
> index e9ae3d5..f8b295e 100644
> --- a/drivers/input/evdev.c
> +++ b/drivers/input/evdev.c
> @@ -108,9 +108,11 @@ static bool __evdev_is_filtered(struct evdev_client 
> *client,
>  }
>
>  /* flush queued events of type @type, caller must hold client->buffer_lock */
> -static void __evdev_flush_queue(struct evdev_client *client, unsigned int 
> type)
> +static unsigned int __evdev_flush_queue(struct evdev_client *client,
> +   unsigned int type)
>  {
> unsigned int i, head, num;
> +   unsigned int drop_count = 0;
> unsigned int mask = client->bufsize - 1;
> bool is_report;
> struct input_event *ev;
> @@ -129,9 +131,11 @@ static void __evdev_flush_queue(struct evdev_client 
> *client, unsigned int type)
>
> if (ev->type == type) {
> /* drop matched entry */
> +   drop_count++;
> continue;
> } else if (is_report && !num) {
> /* drop empty SYN_REPORT groups */
> +   drop_count++;
> continue;
> } else if (head != i) {
> /* move entry to fill the gap */
> @@ -151,6 +155,7 @@ static void __evdev_flush_queue(struct evdev_client 
> *client, unsigned int type)
> }
>
> client->head = head;
> +   return drop_count;
>  }
>
>  static void __evdev_queue_syn_dropped(struct evdev_client *client)
> @@ -920,6 +925,7 @@ static int evdev_handle_get_val(struct evdev_client 
> *client,
> int ret;
> unsigned long *mem;
> size_t len;
> +   unsigned int drop_count = 0;
>
> len = BITS_TO_LONGS(maxbit) * sizeof(unsigned long);
> mem = kmalloc(len, GFP_KERNEL);
> @@ -933,12 +939,12 @@ static int evdev_handle_get_val(struct evdev_client 
> *client,
>
> spin_unlock(>event_lock);
>
> -   __evdev_flush_queue(client, type);
> +   drop_count = __evdev_flush_queue(client, type);
>
> spin_unlock_irq(>buffer_lock);
>
> ret = bits_to_user(mem, maxbit, maxlen, p, compat);
> -   if (ret < 0)
> +   if (ret < 0 && drop_count > 0)
> evdev_queue_syn_dropped(client);
>
> kfree(mem);
> --
> 2.6.2
>


Re: [PATCH] Input: evdev: fix queueing of SYN_DROPPED event for EVIOCG[type] IOCTL case

2016-12-14 Thread Aniroop Mathur
Hello Mr. Torokhov,

Would you kindly update about this patch?
Thanks!

Best Regards,
Aniroop Mathur


On Fri, Nov 25, 2016 at 1:41 AM, Aniroop Mathur  wrote:
> Currently, when EVIOCG[type] ioctl call is issued and bits_to_user fails,
> then SYN_DROPPED event is inserted in the event queue always.
>
> However, it is not compulsory that some events are flushed out on every
> EVIOCG[type] ioctl call like in case of empty event queue and in case when
> EVIOCG[type] ioctl is issued for say A type of events but event queue does
> not have any A type of events but some other type of events.
>
> Therefore, insert SYN_DROPPED event only when some events have been flushed
> out from event queue plus bits_to_user fails.
>
> Signed-off-by: Aniroop Mathur 
> ---
>  drivers/input/evdev.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
> index e9ae3d5..f8b295e 100644
> --- a/drivers/input/evdev.c
> +++ b/drivers/input/evdev.c
> @@ -108,9 +108,11 @@ static bool __evdev_is_filtered(struct evdev_client 
> *client,
>  }
>
>  /* flush queued events of type @type, caller must hold client->buffer_lock */
> -static void __evdev_flush_queue(struct evdev_client *client, unsigned int 
> type)
> +static unsigned int __evdev_flush_queue(struct evdev_client *client,
> +   unsigned int type)
>  {
> unsigned int i, head, num;
> +   unsigned int drop_count = 0;
> unsigned int mask = client->bufsize - 1;
> bool is_report;
> struct input_event *ev;
> @@ -129,9 +131,11 @@ static void __evdev_flush_queue(struct evdev_client 
> *client, unsigned int type)
>
> if (ev->type == type) {
> /* drop matched entry */
> +   drop_count++;
> continue;
> } else if (is_report && !num) {
> /* drop empty SYN_REPORT groups */
> +   drop_count++;
> continue;
> } else if (head != i) {
> /* move entry to fill the gap */
> @@ -151,6 +155,7 @@ static void __evdev_flush_queue(struct evdev_client 
> *client, unsigned int type)
> }
>
> client->head = head;
> +   return drop_count;
>  }
>
>  static void __evdev_queue_syn_dropped(struct evdev_client *client)
> @@ -920,6 +925,7 @@ static int evdev_handle_get_val(struct evdev_client 
> *client,
> int ret;
> unsigned long *mem;
> size_t len;
> +   unsigned int drop_count = 0;
>
> len = BITS_TO_LONGS(maxbit) * sizeof(unsigned long);
> mem = kmalloc(len, GFP_KERNEL);
> @@ -933,12 +939,12 @@ static int evdev_handle_get_val(struct evdev_client 
> *client,
>
> spin_unlock(>event_lock);
>
> -   __evdev_flush_queue(client, type);
> +   drop_count = __evdev_flush_queue(client, type);
>
> spin_unlock_irq(>buffer_lock);
>
> ret = bits_to_user(mem, maxbit, maxlen, p, compat);
> -   if (ret < 0)
> +   if (ret < 0 && drop_count > 0)
> evdev_queue_syn_dropped(client);
>
> kfree(mem);
> --
> 2.6.2
>


Re: [PATCH] ARM: dts: Add missing CPU frequencies for Exynos5422/5800

2016-12-14 Thread Krzysztof Kozlowski
On Tue, Dec 13, 2016 at 04:18:05PM -0300, Javier Martinez Canillas wrote:
> Hello Bartlomiej,
> 
> On 12/13/2016 01:52 PM, Bartlomiej Zolnierkiewicz wrote:
> > Add missing 2000MHz & 1900MHz OPPs (for A15 cores) and 1400MHz OPP
> > (for A7 cores).  Also update common Odroid-XU3 Lite/XU3/XU4 thermal
> > cooling maps to account for new OPPs.
> > 
> > Since new OPPs are not available on all Exynos5422/5800 boards modify
> > dts files for Odroid-XU3 Lite (limited to 1.8 GHz / 1.3 GHz) & Peach
> > Pi (limited to 2.0 GHz / 1.3 GHz) accordingly.
> > 
> > Tested on Odroid-XU3 and XU3 Lite.
> > 
> > Cc: Doug Anderson 
> > Cc: Javier Martinez Canillas 
> > Cc: Andreas Faerber 
> > Cc: Thomas Abraham 
> > Cc: Ben Gamari 
> > Signed-off-by: Bartlomiej Zolnierkiewicz 
> > ---
> >  arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi |   14 +++---
> >  arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts|   17 +
> >  arch/arm/boot/dts/exynos5800-peach-pi.dts  |4 
> >  arch/arm/boot/dts/exynos5800.dtsi  |   15 +++
> >  4 files changed, 43 insertions(+), 7 deletions(-)
> > 
> > Index: b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
> > ===
> > --- a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi2016-12-13 
> > 15:59:33.779763261 +0100
> > +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi2016-12-13 
> > 15:59:33.775763261 +0100
> > @@ -118,7 +118,7 @@
> > /*
> >  * When reaching cpu_alert3, reduce CPU
> >  * by 2 steps. On Exynos5422/5800 that would
> > -* be: 1600 MHz and 1100 MHz.
> > +* (usually) be: 1800 MHz and 1200 MHz.
> >  */
> > map3 {
> > trip = <_alert3>;
> > @@ -131,16 +131,16 @@
> >  
> > /*
> >  * When reaching cpu_alert4, reduce CPU
> > -* further, down to 600 MHz (11 steps for big,
> > -* 7 steps for LITTLE).
> > +* further, down to 600 MHz (13 steps for big,
> > +* 8 steps for LITTLE).
> >  */
> > -   map5 {
> > +   cooling_map5: map5 {
> > trip = <_alert4>;
> > -   cooling-device = < 3 7>;
> > +   cooling-device = < 3 8>;
> > };
> > -   map6 {
> > +   cooling_map6: map6 {
> > trip = <_alert4>;
> > -   cooling-device = < 3 11>;
> > +   cooling-device = < 3 13>;
> > };
> > };
> > };
> > Index: b/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts
> > ===
> > --- a/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts   2016-12-13 
> > 15:59:33.779763261 +0100
> > +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts   2016-12-13 
> > 15:59:33.775763261 +0100
> > @@ -21,6 +21,23 @@
> > compatible = "hardkernel,odroid-xu3-lite", "samsung,exynos5800", 
> > "samsung,exynos5";
> >  };
> >  
> > +_a15_opp_table {
> > +   /delete-node/opp@20;
> > +   /delete-node/opp@19;
> > +};
> > +
> > +_a7_opp_table {
> > +   /delete-node/opp@14;
> > +};
> > +
> 
> I think that a comment in the DTS why these operating points aren't available
> in this board will make more clear why the nodes are being deleted.
> 
> > +_map5 {
> > +   cooling-device = < 3 7>;
> > +};
> > +
> > +_map6 {
> > +   cooling-device = < 3 11>;
> > +};
> > +
> >   {
> > /*
> >  * PWM 0 -- fan
> > Index: b/arch/arm/boot/dts/exynos5800-peach-pi.dts
> > ===
> > --- a/arch/arm/boot/dts/exynos5800-peach-pi.dts 2016-12-13 
> > 15:59:33.779763261 +0100
> > +++ b/arch/arm/boot/dts/exynos5800-peach-pi.dts 2016-12-13 
> > 15:59:33.779763261 +0100
> > @@ -146,6 +146,10 @@
> > vdd-supply = <_reg>;
> >  };
> >  
> > +_a7_opp_table {
> > +   /delete-property/opp@14;
> > +};
> > +
> >   {
> > cpu-supply = <_reg>;
> >  };
> > Index: b/arch/arm/boot/dts/exynos5800.dtsi
> > ===
> > --- a/arch/arm/boot/dts/exynos5800.dtsi 2016-12-13 15:59:33.779763261 
> > +0100
> > +++ b/arch/arm/boot/dts/exynos5800.dtsi 2016-12-13 15:59:33.779763261 
> 

Re: [PATCH] ARM: dts: Add missing CPU frequencies for Exynos5422/5800

2016-12-14 Thread Krzysztof Kozlowski
On Tue, Dec 13, 2016 at 04:18:05PM -0300, Javier Martinez Canillas wrote:
> Hello Bartlomiej,
> 
> On 12/13/2016 01:52 PM, Bartlomiej Zolnierkiewicz wrote:
> > Add missing 2000MHz & 1900MHz OPPs (for A15 cores) and 1400MHz OPP
> > (for A7 cores).  Also update common Odroid-XU3 Lite/XU3/XU4 thermal
> > cooling maps to account for new OPPs.
> > 
> > Since new OPPs are not available on all Exynos5422/5800 boards modify
> > dts files for Odroid-XU3 Lite (limited to 1.8 GHz / 1.3 GHz) & Peach
> > Pi (limited to 2.0 GHz / 1.3 GHz) accordingly.
> > 
> > Tested on Odroid-XU3 and XU3 Lite.
> > 
> > Cc: Doug Anderson 
> > Cc: Javier Martinez Canillas 
> > Cc: Andreas Faerber 
> > Cc: Thomas Abraham 
> > Cc: Ben Gamari 
> > Signed-off-by: Bartlomiej Zolnierkiewicz 
> > ---
> >  arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi |   14 +++---
> >  arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts|   17 +
> >  arch/arm/boot/dts/exynos5800-peach-pi.dts  |4 
> >  arch/arm/boot/dts/exynos5800.dtsi  |   15 +++
> >  4 files changed, 43 insertions(+), 7 deletions(-)
> > 
> > Index: b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
> > ===
> > --- a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi2016-12-13 
> > 15:59:33.779763261 +0100
> > +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi2016-12-13 
> > 15:59:33.775763261 +0100
> > @@ -118,7 +118,7 @@
> > /*
> >  * When reaching cpu_alert3, reduce CPU
> >  * by 2 steps. On Exynos5422/5800 that would
> > -* be: 1600 MHz and 1100 MHz.
> > +* (usually) be: 1800 MHz and 1200 MHz.
> >  */
> > map3 {
> > trip = <_alert3>;
> > @@ -131,16 +131,16 @@
> >  
> > /*
> >  * When reaching cpu_alert4, reduce CPU
> > -* further, down to 600 MHz (11 steps for big,
> > -* 7 steps for LITTLE).
> > +* further, down to 600 MHz (13 steps for big,
> > +* 8 steps for LITTLE).
> >  */
> > -   map5 {
> > +   cooling_map5: map5 {
> > trip = <_alert4>;
> > -   cooling-device = < 3 7>;
> > +   cooling-device = < 3 8>;
> > };
> > -   map6 {
> > +   cooling_map6: map6 {
> > trip = <_alert4>;
> > -   cooling-device = < 3 11>;
> > +   cooling-device = < 3 13>;
> > };
> > };
> > };
> > Index: b/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts
> > ===
> > --- a/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts   2016-12-13 
> > 15:59:33.779763261 +0100
> > +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts   2016-12-13 
> > 15:59:33.775763261 +0100
> > @@ -21,6 +21,23 @@
> > compatible = "hardkernel,odroid-xu3-lite", "samsung,exynos5800", 
> > "samsung,exynos5";
> >  };
> >  
> > +_a15_opp_table {
> > +   /delete-node/opp@20;
> > +   /delete-node/opp@19;
> > +};
> > +
> > +_a7_opp_table {
> > +   /delete-node/opp@14;
> > +};
> > +
> 
> I think that a comment in the DTS why these operating points aren't available
> in this board will make more clear why the nodes are being deleted.
> 
> > +_map5 {
> > +   cooling-device = < 3 7>;
> > +};
> > +
> > +_map6 {
> > +   cooling-device = < 3 11>;
> > +};
> > +
> >   {
> > /*
> >  * PWM 0 -- fan
> > Index: b/arch/arm/boot/dts/exynos5800-peach-pi.dts
> > ===
> > --- a/arch/arm/boot/dts/exynos5800-peach-pi.dts 2016-12-13 
> > 15:59:33.779763261 +0100
> > +++ b/arch/arm/boot/dts/exynos5800-peach-pi.dts 2016-12-13 
> > 15:59:33.779763261 +0100
> > @@ -146,6 +146,10 @@
> > vdd-supply = <_reg>;
> >  };
> >  
> > +_a7_opp_table {
> > +   /delete-property/opp@14;
> > +};
> > +
> >   {
> > cpu-supply = <_reg>;
> >  };
> > Index: b/arch/arm/boot/dts/exynos5800.dtsi
> > ===
> > --- a/arch/arm/boot/dts/exynos5800.dtsi 2016-12-13 15:59:33.779763261 
> > +0100
> > +++ b/arch/arm/boot/dts/exynos5800.dtsi 2016-12-13 15:59:33.779763261 
> > +0100
> > @@ -24,6 +24,16 @@
> >  };
> >  
> >  _a15_opp_table {
> > +   opp@20 {
> > +   opp-hz = /bits/ 64 

<    4   5   6   7   8   9   10   11   12   13   >