Re: [PATCH RFC/RFT] sched/fair: Improve the behavior of sync flag

2017-08-28 Thread Mike Galbraith
On Sun, 2017-08-27 at 22:27 -0700, Joel Fernandes wrote:
> Hi Mike,
> 
> On Sun, Aug 27, 2017 at 11:07 AM, Mike Galbraith  wrote:
> > On Sat, 2017-08-26 at 23:39 -0700, Joel Fernandes wrote:
> >>
> >> Also about real world benchmarks, in Android we have usecases that
> >> show that graphics performance is affected and we risk frame drops if
> >> we don't use the sync flag, so this is a real world need.
> >
> > That likely has everything to do with cpufreq not realizing that your
> > CPUs really are quite busy when scheduling cross core at fairly high
> > frequency, and not clocking up properly.
> >
> 
> I'm glad you brought this point up. Since Android O, userspace
> processes are split across many more procedure calls due to a feature
> called Treble (which does this for security, modularity, etc.). Due to
> this, a lot of things that used to happen within a process boundary
> now happen across process boundaries over the binder bus. Early on,
> folks noticed that this caused performance issues when the sync flag
> was not used as a stronger hint. This can happen when two threads are
> on different CPUs in different frequency domains and are communicating
> over binder: the combined load of the two threads is divided between
> the individual CPUs, which causes both to run at a lower frequency.
> Whereas if they run together on the same CPU, their combined load
> drives that CPU to a higher frequency and they perform better. So a
> stronger sync actually helps this case if we're careful about using it
> when possible.

Sure, but isn't that really a cpufreq issue?  We schedule cross core
quite aggressively for obvious reasons.  Now on mostly idle handheld
devices, you may get better battery life by stacking tasks a bit more,
in which case a sync-me-harder flag may be what you really want/need,
but with modern CPUs, I'm kinda skeptical of that, would have to see
cold hard numbers to become a believer.  Iff deeper cstate etc for
longer does make a big difference, I can imagine wakeup time migrating
leftward if capacity exists as an "on battery" tactic (though that
thought also invokes some unpleasant bounce fest images).

-Mike
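
To make the per-domain frequency effect described above concrete, here is a
minimal userspace sketch of a schedutil-style frequency pick (roughly
freq = 1.25 * max_freq * util / capacity), evaluated independently per
frequency domain. The 25% margin and the utilization numbers are assumptions
chosen purely for illustration, not measurements from the Android workload or
the actual cpufreq code:

/*
 * Illustrative userspace sketch only, not kernel code: a simplified
 * schedutil-style pick, freq ~= 1.25 * max_freq * util / max_capacity,
 * evaluated independently per frequency domain.
 */
#include <stdio.h>

static unsigned int pick_freq(unsigned int max_freq_mhz, unsigned int util,
			      unsigned int max_cap)
{
	unsigned long f = (unsigned long)max_freq_mhz + (max_freq_mhz >> 2);

	return (unsigned int)(f * util / max_cap);
}

int main(void)
{
	unsigned int max_freq_mhz = 2000, max_cap = 1024;

	/* Producer/consumer pair split across two domains: each governor
	 * only sees its own CPU's share of the combined utilization. */
	printf("split  : each domain picks %u MHz\n",
	       pick_freq(max_freq_mhz, 300, max_cap));

	/* Both threads stacked in one domain: the governor sees the full
	 * combined utilization and picks a higher frequency. */
	printf("stacked: the shared domain picks %u MHz\n",
	       pick_freq(max_freq_mhz, 600, max_cap));

	return 0;
}

With these made-up numbers the split case settles around 730 MHz per CPU
while the stacked case picks roughly twice that, which is the effect the
stronger sync hint is being used to capture.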


Re: [PATCH 12/12] dma-mapping: turn dma_cache_sync into a dma_map_ops method

2017-08-28 Thread Geert Uytterhoeven
Hi Christoph,

On Sun, Aug 27, 2017 at 6:10 PM, Christoph Hellwig  wrote:
> After we removed all the dead wood it turns out only two architectures
> actually implement dma_cache_sync as a no-op: mips and parisc.  Add

s/no-op/real op/

> a cache_sync method to struct dma_map_ops and implement it for the
> mips default DMA ops, and the parisc pa11 ops.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH net-next v3 1/3] net/ncsi: Fix several packet definitions

2017-08-28 Thread Samuel Mendoza-Jonas
Signed-off-by: Samuel Mendoza-Jonas 
---
v2: Rebased on latest net-next

 net/ncsi/ncsi-cmd.c | 10 +-
 net/ncsi/ncsi-pkt.h |  2 +-
 net/ncsi/ncsi-rsp.c |  3 ++-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/ncsi/ncsi-cmd.c b/net/ncsi/ncsi-cmd.c
index 5e03ed190e18..7567ca63aae2 100644
--- a/net/ncsi/ncsi-cmd.c
+++ b/net/ncsi/ncsi-cmd.c
@@ -139,9 +139,9 @@ static int ncsi_cmd_handler_svf(struct sk_buff *skb,
struct ncsi_cmd_svf_pkt *cmd;
 
cmd = skb_put_zero(skb, sizeof(*cmd));
-   cmd->vlan = htons(nca->words[0]);
-   cmd->index = nca->bytes[2];
-   cmd->enable = nca->bytes[3];
+   cmd->vlan = htons(nca->words[1]);
+   cmd->index = nca->bytes[6];
+   cmd->enable = nca->bytes[7];
ncsi_cmd_build_header(&cmd->cmd.common, nca);
 
return 0;
@@ -153,7 +153,7 @@ static int ncsi_cmd_handler_ev(struct sk_buff *skb,
struct ncsi_cmd_ev_pkt *cmd;
 
cmd = skb_put_zero(skb, sizeof(*cmd));
-   cmd->mode = nca->bytes[0];
+   cmd->mode = nca->bytes[3];
ncsi_cmd_build_header(&cmd->cmd.common, nca);
 
return 0;
@@ -228,7 +228,7 @@ static struct ncsi_cmd_handler {
{ NCSI_PKT_CMD_AE, 8, ncsi_cmd_handler_ae  },
{ NCSI_PKT_CMD_SL, 8, ncsi_cmd_handler_sl  },
{ NCSI_PKT_CMD_GLS,0, ncsi_cmd_handler_default },
-   { NCSI_PKT_CMD_SVF,4, ncsi_cmd_handler_svf },
+   { NCSI_PKT_CMD_SVF,8, ncsi_cmd_handler_svf },
{ NCSI_PKT_CMD_EV, 4, ncsi_cmd_handler_ev  },
{ NCSI_PKT_CMD_DV, 0, ncsi_cmd_handler_default },
{ NCSI_PKT_CMD_SMA,8, ncsi_cmd_handler_sma },
diff --git a/net/ncsi/ncsi-pkt.h b/net/ncsi/ncsi-pkt.h
index 3ea49ed0a935..91b4b66438df 100644
--- a/net/ncsi/ncsi-pkt.h
+++ b/net/ncsi/ncsi-pkt.h
@@ -104,7 +104,7 @@ struct ncsi_cmd_svf_pkt {
unsigned char   index; /* VLAN table index  */
unsigned char   enable;/* Enable or disable */
__be32  checksum;  /* Checksum  */
-   unsigned char   pad[14];
+   unsigned char   pad[18];
 };
 
 /* Enable VLAN */
diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
index 087db775b3dc..c1a191d790e2 100644
--- a/net/ncsi/ncsi-rsp.c
+++ b/net/ncsi/ncsi-rsp.c
@@ -354,7 +354,8 @@ static int ncsi_rsp_handler_svf(struct ncsi_request *nr)
 
/* Add or remove the VLAN filter */
if (!(cmd->enable & 0x1)) {
-   ret = ncsi_remove_filter(nc, NCSI_FILTER_VLAN, cmd->index);
+   /* HW indexes from 1 */
+   ret = ncsi_remove_filter(nc, NCSI_FILTER_VLAN, cmd->index - 1);
} else {
vlan = ntohs(cmd->vlan);
ret = ncsi_add_filter(nc, NCSI_FILTER_VLAN, &vlan);
-- 
2.14.0



[PATCH v2 1/2] ARM: dts: sun7i: Fix A20-OLinuXino-MICRO dts for LAN8710

2017-08-28 Thread Stefan Mavrodiev
From revision J the board uses a new phy chip, LAN8710. Compared
with the RTL8201, the PA17 pin is TXERR. It has a pullup which keeps
the phy from working. To fix this, PA17 is muxed with the GMAC
function, which makes the pin output-low.

This patch is compatible with earlier board revisions, since this
pin wasn't connected to the phy.

Signed-off-by: Stefan Mavrodiev 
---
 arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts 
b/arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts
index 0b7403e..cb1b081 100644
--- a/arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts
+++ b/arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts
@@ -102,7 +102,7 @@
 
&gmac {
pinctrl-names = "default";
-   pinctrl-0 = <&gmac_pins_mii_a>;
+   pinctrl-0 = <&gmac_pins_mii_a>,<&gmac_txerr>;
phy = <>;
phy-mode = "mii";
status = "okay";
@@ -229,6 +229,11 @@
 };
 
&pio {
+   gmac_txerr: gmac_txerr@0 {
+   pins = "PA17";
+   function = "gmac";
+   };
+
mmc3_cd_pin_olinuxinom: mmc3_cd_pin@0 {
pins = "PH11";
function = "gpio_in";
-- 
2.7.4



[PATCH v2 0/2] Update board support for A20-OLinuXino-MICRO

2017-08-28 Thread Stefan Mavrodiev
From rev. J of A20-OLinuXino-MICRO, the board has a new PHY chip
(LAN8710) which replaces the RTL8201. There is also an option for a
4GB eMMC chip.

Changes in v2:
* Remove pinctrl request for eMMC reset pin
* Drop the idea of renaming boards with emmc
* Use txerr as gmac function

Stefan Mavrodiev (2):
  ARM: dts: sun7i: Fix A20-OLinuXino-MICRO dts for LAN8710
  ARM: dts: sun7i: Add dts file for A20-OLinuXino-MICRO-eMMC

 arch/arm/boot/dts/Makefile |  1 +
 .../boot/dts/sun7i-a20-olinuxino-micro-emmc.dts| 70 ++
 arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts|  7 ++-
 3 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/boot/dts/sun7i-a20-olinuxino-micro-emmc.dts

-- 
2.7.4



[PATCH v2 2/2] ARM: dts: sun7i: Add dts file for A20-OLinuXino-MICRO-eMMC

2017-08-28 Thread Stefan Mavrodiev
A20-OLinuXino-MICRO has an option for an onboard eMMC chip. For
now it's only shipped with a 4GB chip, but in the future this
may change.

Signed-off-by: Stefan Mavrodiev 
---
 arch/arm/boot/dts/Makefile |  1 +
 .../boot/dts/sun7i-a20-olinuxino-micro-emmc.dts| 70 ++
 2 files changed, 71 insertions(+)
 create mode 100644 arch/arm/boot/dts/sun7i-a20-olinuxino-micro-emmc.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 4b17f35..e1d1e93 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -880,6 +880,7 @@ dtb-$(CONFIG_MACH_SUN7I) += \
sun7i-a20-olinuxino-lime2.dtb \
sun7i-a20-olinuxino-lime2-emmc.dtb \
sun7i-a20-olinuxino-micro.dtb \
+   sun7i-a20-olinuxino-micro-emmc.dtb \
sun7i-a20-orangepi.dtb \
sun7i-a20-orangepi-mini.dtb \
sun7i-a20-pcduino3.dtb \
diff --git a/arch/arm/boot/dts/sun7i-a20-olinuxino-micro-emmc.dts 
b/arch/arm/boot/dts/sun7i-a20-olinuxino-micro-emmc.dts
new file mode 100644
index 000..d99e7b1
--- /dev/null
+++ b/arch/arm/boot/dts/sun7i-a20-olinuxino-micro-emmc.dts
@@ -0,0 +1,70 @@
+ /*
+ * Copyright 2017 Olimex Ltd.
+ * Stefan Mavrodiev 
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This file is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ * obtaining a copy of this software and associated documentation
+ * files (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use,
+ * copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following
+ * conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "sun7i-a20-olinuxino-micro.dts"
+
+/ {
+   model = "Olimex A20-OLinuXino-MICRO-eMMC";
+   compatible = "olimex,a20-olinuxino-micro-emmc", "allwinner,sun7i-a20";
+
+   mmc2_pwrseq: pwrseq {
+   compatible = "mmc-pwrseq-emmc";
+   reset-gpios = <&pio 2 16 GPIO_ACTIVE_LOW>;
+   };
+};
+
+&mmc2 {
+   pinctrl-names = "default";
+   pinctrl-0 = <&mmc2_pins_a>;
+   vmmc-supply = <&reg_vcc3v3>;
+   bus-width = <4>;
+   non-removable;
+   mmc-pwrseq = <&mmc2_pwrseq>;
+   status = "okay";
+
+   emmc: emmc@0 {
+   reg = <0>;
+   compatible = "mmc-card";
+   broken-hpi;
+   };
+};
-- 
2.7.4



[PATCH] powerpc/512x: clk: constify clk_div_table

2017-08-28 Thread Arvind Yadav
clk_div_table is not supposed to change at runtime. The
mpc512x_clk_divtable function works with a const
clk_div_table, so mark the non-const structs as const.

Signed-off-by: Arvind Yadav 
---
 arch/powerpc/platforms/512x/clock-commonclk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/512x/clock-commonclk.c 
b/arch/powerpc/platforms/512x/clock-commonclk.c
index add5a53..b3097fe 100644
--- a/arch/powerpc/platforms/512x/clock-commonclk.c
+++ b/arch/powerpc/platforms/512x/clock-commonclk.c
@@ -363,7 +363,7 @@ static int get_cpmf_mult_x2(void)
  */
 
 /* applies to the IPS_DIV, and PCI_DIV values */
-static struct clk_div_table divtab_2346[] = {
+static const struct clk_div_table divtab_2346[] = {
{ .val = 2, .div = 2, },
{ .val = 3, .div = 3, },
{ .val = 4, .div = 4, },
@@ -372,7 +372,7 @@ static int get_cpmf_mult_x2(void)
 };
 
 /* applies to the MBX_DIV, LPC_DIV, and NFC_DIV values */
-static struct clk_div_table divtab_1234[] = {
+static const struct clk_div_table divtab_1234[] = {
{ .val = 1, .div = 1, },
{ .val = 2, .div = 2, },
{ .val = 3, .div = 3, },
-- 
1.9.1



Re: [PATCH] connector: Delete an error message for a failed memory allocation in cn_queue_alloc_callback_entry()

2017-08-28 Thread Dan Carpenter
On Sun, Aug 27, 2017 at 11:16:06PM +, Waskiewicz Jr, Peter wrote:
> On 8/27/17 3:26 PM, SF Markus Elfring wrote:
> > From: Markus Elfring 
> > Date: Sun, 27 Aug 2017 21:18:37 +0200
> > 
> > Omit an extra message for a memory allocation failure in this function.
> > 
> > This issue was detected by using the Coccinelle software.
> 
> Did coccinelle trip on the message or the fact you weren't returning NULL?
> 

You've misread the patch somehow.  The existing code has a NULL return
and it's preserved in Markus's patch.  This sort of patch is to fix a
checkpatch.pl warning.  The error message from this kzalloc() isn't going
to get printed because it's a small allocation and small allocations
always succeed in current kernels.  But probably the main reason
checkpatch complains is that kmalloc() already prints a stack trace and
a bunch of other information, so the printk doesn't add anything.
Removing it saves a little memory.
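
For reference, the pattern such patches remove looks roughly like the sketch
below (hypothetical foo_* names, not the actual
cn_queue_alloc_callback_entry() code):

/* Schematic example of the checkpatch-flagged pattern, with hypothetical
 * names; not the actual connector code. */
struct foo_entry {
	struct list_head list;
	/* ... */
};

static struct foo_entry *foo_alloc_entry(void)
{
	struct foo_entry *entry;

	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
	if (!entry) {
		/* Redundant: for a small allocation without __GFP_NOWARN the
		 * allocator already dumps a warning plus a stack trace on
		 * failure, so this printk adds nothing. */
		pr_err("foo: failed to allocate callback entry\n");
		return NULL;
	}
	return entry;
}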

I'm mostly a fan of running checkpatch on new patches or staging and not
on old code...

regards,
dan carpenter



Re: Re: [PATCH] fix memory leak on kvm_vm_ioctl_create_spapr_tce

2017-08-28 Thread Paul Mackerras
On Mon, Aug 28, 2017 at 06:28:08AM +0100, Al Viro wrote:
> On Mon, Aug 28, 2017 at 02:38:37PM +1000, Paul Mackerras wrote:
> > On Sun, Aug 27, 2017 at 10:02:20PM +0100, Al Viro wrote:
> > > On Wed, Aug 23, 2017 at 04:06:24PM +1000, Paul Mackerras wrote:
> > > 
> > > > It seems to me that it would be better to do the anon_inode_getfd()
> > > > call before the kvm_get_kvm() call, and go to the fail label if it
> > > > fails.
> > > 
> > > And what happens if another thread does close() on the (guessed) fd?
> > 
> > Chaos ensues, but mostly because we don't have proper mutual exclusion
> > on the modifications to the list.  I'll add a mutex_lock/unlock to
> > kvm_spapr_tce_release() and move the anon_inode_getfd() call inside
> > the mutex.
> > 
> > It looks like the other possible uses of the fd (mmap, and passing it
> > as a parameter to the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM
> > device fd) are safe.
> 
> Frankly, it's a lot saner to have "no failure points past anon_inode_getfd()"
> policy...

Right.  In my latest patch, there are no failure points past
anon_inode_getfd().

Paul.
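
The policy being discussed can be sketched as follows (hypothetical
my_object_* names, not the actual kvm_vm_ioctl_create_spapr_tce() code):
every step that can fail, and every reference the release callback will
eventually drop, is completed before the fd is created, so a racing close()
on a guessed fd can only ever see a fully set up object.

/*
 * Hypothetical sketch of the "no failure points past anon_inode_getfd()"
 * policy; all my_object_* names are made up for illustration.
 */
struct my_object {
	struct list_head list;
	struct kvm *kvm;
	/* ... */
};

static long create_my_object_fd(struct kvm *kvm)
{
	struct my_object *obj;
	long ret;

	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
	if (!obj)
		return -ENOMEM;

	ret = my_object_init(obj);	/* all fallible setup first */
	if (ret)
		goto err_free;

	kvm_get_kvm(kvm);		/* dropped by the release() op */
	obj->kvm = kvm;

	mutex_lock(&kvm->lock);
	list_add(&obj->list, &kvm->my_object_list);

	/*
	 * Last point that can fail.  Once this succeeds the fd is live and
	 * another thread may close() it at any time, so nothing after this
	 * is allowed to fail or to touch state that release() also touches
	 * without the same locking.
	 */
	ret = anon_inode_getfd("my-object", &my_object_fops, obj, O_RDWR);
	if (ret < 0) {
		list_del(&obj->list);
		mutex_unlock(&kvm->lock);
		kvm_put_kvm(kvm);
		goto err_free;
	}
	mutex_unlock(&kvm->lock);

	return ret;

err_free:
	kfree(obj);
	return ret;
}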


[PATCH] [media] uvcvideo: zero seq number when disabling stream

2017-08-28 Thread Hans Yang
For bulk-based devices, when disabling the video stream, in addition
to issuing CLEAR_FEATURE(HALT) it is better to also select alternate
setting 0, or the sequence number on the host side will probably not
be reset to zero.

Then the next time the video stream starts, the device expects the
host to start packets from sequence number 0, but the host actually
continues the sequence number from the last transaction, and this
causes transaction errors.

This commit fixes this by selecting alternate setting 0 again when the
stream is disabled, as is already done for isoch-based devices.

The following error message will also be eliminated for some devices:
uvcvideo: Non-zero status (-71) in video completion handler.

Signed-off-by: Hans Yang 
---
 drivers/media/usb/uvc/uvc_video.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_video.c 
b/drivers/media/usb/uvc/uvc_video.c
index fb86d6af398d..ad80c2a6da6a 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1862,10 +1862,9 @@ int uvc_video_enable(struct uvc_streaming *stream, int 
enable)
 
if (!enable) {
uvc_uninit_video(stream, 1);
-   if (stream->intf->num_altsetting > 1) {
-   usb_set_interface(stream->dev->udev,
+   usb_set_interface(stream->dev->udev,
  stream->intfnum, 0);
-   } else {
+   if (stream->intf->num_altsetting == 1) {
/* UVC doesn't specify how to inform a bulk-based device
 * when the video stream is stopped. Windows sends a
 * CLEAR_FEATURE(HALT) request to the video streaming
-- 
2.1.4



[PATCH net-next v3 3/3] ftgmac100: Support NCSI VLAN filtering when available

2017-08-28 Thread Samuel Mendoza-Jonas
Register the ndo_vlan_rx_{add,kill}_vid callbacks and set the
NETIF_F_HW_VLAN_CTAG_FILTER if NCSI is available.
This allows the VLAN core to notify the NCSI driver when changes occur
so that the remote NCSI channel can be properly configured to filter on
the set VLAN tags.

Signed-off-by: Samuel Mendoza-Jonas 
---
v2: Moved ftgmac100 change into same patch and reordered

 drivers/net/ethernet/faraday/ftgmac100.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/faraday/ftgmac100.c 
b/drivers/net/ethernet/faraday/ftgmac100.c
index 34dae51effd4..05fe7123d5ae 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -1623,6 +1623,8 @@ static const struct net_device_ops ftgmac100_netdev_ops = 
{
 #ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller= ftgmac100_poll_controller,
 #endif
+   .ndo_vlan_rx_add_vid= ncsi_vlan_rx_add_vid,
+   .ndo_vlan_rx_kill_vid   = ncsi_vlan_rx_kill_vid,
 };
 
 static int ftgmac100_setup_mdio(struct net_device *netdev)
@@ -1837,6 +1839,9 @@ static int ftgmac100_probe(struct platform_device *pdev)
NETIF_F_GRO | NETIF_F_SG | NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_TX;
 
+   if (priv->use_ncsi)
+   netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
+
/* AST2400  doesn't have working HW checksum generation */
if (np && (of_device_is_compatible(np, "aspeed,ast2400-mac")))
netdev->hw_features &= ~NETIF_F_HW_CSUM;
-- 
2.14.0



[PATCH net-next v3 2/3] net/ncsi: Configure VLAN tag filter

2017-08-28 Thread Samuel Mendoza-Jonas
Make use of the ndo_vlan_rx_{add,kill}_vid callbacks to have the NCSI
stack process new VLAN tags and configure the channel VLAN filter
appropriately.
Several VLAN tags can be set and a "Set VLAN Filter" packet must be sent
for each one, meaning the ncsi_dev_state_config_svf state must be
repeated. An internal list of VLAN tags is maintained, and compared
against the current channel's ncsi_channel_filter in order to keep track
within the state. VLAN filters are removed in a similar manner, with the
introduction of the ncsi_dev_state_config_clear_vids state. The maximum
number of VLAN tag filters is determined by the "Get Capabilities"
response from the channel.

Signed-off-by: Samuel Mendoza-Jonas 
---
v3: - Add comment describing change to ncsi_find_filter()
- Catch NULL in clear_one_vid() from ncsi_get_filter()
- Simplify state changes when kicking updated channel

 include/net/ncsi.h |   2 +
 net/ncsi/internal.h|  11 ++
 net/ncsi/ncsi-manage.c | 308 -
 net/ncsi/ncsi-rsp.c|   9 +-
 4 files changed, 326 insertions(+), 4 deletions(-)

diff --git a/include/net/ncsi.h b/include/net/ncsi.h
index 68680baac0fd..1f96af46df49 100644
--- a/include/net/ncsi.h
+++ b/include/net/ncsi.h
@@ -28,6 +28,8 @@ struct ncsi_dev {
 };
 
 #ifdef CONFIG_NET_NCSI
+int ncsi_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid);
+int ncsi_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid);
 struct ncsi_dev *ncsi_register_dev(struct net_device *dev,
   void (*notifier)(struct ncsi_dev *nd));
 int ncsi_start_dev(struct ncsi_dev *nd);
diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
index 1308a56f2591..af3d636534ef 100644
--- a/net/ncsi/internal.h
+++ b/net/ncsi/internal.h
@@ -180,6 +180,7 @@ struct ncsi_channel {
 #define NCSI_CHANNEL_INACTIVE  1
 #define NCSI_CHANNEL_ACTIVE2
 #define NCSI_CHANNEL_INVISIBLE 3
+   boolreconfigure_needed;
spinlock_t  lock;   /* Protect filters etc */
struct ncsi_package *package;
struct ncsi_channel_version version;
@@ -235,6 +236,9 @@ enum {
ncsi_dev_state_probe_dp,
ncsi_dev_state_config_sp= 0x0301,
ncsi_dev_state_config_cis,
+   ncsi_dev_state_config_clear_vids,
+   ncsi_dev_state_config_svf,
+   ncsi_dev_state_config_ev,
ncsi_dev_state_config_sma,
ncsi_dev_state_config_ebf,
 #if IS_ENABLED(CONFIG_IPV6)
@@ -253,6 +257,12 @@ enum {
ncsi_dev_state_suspend_done
 };
 
+struct vlan_vid {
+   struct list_head list;
+   __be16 proto;
+   u16 vid;
+};
+
 struct ncsi_dev_priv {
struct ncsi_dev ndev;/* Associated NCSI device */
unsigned intflags;   /* NCSI device flags  */
@@ -276,6 +286,7 @@ struct ncsi_dev_priv {
struct work_struct  work;/* For channel management */
struct packet_type  ptype;   /* NCSI packet Rx handler */
struct list_headnode;/* Form NCSI device list  */
+   struct list_headvlan_vids;   /* List of active VLAN IDs */
 };
 
 struct ncsi_cmd_arg {
diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index a3bd5fa8ad09..11904b3b702d 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -38,6 +38,25 @@ static inline int ncsi_filter_size(int table)
return sizes[table];
 }
 
+u32 *ncsi_get_filter(struct ncsi_channel *nc, int table, int index)
+{
+   struct ncsi_channel_filter *ncf;
+   int size;
+
+   ncf = nc->filters[table];
+   if (!ncf)
+   return NULL;
+
+   size = ncsi_filter_size(table);
+   if (size < 0)
+   return NULL;
+
+   return ncf->data + size * index;
+}
+
+/* Find the first active filter in a filter table that matches the given
+ * data parameter. If data is NULL, this returns the first active filter.
+ */
 int ncsi_find_filter(struct ncsi_channel *nc, int table, void *data)
 {
struct ncsi_channel_filter *ncf;
@@ -58,7 +77,7 @@ int ncsi_find_filter(struct ncsi_channel *nc, int table, void 
*data)
index = -1;
while ((index = find_next_bit(bitmap, ncf->total, index + 1))
   < ncf->total) {
-   if (!memcmp(ncf->data + size * index, data, size)) {
+   if (!data || !memcmp(ncf->data + size * index, data, size)) {
spin_unlock_irqrestore(&nc->lock, flags);
return index;
}
@@ -639,6 +658,95 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_functional;
 }
 
+/* Check the VLAN filter bitmap for a set filter, and construct a
+ * "Set VLAN Filter - Disable" packet if found.
+ */
+static int clear_one_vid(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc,
+ 

[PATCH net-next v3 0/3] NCSI VLAN Filtering Support

2017-08-28 Thread Samuel Mendoza-Jonas
This series (mainly patch 2) adds VLAN filtering to the NCSI implementation.
A fair amount of code already exists in the NCSI stack for VLAN filtering but
none of it is actually hooked up. This goes the final mile and fixes a few
bugs in the existing code found along the way (patch 1).

Patch 3 adds the appropriate flag and callbacks to the ftgmac100 driver to
enable filtering as it's a large consumer of NCSI (and what I've been
testing on).

v3: - Add comment describing change to ncsi_find_filter()
- Catch NULL in clear_one_vid() from ncsi_get_filter()
- Simplify state changes when kicking updated channel

Samuel Mendoza-Jonas (3):
  net/ncsi: Fix several packet definitions
  net/ncsi: Configure VLAN tag filter
  ftgmac100: Support NCSI VLAN filtering when available

 drivers/net/ethernet/faraday/ftgmac100.c |   5 +
 include/net/ncsi.h   |   2 +
 net/ncsi/internal.h  |  11 ++
 net/ncsi/ncsi-cmd.c  |  10 +-
 net/ncsi/ncsi-manage.c   | 308 ++-
 net/ncsi/ncsi-pkt.h  |   2 +-
 net/ncsi/ncsi-rsp.c  |  12 +-
 7 files changed, 339 insertions(+), 11 deletions(-)

-- 
2.14.0



Re: [PATCH] staging: rtl8723bs: remove memset before memcpy

2017-08-28 Thread Dan Carpenter
On Mon, Aug 28, 2017 at 01:43:31AM +0530, Himanshu Jha wrote:
> calling memcpy immediately after memset with the same region of memory
> makes memset redundant.
> 
> Build successfully.
> 

Thanks for the patch, it looks good.  You don't need to say that it
builds successfully, because we already assume that's true.

> Signed-off-by: Himanshu Jha 
> ---

Sometimes I put a comment here under the cut off line if I want people
to know that I haven't tested a patch.

Anyway, don't resend the patch.  It's fine as-is (unless Greg
complains) but it's just for future reference.

regards,
dan carpenter
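
For reference, the redundancy described in the quoted changelog looks like
the sketch below (illustrative only, not the actual rtl8723bs code):

/* Schematic illustration of the redundancy, not the actual rtl8723bs code. */
#include <string.h>

#define ADDR_LEN 6

static void copy_addr_redundant(unsigned char *dst, const unsigned char *src)
{
	/* Every byte cleared here ... */
	memset(dst, 0, ADDR_LEN);
	/* ... is immediately overwritten in full here, so the memset()
	 * has no observable effect. */
	memcpy(dst, src, ADDR_LEN);
}

static void copy_addr(unsigned char *dst, const unsigned char *src)
{
	/* The memcpy() alone is sufficient whenever it covers the whole
	 * region the memset() would have cleared. */
	memcpy(dst, src, ADDR_LEN);
}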



Re: [PATCH net-next v2 05/14] net: mvpp2: do not force the link mode

2017-08-28 Thread Antoine Tenart
Hi Russell,

On Fri, Aug 25, 2017 at 11:43:13PM +0100, Russell King - ARM Linux wrote:
> On Fri, Aug 25, 2017 at 04:48:12PM +0200, Antoine Tenart wrote:
> > The link mode (speed, duplex) was forced based on what the phylib
> > returns. This should not be the case, and only forced by ethtool
> > functions manually. This patch removes the link mode enforcement from
> > the phylib link_event callback.
> 
> So how does RGMII work (which has no in-band signalling between the PHY
> and MAC)?
> 
> phylib expects the network driver to configure it according to the PHY
> state at link_event time - I think you need to explain more why you
> think that this is not necessary.

Good catch, this won't work properly with RGMII. This could be done
out-of-band according to the spec, but that would use PHY polling and we
do not want that (the same concern was raised by Andrew on another
patch).

I'll keep this mode enforcement for RGMII then.

Thanks!
Antoine

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com




[patch V3 39/44] x86/idt: Move APIC gate initialization to tables

2017-08-28 Thread Thomas Gleixner
Replace the APIC/SMP vector gate initialization with the table based
mechanism.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/desc.h |1 
 arch/x86/kernel/idt.c   |   48 ++
 arch/x86/kernel/irqinit.c   |   69 
 3 files changed, 50 insertions(+), 68 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -507,6 +507,7 @@ static inline void load_current_idt(void
 extern void idt_setup_early_handler(void);
 extern void idt_setup_early_traps(void);
 extern void idt_setup_traps(void);
+extern void idt_setup_apic_and_irq_gates(void);
 
 #ifdef CONFIG_X86_64
 extern void idt_setup_early_pf(void);
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -103,6 +103,46 @@ static const __initdata struct idt_data
 #endif
 };
 
+/*
+ * The APIC and SMP idt entries
+ */
+static const __initdata struct idt_data apic_idts[] = {
+#ifdef CONFIG_SMP
+   INTG(RESCHEDULE_VECTOR, reschedule_interrupt),
+   INTG(CALL_FUNCTION_VECTOR,  call_function_interrupt),
+   INTG(CALL_FUNCTION_SINGLE_VECTOR, call_function_single_interrupt),
+   INTG(IRQ_MOVE_CLEANUP_VECTOR,   irq_move_cleanup_interrupt),
+   INTG(REBOOT_VECTOR, reboot_interrupt),
+#endif
+
+#ifdef CONFIG_X86_THERMAL_VECTOR
+   INTG(THERMAL_APIC_VECTOR,   thermal_interrupt),
+#endif
+
+#ifdef CONFIG_X86_MCE_THRESHOLD
+   INTG(THRESHOLD_APIC_VECTOR, threshold_interrupt),
+#endif
+
+#ifdef CONFIG_X86_MCE_AMD
+   INTG(DEFERRED_ERROR_VECTOR, deferred_error_interrupt),
+#endif
+
+#ifdef CONFIG_X86_LOCAL_APIC
+   INTG(LOCAL_TIMER_VECTOR,apic_timer_interrupt),
+   INTG(X86_PLATFORM_IPI_VECTOR,   x86_platform_ipi),
+# ifdef CONFIG_HAVE_KVM
+   INTG(POSTED_INTR_VECTOR,kvm_posted_intr_ipi),
+   INTG(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi),
+   INTG(POSTED_INTR_NESTED_VECTOR, kvm_posted_intr_nested_ipi),
+# endif
+# ifdef CONFIG_IRQ_WORK
+   INTG(IRQ_WORK_VECTOR,   irq_work_interrupt),
+# endif
+   INTG(SPURIOUS_APIC_VECTOR,  spurious_interrupt),
+   INTG(ERROR_APIC_VECTOR, error_interrupt),
+#endif
+};
+
 #ifdef CONFIG_X86_64
 /*
  * Early traps running on the DEFAULT_STACK because the other interrupt
@@ -242,6 +282,14 @@ void __init idt_setup_debugidt_traps(voi
 #endif
 
 /**
+ * idt_setup_apic_and_irq_gates - Setup APIC/SMP and normal interrupt gates
+ */
+void __init idt_setup_apic_and_irq_gates(void)
+{
+   idt_setup_from_table(idt_table, apic_idts, ARRAY_SIZE(apic_idts));
+}
+
+/**
  * idt_setup_early_handler - Initializes the idt table with early handlers
  */
 void __init idt_setup_early_handler(void)
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -87,73 +87,6 @@ void __init init_IRQ(void)
x86_init.irqs.intr_init();
 }
 
-static void __init smp_intr_init(void)
-{
-#ifdef CONFIG_SMP
-   /*
-* The reschedule interrupt is a CPU-to-CPU reschedule-helper
-* IPI, driven by wakeup.
-*/
-   alloc_intr_gate(RESCHEDULE_VECTOR, reschedule_interrupt);
-
-   /* IPI for generic function call */
-   alloc_intr_gate(CALL_FUNCTION_VECTOR, call_function_interrupt);
-
-   /* IPI for generic single function call */
-   alloc_intr_gate(CALL_FUNCTION_SINGLE_VECTOR,
-   call_function_single_interrupt);
-
-   /* Low priority IPI to cleanup after moving an irq */
-   set_intr_gate(IRQ_MOVE_CLEANUP_VECTOR, irq_move_cleanup_interrupt);
-   set_bit(IRQ_MOVE_CLEANUP_VECTOR, used_vectors);
-
-   /* IPI used for rebooting/stopping */
-   alloc_intr_gate(REBOOT_VECTOR, reboot_interrupt);
-#endif /* CONFIG_SMP */
-}
-
-static void __init apic_intr_init(void)
-{
-   smp_intr_init();
-
-#ifdef CONFIG_X86_THERMAL_VECTOR
-   alloc_intr_gate(THERMAL_APIC_VECTOR, thermal_interrupt);
-#endif
-#ifdef CONFIG_X86_MCE_THRESHOLD
-   alloc_intr_gate(THRESHOLD_APIC_VECTOR, threshold_interrupt);
-#endif
-
-#ifdef CONFIG_X86_MCE_AMD
-   alloc_intr_gate(DEFERRED_ERROR_VECTOR, deferred_error_interrupt);
-#endif
-
-#ifdef CONFIG_X86_LOCAL_APIC
-   /* self generated IPI for local APIC timer */
-   alloc_intr_gate(LOCAL_TIMER_VECTOR, apic_timer_interrupt);
-
-   /* IPI for X86 platform specific use */
-   alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
-#ifdef CONFIG_HAVE_KVM
-   /* IPI for KVM to deliver posted interrupt */
-   alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
-   /* IPI for KVM to deliver interrupt to wake up tasks */
-   alloc_intr_gate(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi);
-   /* IPI for KVM to deliver nested posted interrupt */
-   alloc_intr_gate(POSTED_INTR_NESTED_VECTOR, kvm_posted_intr_nested_ipi);
-#endif
-
-   /* IPI vectors for APIC spurious and error interrupts */
-   

Re: [RFC] workqueue: remove manual lockdep uses to detect deadlocks

2017-08-28 Thread Peter Zijlstra
On Fri, Aug 25, 2017 at 05:41:03PM +0900, Byungchul Park wrote:
> Hello all,
> 
> This is _RFC_.
> 
> I want to request for comments about if it's reasonable conceptually. If
> yes, I want to resend after working it more carefully.
> 
> Could you let me know your opinions about this?
> 
> ->8-
> From 448360c343477fff63df766544eec4620657a59e Mon Sep 17 00:00:00 2001
> From: Byungchul Park 
> Date: Fri, 25 Aug 2017 17:35:07 +0900
> Subject: [RFC] workqueue: remove manual lockdep uses to detect deadlocks
> 
> We introduced the following commit to detect deadlocks caused by
> wait_for_completion() in flush_{workqueue, work}() and other locks. But
> now that LOCKDEP_COMPLETIONS has been introduced, this work is done
> automatically by LOCKDEP_COMPLETIONS, so it doesn't have to be done
> manually anymore. Remove it.
> 

No.. the existing annotation is strictly better because it will _always_
warn. It doesn't need to first observe things just right.


[patch V3 37/44] x86/idt: Move ist stack based traps to table init

2017-08-28 Thread Thomas Gleixner
Initialize the IST based traps via a table

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/desc.h |2 ++
 arch/x86/kernel/idt.c   |   22 ++
 arch/x86/kernel/traps.c |9 +
 3 files changed, 25 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -509,9 +509,11 @@ extern void idt_setup_early_traps(void);
 
 #ifdef CONFIG_X86_64
 extern void idt_setup_early_pf(void);
+extern void idt_setup_ist_traps(void);
 extern void idt_setup_debugidt_traps(void);
 #else
 static inline void idt_setup_early_pf(void) { }
+static inline void idt_setup_ist_traps(void) { }
 static inline void idt_setup_debugidt_traps(void) { }
 #endif
 
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -92,6 +92,20 @@ struct desc_ptr idt_descr __ro_after_ini
 gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
 
 /*
+ * The exceptions which use Interrupt stacks. They are setup after
+ * cpu_init() when the TSS has been initialized.
+ */
+static const __initdata struct idt_data ist_idts[] = {
+   ISTG(X86_TRAP_DB,   debug,  DEBUG_STACK),
+   ISTG(X86_TRAP_NMI,  nmi,NMI_STACK),
+   ISTG(X86_TRAP_BP,   int3,   DEBUG_STACK),
+   ISTG(X86_TRAP_DF,   double_fault,   DOUBLEFAULT_STACK),
+#ifdef CONFIG_X86_MCE
+   ISTG(X86_TRAP_MC,   &machine_check, MCE_STACK),
+#endif
+};
+
+/*
  * Override for the debug_idt. Same as the default, but with interrupt
  * stack set to DEFAULT_STACK (0). Required for NMI trap handling.
  */
@@ -158,6 +172,14 @@ void __init idt_setup_early_pf(void)
 }
 
 /**
+ * idt_setup_ist_traps - Initialize the idt table with traps using IST
+ */
+void __init idt_setup_ist_traps(void)
+{
+   idt_setup_from_table(idt_table, ist_idts, ARRAY_SIZE(ist_idts));
+}
+
+/**
  * idt_setup_debugidt_traps - Initialize the debug idt table with debug traps
  */
 void __init idt_setup_debugidt_traps(void)
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -979,14 +979,7 @@ void __init trap_init(void)
 */
cpu_init();
 
-   /*
-* X86_TRAP_DB and X86_TRAP_BP have been set
-* in early_trap_init(). However, ITS works only after
-* cpu_init() loads TSS. See comments in early_trap_init().
-*/
-   set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
-   /* int3 can be called from all */
-   set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
+   idt_setup_ist_traps();
 
x86_init.irqs.trap_init();
 




[patch V3 26/44] x86/gdt: Use bitfields for initialization

2017-08-28 Thread Thomas Gleixner
The GDT entry related code uses a mix of bitfields and macros which
initialize the two 16-bit parts of the entry with magic shift and mask
operations.

Clean it up and use the bitfields to initialize and access entries.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/entry/vdso/vma.c|2 -
 arch/x86/include/asm/desc.h  |   26 ++-
 arch/x86/include/asm/desc_defs.h |   44 +--
 arch/x86/math-emu/fpu_system.h   |2 -
 4 files changed, 38 insertions(+), 36 deletions(-)

--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -351,7 +351,7 @@ static void vgetcpu_cpu_init(void *arg)
 * and 8 bits for the node)
 */
d.limit0 = cpu | ((node & 0xf) << 12);
-   d.limit = node >> 4;
+   d.limit1 = node >> 4;
d.type = 5; /* RO data, expand down, accessed */
d.dpl = 3;  /* Visible to user code */
d.s = 1;/* Not a system segment */
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -23,7 +23,7 @@ static inline void fill_ldt(struct desc_
desc->s = 1;
desc->dpl   = 0x3;
desc->p = info->seg_not_present ^ 1;
-   desc->limit = (info->limit & 0xf) >> 16;
+   desc->limit1= (info->limit & 0xf) >> 16;
desc->avl   = info->useable;
desc->d = info->seg_32bit;
desc->g = info->limit_in_pages;
@@ -170,14 +170,20 @@ static inline void pack_descriptor(struc
   unsigned long limit, unsigned char type,
   unsigned char flags)
 {
-   desc->a = ((base & 0x) << 16) | (limit & 0x);
-   desc->b = (base & 0xff00) | ((base & 0xff) >> 16) |
-   (limit & 0x000f) | ((type & 0xff) << 8) |
-   ((flags & 0xf) << 20);
-   desc->p = 1;
+   desc->limit0= (u16) limit;
+   desc->base0 = (u16) base;
+   desc->base1 = (base >> 16) & 0xFF;
+   desc->type  = type & 0x0F;
+   desc->s = 0;
+   desc->dpl   = 0;
+   desc->p = 1;
+   desc->limit1= (limit >> 16) & 0xF;
+   desc->avl   = (flags >> 0) & 0x01;
+   desc->l = (flags >> 1) & 0x01;
+   desc->d = (flags >> 2) & 0x01;
+   desc->g = (flags >> 3) & 0x01;
 }
 
-
 static inline void set_tssldt_descriptor(void *d, unsigned long addr,
 unsigned type, unsigned size)
 {
@@ -195,7 +201,7 @@ static inline void set_tssldt_descriptor
desc->base2 = (addr >> 24) & 0xFF;
desc->base3 = (u32) (addr >> 32);
 #else
-   pack_descriptor((struct desc_struct *)d, addr, size, 0x80 | type, 0);
+   pack_descriptor((struct desc_struct *)d, addr, size, type, 0);
 #endif
 }
 
@@ -395,13 +401,13 @@ static inline void set_desc_base(struct
 
 static inline unsigned long get_desc_limit(const struct desc_struct *desc)
 {
-   return desc->limit0 | (desc->limit << 16);
+   return desc->limit0 | (desc->limit1 << 16);
 }
 
 static inline void set_desc_limit(struct desc_struct *desc, unsigned long 
limit)
 {
desc->limit0 = limit & 0x;
-   desc->limit = (limit >> 16) & 0xf;
+   desc->limit1 = (limit >> 16) & 0xf;
 }
 
 #ifdef CONFIG_X86_64
--- a/arch/x86/include/asm/desc_defs.h
+++ b/arch/x86/include/asm/desc_defs.h
@@ -11,34 +11,30 @@
 
 #include 
 
-/*
- * FIXME: Accessing the desc_struct through its fields is more elegant,
- * and should be the one valid thing to do. However, a lot of open code
- * still touches the a and b accessors, and doing this allow us to do it
- * incrementally. We keep the signature as a struct, rather than a union,
- * so we can get rid of it transparently in the future -- glommer
- */
 /* 8 byte segment descriptor */
 struct desc_struct {
-   union {
-   struct {
-   unsigned int a;
-   unsigned int b;
-   };
-   struct {
-   u16 limit0;
-   u16 base0;
-   unsigned base1: 8, type: 4, s: 1, dpl: 2, p: 1;
-   unsigned limit: 4, avl: 1, l: 1, d: 1, g: 1, base2: 8;
-   };
-   };
+   u16 limit0;
+   u16 base0;
+   u16 base1: 8, type: 4, s: 1, dpl: 2, p: 1;
+   u16 limit1: 4, avl: 1, l: 1, d: 1, g: 1, base2: 8;
 } __attribute__((packed));
 
-#define GDT_ENTRY_INIT(flags, base, limit) { { { \
-   .a = ((limit) & 0x) | (((base) & 0x) << 16), \
-   .b = (((base) & 0xff) >> 16) | (((flags) & 0xf0ff) << 8) | \
-   ((limit) & 0xf) | ((base) & 

[patch V3 34/44] x86/idt: Prepare for table based init

2017-08-28 Thread Thomas Gleixner
The IDT setup code is handled in several places. All of them use variants
of the set_intr_gate() inlines. This can be done with a table based
initialization, which makes it possible to reduce the inline zoo and put
all IDT related code and information into a single place.

Add the infrastructure.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/idt.c |   67 ++
 1 file changed, 67 insertions(+)

--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -5,8 +5,49 @@
  */
 #include 
 
+#include 
+#include 
 #include 
 
+struct idt_data {
+   unsigned intvector;
+   unsigned intsegment;
+   struct idt_bits bits;
+   const void  *addr;
+};
+
+#define DPL0   0x0
+#define DPL3   0x3
+
+#define DEFAULT_STACK  0
+
+#define G(_vector, _addr, _ist, _type, _dpl, _segment) \
+   {   \
+   .vector = _vector,  \
+   .bits.ist   = _ist, \
+   .bits.type  = _type,\
+   .bits.dpl   = _dpl, \
+   .bits.p = 1,\
+   .addr   = _addr,\
+   .segment= _segment, \
+   }
+
+/* Interrupt gate */
+#define INTG(_vector, _addr)   \
+   G(_vector, _addr, DEFAULT_STACK, GATE_INTERRUPT, DPL0, __KERNEL_CS)
+
+/* System interrupt gate */
+#define SYSG(_vector, _addr)   \
+   G(_vector, _addr, DEFAULT_STACK, GATE_INTERRUPT, DPL3, __KERNEL_CS)
+
+/* Interrupt gate with interrupt stack */
+#define ISTG(_vector, _addr, _ist) \
+   G(_vector, _addr, _ist, GATE_INTERRUPT, DPL0, __KERNEL_CS)
+
+/* Task gate */
+#define TSKG(_vector, _gdt)\
+   G(_vector, NULL, DEFAULT_STACK, GATE_TASK, DPL0, _gdt << 3)
+
 /* Must be page-aligned because the real IDT is used in a fixmap. */
 gate_desc idt_table[IDT_ENTRIES] __page_aligned_bss;
 
@@ -25,6 +66,32 @@ const struct desc_ptr debug_idt_descr =
 };
 #endif
 
+static inline void idt_init_desc(gate_desc *gate, const struct idt_data *d)
+{
+   unsigned long addr = (unsigned long) d->addr;
+
+   gate->offset_low= (u16) addr;
+   gate->segment   = (u16) d->segment;
+   gate->bits  = d->bits;
+   gate->offset_middle = (u16) (addr >> 16);
+#ifdef CONFIG_X86_64
+   gate->offset_high   = (u32) (addr >> 32);
+   gate->reserved  = 0;
+#endif
+}
+
+static __init void
+idt_setup_from_table(gate_desc *idt, const struct idt_data *t, int size)
+{
+   gate_desc desc;
+
+   for (; size > 0; t++, size--) {
+   idt_init_desc(&desc, t);
+   set_bit(t->vector, used_vectors);
+   write_idt_entry(idt, t->vector, &desc);
+   }
+}
+
 /**
  * idt_setup_early_handler - Initializes the idt table with early handlers
  */




[patch V3 36/44] x86/idt: Move debug stack init to table based

2017-08-28 Thread Thomas Gleixner
Add the debug_idt init table and make use of it.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/desc.h |2 ++
 arch/x86/kernel/idt.c   |   23 +++
 arch/x86/kernel/traps.c |6 +-
 3 files changed, 26 insertions(+), 5 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -509,8 +509,10 @@ extern void idt_setup_early_traps(void);
 
 #ifdef CONFIG_X86_64
 extern void idt_setup_early_pf(void);
+extern void idt_setup_debugidt_traps(void);
 #else
 static inline void idt_setup_early_pf(void) { }
+static inline void idt_setup_debugidt_traps(void) { }
 #endif
 
 extern void idt_invalidate(void *addr);
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -68,6 +68,15 @@ static const __initdata struct idt_data
 static const __initdata struct idt_data early_pf_idts[] = {
INTG(X86_TRAP_PF,   page_fault),
 };
+
+/*
+ * Override for the debug_idt. Same as the default, but with interrupt
+ * stack set to DEFAULT_STACK (0). Required for NMI trap handling.
+ */
+static const __initdata struct idt_data dbg_idts[] = {
+   INTG(X86_TRAP_DB,   debug),
+   INTG(X86_TRAP_BP,   int3),
+};
 #endif
 
 /* Must be page-aligned because the real IDT is used in a fixmap. */
@@ -82,6 +91,10 @@ struct desc_ptr idt_descr __ro_after_ini
 /* No need to be aligned, but done to keep all IDTs defined the same way. */
 gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
 
+/*
+ * Override for the debug_idt. Same as the default, but with interrupt
+ * stack set to DEFAULT_STACK (0). Required for NMI trap handling.
+ */
 const struct desc_ptr debug_idt_descr = {
.size   = IDT_ENTRIES * 16 - 1,
.address= (unsigned long) debug_idt_table,
@@ -143,6 +156,16 @@ void __init idt_setup_early_pf(void)
idt_setup_from_table(idt_table, early_pf_idts,
 ARRAY_SIZE(early_pf_idts));
 }
+
+/**
+ * idt_setup_debugidt_traps - Initialize the debug idt table with debug traps
+ */
+void __init idt_setup_debugidt_traps(void)
+{
+   memcpy(&debug_idt_table, &idt_table, IDT_ENTRIES * 16);
+
+   idt_setup_from_table(debug_idt_table, dbg_idts, ARRAY_SIZE(dbg_idts));
+}
 #endif
 
 /**
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -990,9 +990,5 @@ void __init trap_init(void)
 
x86_init.irqs.trap_init();
 
-#ifdef CONFIG_X86_64
-   memcpy(&debug_idt_table, &idt_table, IDT_ENTRIES * 16);
-   set_nmi_gate(X86_TRAP_DB, &debug);
-   set_nmi_gate(X86_TRAP_BP, &int3);
-#endif
+   idt_setup_debugidt_traps();
 }




[patch V3 27/44] x86/ldttss: Cleanup 32bit descriptors

2017-08-28 Thread Thomas Gleixner
Like the IDT descriptors the LDT/TSS descriptors are pointlessly different
on 32 and 64 bit.

Unify them and get rid of the duplicated code.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/desc.h  |   26 +++---
 arch/x86/include/asm/desc_defs.h |   27 ---
 2 files changed, 15 insertions(+), 38 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -166,42 +166,22 @@ native_write_gdt_entry(struct desc_struc
memcpy(&gdt[entry], desc, size);
 }
 
-static inline void pack_descriptor(struct desc_struct *desc, unsigned long 
base,
-  unsigned long limit, unsigned char type,
-  unsigned char flags)
-{
-   desc->limit0= (u16) limit;
-   desc->base0 = (u16) base;
-   desc->base1 = (base >> 16) & 0xFF;
-   desc->type  = type & 0x0F;
-   desc->s = 0;
-   desc->dpl   = 0;
-   desc->p = 1;
-   desc->limit1= (limit >> 16) & 0xF;
-   desc->avl   = (flags >> 0) & 0x01;
-   desc->l = (flags >> 1) & 0x01;
-   desc->d = (flags >> 2) & 0x01;
-   desc->g = (flags >> 3) & 0x01;
-}
-
 static inline void set_tssldt_descriptor(void *d, unsigned long addr,
 unsigned type, unsigned size)
 {
-#ifdef CONFIG_X86_64
-   struct ldttss_desc64 *desc = d;
+   struct ldttss_desc *desc = d;
 
memset(desc, 0, sizeof(*desc));
 
-   desc->limit0= size & 0x;
+   desc->limit0= (u16) size;
desc->base0 = (u16) addr;
desc->base1 = (addr >> 16) & 0xFF;
desc->type  = type;
desc->p = 1;
desc->limit1= (size >> 16) & 0xF;
desc->base2 = (addr >> 24) & 0xFF;
+#ifdef CONFIG_X86_64
desc->base3 = (u32) (addr >> 32);
-#else
-   pack_descriptor((struct desc_struct *)d, addr, size, type, 0);
 #endif
 }
 
--- a/arch/x86/include/asm/desc_defs.h
+++ b/arch/x86/include/asm/desc_defs.h
@@ -49,24 +49,21 @@ enum {
DESCTYPE_S = 0x10,  /* !system */
 };
 
-/* LDT or TSS descriptor in the GDT. 16 bytes. */
-struct ldttss_desc64 {
-   u16 limit0;
-   u16 base0;
-   unsigned base1 : 8, type : 5, dpl : 2, p : 1;
-   unsigned limit1 : 4, zero0 : 3, g : 1, base2 : 8;
-   u32 base3;
-   u32 zero1;
-} __attribute__((packed));
-
+/* LDT or TSS descriptor in the GDT. */
+struct ldttss_desc {
+   u16 limit0;
+   u16 base0;
 
+   u16 base1 : 8, type : 5, dpl : 2, p : 1;
+   u16 limit1 : 4, zero0 : 3, g : 1, base2 : 8;
 #ifdef CONFIG_X86_64
-typedef struct ldttss_desc64 ldt_desc;
-typedef struct ldttss_desc64 tss_desc;
-#else
-typedef struct desc_struct ldt_desc;
-typedef struct desc_struct tss_desc;
+   u32 base3;
+   u32 zero1;
 #endif
+} __attribute__((packed));
+
+typedef struct ldttss_desc ldt_desc;
+typedef struct ldttss_desc tss_desc;
 
 struct idt_bits {
u16 ist : 3,




[patch V3 40/44] x86/idt: Move interrupt gate initialization to IDT code

2017-08-28 Thread Thomas Gleixner
Move the gate initialization from interrupt init to the IDT code so that all
IDT related operations are in a single place.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/idt.c |   18 ++
 arch/x86/kernel/irqinit.c |   18 --
 2 files changed, 18 insertions(+), 18 deletions(-)

--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -286,7 +286,25 @@ void __init idt_setup_debugidt_traps(voi
  */
 void __init idt_setup_apic_and_irq_gates(void)
 {
+   int i = FIRST_EXTERNAL_VECTOR;
+   void *entry;
+
idt_setup_from_table(idt_table, apic_idts, ARRAY_SIZE(apic_idts));
+
+   for_each_clear_bit_from(i, used_vectors, FIRST_SYSTEM_VECTOR) {
+   entry = irq_entries_start + 8 * (i - FIRST_EXTERNAL_VECTOR);
+   set_intr_gate(i, entry);
+   }
+
+   for_each_clear_bit_from(i, used_vectors, NR_VECTORS) {
+#ifdef CONFIG_X86_LOCAL_APIC
+   set_bit(i, used_vectors);
+   set_intr_gate(i, spurious_interrupt);
+#else
+   entry = irq_entries_start + 8 * (i - FIRST_EXTERNAL_VECTOR);
+   set_intr_gate(i, entry);
+#endif
+   }
 }
 
 /**
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -89,29 +89,11 @@ void __init init_IRQ(void)
 
 void __init native_init_IRQ(void)
 {
-   int i;
-
/* Execute any quirks before the call gates are initialised: */
x86_init.irqs.pre_vector_init();
 
idt_setup_apic_and_irq_gates();
 
-   /*
-* Cover the whole vector space, no vector can escape
-* us. (some of these will be overridden and become
-* 'special' SMP interrupts)
-*/
-   i = FIRST_EXTERNAL_VECTOR;
-   for_each_clear_bit_from(i, used_vectors, FIRST_SYSTEM_VECTOR) {
-   /* IA32_SYSCALL_VECTOR could be used in trap_init already. */
-   set_intr_gate(i, irq_entries_start +
-   8 * (i - FIRST_EXTERNAL_VECTOR));
-   }
-#ifdef CONFIG_X86_LOCAL_APIC
-   for_each_clear_bit_from(i, used_vectors, NR_VECTORS)
-   set_intr_gate(i, spurious_interrupt);
-#endif
-
if (!acpi_ioapic && !of_ioapic && nr_legacy_irqs())
setup_irq(2, );
 




Re: [PATCH v2 1/3] media: V3s: Add support for Allwinner CSI.

2017-08-28 Thread Yong
Hi Maxime,

On Fri, 25 Aug 2017 15:41:14 +0200
Maxime Ripard  wrote:

> Hi Yong,
> 
> On Wed, Aug 23, 2017 at 10:32:16AM +0800, Yong wrote:
> > > > > > +static int sun6i_graph_notify_complete(struct v4l2_async_notifier 
> > > > > > *notifier)
> > > > > > +{
> > > > > > +   struct sun6i_csi *csi =
> > > > > > +   container_of(notifier, struct sun6i_csi, 
> > > > > > notifier);
> > > > > > +   struct sun6i_graph_entity *entity;
> > > > > > +   int ret;
> > > > > > +
> > > > > > +   dev_dbg(csi->dev, "notify complete, all subdevs registered\n");
> > > > > > +
> > > > > > +   /* Create links for every entity. */
> > > > > > +   list_for_each_entry(entity, &csi->entities, list) {
> > > > > > +   ret = sun6i_graph_build_one(csi, entity);
> > > > > > +   if (ret < 0)
> > > > > > +   return ret;
> > > > > > +   }
> > > > > > +
> > > > > > +   /* Create links for video node. */
> > > > > > +   ret = sun6i_graph_build_video(csi);
> > > > > > +   if (ret < 0)
> > > > > > +   return ret;
> > > > > 
> > > > > Can you elaborate a bit on the difference between a node parsed with
> > > > > _graph_build_one and _graph_build_video? Can't you just store the
> > > > > remote sensor when you build the notifier, and reuse it here?
> > > > 
> > > > There may be many usecases:
> > > > 1. CSI->Sensor.
> > > > 2. CSI->MIPI->Sensor.
> > > > 3. CSI->FPGA->Sensor1
> > > > ->Sensor2.
> > > > The FPGA may be some other video processor. FPGA, MIPI and Sensor can be
> > > > registered as v4l2 subdevs. We do not care about their driver code,
> > > > but they should be linked together here.
> > > > 
> > > > So, the _graph_build_one is used to link CSI port and subdevs. 
> > > > _graph_build_video is used to link CSI port and video node.
> > > 
> > > So the graph_build_one is for the two first cases, and the
> > > _build_video for the latter case?
> > 
> > No. 
> > The _graph_build_one is used to link the subdevs found in the device 
> > tree. _build_video is used to link the closest subdev to video node.
> > The video node is created in the driver, so the method to get its pad is
> > different from that of the subdevs.
> 
> Sorry for being slow here, I'm still not sure I get it.
> 
> In summary, both the sun6i_graph_build_one and sun6i_graph_build_video
> will iterate over each endpoint, will retrieve the remote entity, and
> will create the media link between the CSI pad and the remote pad.
> 
> As far as I can see, there's basically two things that
> sun6i_graph_build_one does that sun6i_graph_build_video doesn't:
>   - It skips all the links that would connect to one of the CSI sinks
>   - It skips all the links that would connect to a remote node that is
> equal to the CSI node.
> 
> I assume the latter is because you want to avoid going in an infinite
> loop when you would follow one of the CSI endpoint (going to the
> sensor), and then follow back the same link in the opposite
> direction. Right?

Not exactly. But anyway, some code is truly redundant here. I will
make some improvements.

> 
> I'm confused about the first one though. All the pads you create in
> your driver are sink pads, so wouldn't that skip all the pads of the
> CSI nodes?
> 
> Also, why do you iterate on all the CSI endpoints, when there's only
> of them? You want to anticipate the future binding for devices with
> multiple channels?
> 
> > > 
> > > If so, you should take a look at the last iteration of the
> > > subnotifiers rework by Niklas Söderlund (v4l2-async: add subnotifier
> > > registration for subdevices).
> > > 
> > > It allows subdevs to register notifiers, and you don't have to build
> > > the graph from the video device, each device and subdev can only care
> > > about what's next in the pipeline, but not really what's behind it.
> > > 
> > > That would mean in your case that you can only deal with your single
> > > CSI pad, and whatever subdev driver will use it care about its own.
> > 
> > Do you mean the subdevs create pad link in the notifier registered by
> > themself ?
> 
> Yes.
> 
> Thanks!
> Maxime
> 
> -- 
> Maxime Ripard, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com


Thanks,
Yong


Re: [PATCH] Kbuild: enable -Wunused-macros warning for "make W=1"

2017-08-28 Thread Johannes Thumshirn
On Fri, Aug 25, 2017 at 01:19:39AM +0900, Masahiro Yamada wrote:
> This makes W=1 too noisy.
> 
> For example, drivers often define unused register macros
> for completeness.  I do not think it is too bad in my opinion.
> 
> Perhaps, should it be moved to warning-2 ?
> 

Sure, I'll send a v2 in a few days.

Thanks,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


[patch V3 28/44] x86/idt: Create file for IDT related code

2017-08-28 Thread Thomas Gleixner
IDT related code lives in different places. Create a new source file to
hold it.

Move the idt_tables and descriptors to it for a start. Follow up patches
will gradually move more code over.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/Makefile |2 +-
 arch/x86/kernel/cpu/common.c |9 -
 arch/x86/kernel/idt.c|   26 ++
 arch/x86/kernel/traps.c  |6 --
 4 files changed, 27 insertions(+), 16 deletions(-)

--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -42,7 +42,7 @@ CFLAGS_irq.o := -I$(src)/../include/asm/
 
 obj-y  := process_$(BITS).o signal.o
 obj-$(CONFIG_COMPAT)   += signal_compat.o
-obj-y  += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
+obj-y  += traps.o idt.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
 obj-y  += time.o ioport.o dumpstack.o nmi.o
 obj-$(CONFIG_MODIFY_LDT_SYSCALL)   += ldt.o
 obj-y  += setup.o x86_init.o i8259.o irqinit.o jump_label.o
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1289,15 +1289,6 @@ static __init int setup_disablecpuid(cha
 __setup("clearcpuid=", setup_disablecpuid);
 
 #ifdef CONFIG_X86_64
-struct desc_ptr idt_descr __ro_after_init = {
-   .size = NR_VECTORS * 16 - 1,
-   .address = (unsigned long) idt_table,
-};
-const struct desc_ptr debug_idt_descr = {
-   .size = NR_VECTORS * 16 - 1,
-   .address = (unsigned long) debug_idt_table,
-};
-
 DEFINE_PER_CPU_FIRST(union irq_stack_union,
 irq_stack_union) __aligned(PAGE_SIZE) __visible;
 
--- /dev/null
+++ b/arch/x86/kernel/idt.c
@@ -0,0 +1,26 @@
+/*
+ * Interrupt descriptor table related code
+ *
+ * This file is licensed under the GPL V2
+ */
+#include 
+
+#include 
+
+/* Must be page-aligned because the real IDT is used in a fixmap. */
+gate_desc idt_table[IDT_ENTRIES] __page_aligned_bss;
+
+#ifdef CONFIG_X86_64
+/* No need to be aligned, but done to keep all IDTs defined the same way. */
+gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
+
+struct desc_ptr idt_descr __ro_after_init = {
+   .size   = IDT_ENTRIES * 16 - 1,
+   .address= (unsigned long) idt_table,
+};
+
+const struct desc_ptr debug_idt_descr = {
+   .size   = IDT_ENTRIES * 16 - 1,
+   .address= (unsigned long) debug_idt_table,
+};
+#endif
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -65,18 +65,12 @@
 #include 
 #include 
 #include 
-
-/* No need to be aligned, but done to keep all IDTs defined the same way. */
-gate_desc debug_idt_table[NR_VECTORS] __page_aligned_bss;
 #else
 #include 
 #include 
 #include 
 #endif
 
-/* Must be page-aligned because the real IDT is used in a fixmap. */
-gate_desc idt_table[NR_VECTORS] __page_aligned_bss;
-
 DECLARE_BITMAP(used_vectors, NR_VECTORS);
 
 static inline void cond_local_irq_enable(struct pt_regs *regs)




[patch V3 12/44] x86/irqwork: Get rid of duplicated tracing interrupt code

2017-08-28 Thread Thomas Gleixner
Two NOP5 are a reasonable tradeoff to avoid duplicated code and the
requirement to switch the IDT.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/hw_irq.h |2 +-
 arch/x86/kernel/irq_work.c|   16 ++--
 2 files changed, 3 insertions(+), 15 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -48,13 +48,13 @@ extern asmlinkage void call_function_sin
 
 #ifdef CONFIG_TRACING
 /* Interrupt handlers registered during init_IRQ */
-extern void trace_irq_work_interrupt(void);
 extern void trace_thermal_interrupt(void);
 extern void trace_reschedule_interrupt(void);
 extern void trace_threshold_interrupt(void);
 extern void trace_deferred_error_interrupt(void);
 extern void trace_call_function_interrupt(void);
 extern void trace_call_function_single_interrupt(void);
+#define trace_irq_work_interrupt irq_work_interrupt
 #define trace_error_interrupt error_interrupt
 #define trace_spurious_interrupt spurious_interrupt
 #define trace_x86_platform_ipi x86_platform_ipi
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -11,24 +11,12 @@
 #include 
 #include 
 
-static inline void __smp_irq_work_interrupt(void)
-{
-   inc_irq_stat(apic_irq_work_irqs);
-   irq_work_run();
-}
-
 __visible void __irq_entry smp_irq_work_interrupt(struct pt_regs *regs)
 {
ipi_entering_ack_irq();
-   __smp_irq_work_interrupt();
-   exiting_irq();
-}
-
-__visible void __irq_entry smp_trace_irq_work_interrupt(struct pt_regs *regs)
-{
-   ipi_entering_ack_irq();
trace_irq_work_entry(IRQ_WORK_VECTOR);
-   __smp_irq_work_interrupt();
+   inc_irq_stat(apic_irq_work_irqs);
+   irq_work_run();
trace_irq_work_exit(IRQ_WORK_VECTOR);
exiting_irq();
 }




[patch V3 17/44] x86/idt: Cleanup the i386 low level entry macros

2017-08-28 Thread Thomas Gleixner
Some of the entry function defines for i386 were explicitly using the
BUILD_INTERRUPT3() macro to prevent the extra trace entry from being added
via BUILD_INTERRUPT(). Now that the trace cruft is gone, the file can be
cleaned up and converted to use BUILD_INTERRUPT(), which avoids the ugly
line breaks.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/entry_arch.h |   14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -13,20 +13,16 @@
 BUILD_INTERRUPT(reschedule_interrupt,RESCHEDULE_VECTOR)
 BUILD_INTERRUPT(call_function_interrupt,CALL_FUNCTION_VECTOR)
 BUILD_INTERRUPT(call_function_single_interrupt,CALL_FUNCTION_SINGLE_VECTOR)
-BUILD_INTERRUPT3(irq_move_cleanup_interrupt, IRQ_MOVE_CLEANUP_VECTOR,
-smp_irq_move_cleanup_interrupt)
-BUILD_INTERRUPT3(reboot_interrupt, REBOOT_VECTOR, smp_reboot_interrupt)
+BUILD_INTERRUPT(irq_move_cleanup_interrupt, IRQ_MOVE_CLEANUP_VECTOR)
+BUILD_INTERRUPT(reboot_interrupt, REBOOT_VECTOR)
 #endif
 
 BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
 
 #ifdef CONFIG_HAVE_KVM
-BUILD_INTERRUPT3(kvm_posted_intr_ipi, POSTED_INTR_VECTOR,
-smp_kvm_posted_intr_ipi)
-BUILD_INTERRUPT3(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR,
-smp_kvm_posted_intr_wakeup_ipi)
-BUILD_INTERRUPT3(kvm_posted_intr_nested_ipi, POSTED_INTR_NESTED_VECTOR,
-smp_kvm_posted_intr_nested_ipi)
+BUILD_INTERRUPT(kvm_posted_intr_ipi, POSTED_INTR_VECTOR)
+BUILD_INTERRUPT(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR)
+BUILD_INTERRUPT(kvm_posted_intr_nested_ipi, POSTED_INTR_NESTED_VECTOR)
 #endif
 
 /*




[patch V3 07/44] x86/traps: Simplify pagefault tracing logic

2017-08-28 Thread Thomas Gleixner
Make use of the new irqvector tracing static key and remove the duplicated
trace_do_pagefault() implementation.

If irq vector tracing is disabled, then the overhead of this is a single
NOP5, which is a reasonable tradeoff to avoid duplicated code and the
unholy macro mess.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/entry/entry_32.S|8 ---
 arch/x86/entry/entry_64.S|   13 ---
 arch/x86/include/asm/traps.h |   10 
 arch/x86/kernel/kvm.c|2 -
 arch/x86/mm/fault.c  |   49 +++
 5 files changed, 16 insertions(+), 66 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -891,14 +891,6 @@ BUILD_INTERRUPT3(hyperv_callback_vector,
 
 #endif /* CONFIG_HYPERV */
 
-#ifdef CONFIG_TRACING
-ENTRY(trace_page_fault)
-   ASM_CLAC
-   pushl   $trace_do_page_fault
-   jmp common_exception
-END(trace_page_fault)
-#endif
-
 ENTRY(page_fault)
ASM_CLAC
pushl   $do_page_fault
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -918,17 +918,6 @@ ENTRY(\sym)
 END(\sym)
 .endm
 
-#ifdef CONFIG_TRACING
-.macro trace_idtentry sym do_sym has_error_code:req
-idtentry trace(\sym) trace(\do_sym) has_error_code=\has_error_code
-idtentry \sym \do_sym has_error_code=\has_error_code
-.endm
-#else
-.macro trace_idtentry sym do_sym has_error_code:req
-idtentry \sym \do_sym has_error_code=\has_error_code
-.endm
-#endif
-
 idtentry divide_error  do_divide_error 
has_error_code=0
 idtentry overflow  do_overflow 
has_error_code=0
 idtentry boundsdo_bounds   
has_error_code=0
@@ -1096,7 +1085,7 @@ idtentry xen_stack_segmentdo_stack_segm
 #endif
 
 idtentry general_protectiondo_general_protection   has_error_code=1
-trace_idtentry page_fault  do_page_fault   has_error_code=1
+idtentry page_faultdo_page_fault   has_error_code=1
 
 #ifdef CONFIG_KVM_GUEST
 idtentry async_page_fault  do_async_page_fault has_error_code=1
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -39,7 +39,6 @@ asmlinkage void machine_check(void);
 asmlinkage void simd_coprocessor_error(void);
 
 #ifdef CONFIG_TRACING
-asmlinkage void trace_page_fault(void);
 #define trace_stack_segment stack_segment
 #define trace_divide_error divide_error
 #define trace_bounds bounds
@@ -54,6 +53,7 @@ asmlinkage void trace_page_fault(void);
 #define trace_alignment_check alignment_check
 #define trace_simd_coprocessor_error simd_coprocessor_error
 #define trace_async_page_fault async_page_fault
+#define trace_page_fault page_fault
 #endif
 
 dotraplinkage void do_divide_error(struct pt_regs *, long);
@@ -74,14 +74,6 @@ asmlinkage struct pt_regs *sync_regs(str
 #endif
 dotraplinkage void do_general_protection(struct pt_regs *, long);
 dotraplinkage void do_page_fault(struct pt_regs *, unsigned long);
-#ifdef CONFIG_TRACING
-dotraplinkage void trace_do_page_fault(struct pt_regs *, unsigned long);
-#else
-static inline void trace_do_page_fault(struct pt_regs *regs, unsigned long 
error)
-{
-   do_page_fault(regs, error);
-}
-#endif
 dotraplinkage void do_spurious_interrupt_bug(struct pt_regs *, long);
 dotraplinkage void do_coprocessor_error(struct pt_regs *, long);
 dotraplinkage void do_alignment_check(struct pt_regs *, long);
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -263,7 +263,7 @@ do_async_page_fault(struct pt_regs *regs
 
switch (kvm_read_and_reset_pf_reason()) {
default:
-   trace_do_page_fault(regs, error_code);
+   do_page_fault(regs, error_code);
break;
case KVM_PV_REASON_PAGE_NOT_PRESENT:
/* page is swapped out by the host. */
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1254,10 +1254,6 @@ static inline bool smap_violation(int er
  * This routine handles page faults.  It determines the address,
  * and the problem, and then passes it off to one of the appropriate
  * routines.
- *
- * This function must have noinline because both callers
- * {,trace_}do_page_fault() have notrace on. Having this an actual function
- * guarantees there's a function trace entry.
  */
 static noinline void
 __do_page_fault(struct pt_regs *regs, unsigned long error_code,
@@ -1490,27 +1486,6 @@ static noinline void
 }
 NOKPROBE_SYMBOL(__do_page_fault);
 
-dotraplinkage void notrace
-do_page_fault(struct pt_regs *regs, unsigned long error_code)
-{
-   unsigned long address = read_cr2(); /* Get the faulting address */
-   enum ctx_state prev_state;
-
-   /*
-* We must have this function tagged with __kprobes, notrace and call
-* read_cr2() before calling anything else. To avoid calling any kind
-* of tracing machinery before we've observed the CR2 value.
-*
-* 

[patch V3 08/44] x86/apic: Remove the duplicated tracing version of local_timer_interrupt

2017-08-28 Thread Thomas Gleixner
The two NOP5s are noise compared to the rest of the work done by the timer
interrupt, and modern CPUs are pretty good at optimizing NOPs.

Get rid of the interrupt handler duplication and move the tracepoints into
the regular handler.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Steven Rostedt (VMware) 

---
 arch/x86/include/asm/hw_irq.h |2 +-
 arch/x86/kernel/apic/apic.c   |   19 ---
 2 files changed, 1 insertion(+), 20 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -48,7 +48,6 @@ extern asmlinkage void call_function_sin
 
 #ifdef CONFIG_TRACING
 /* Interrupt handlers registered during init_IRQ */
-extern void trace_apic_timer_interrupt(void);
 extern void trace_x86_platform_ipi(void);
 extern void trace_error_interrupt(void);
 extern void trace_irq_work_interrupt(void);
@@ -59,6 +58,7 @@ extern void trace_threshold_interrupt(vo
 extern void trace_deferred_error_interrupt(void);
 extern void trace_call_function_interrupt(void);
 extern void trace_call_function_single_interrupt(void);
+#define trace_apic_timer_interrupt apic_timer_interrupt
 #define trace_irq_move_cleanup_interrupt  irq_move_cleanup_interrupt
 #define trace_reboot_interrupt  reboot_interrupt
 #define trace_kvm_posted_intr_ipi kvm_posted_intr_ipi
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1038,25 +1038,6 @@ static void local_apic_timer_interrupt(v
 * interrupt lock, which is the WrongThing (tm) to do.
 */
entering_ack_irq();
-   local_apic_timer_interrupt();
-   exiting_irq();
-
-   set_irq_regs(old_regs);
-}
-
-__visible void __irq_entry smp_trace_apic_timer_interrupt(struct pt_regs *regs)
-{
-   struct pt_regs *old_regs = set_irq_regs(regs);
-
-   /*
-* NOTE! We'd better ACK the irq immediately,
-* because timer handling can be slow.
-*
-* update_process_times() expects us to have done irq_enter().
-* Besides, if we don't timer interrupts ignore the global
-* interrupt lock, which is the WrongThing (tm) to do.
-*/
-   entering_ack_irq();
trace_local_timer_entry(LOCAL_TIMER_VECTOR);
local_apic_timer_interrupt();
trace_local_timer_exit(LOCAL_TIMER_VECTOR);




[PATCH 0/3] constify ux500 clk_ops.

2017-08-28 Thread Arvind Yadav
clk_ops are not supposed to change at runtime. All functions
working with clk_ops provided by <linux/clk-provider.h> work
with const clk_ops, so mark the non-const clk_ops as const.

Here, Function "clk_reg_prcc" is used to initialized clk_init_data.
clk_init_data is working with const clk_ops. So make clk_reg_prcc
non-const clk_ops argument as const.
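
For reference, a minimal sketch of the pattern these patches apply; the
example_* identifiers are illustrative placeholders and not taken from the
ux500 driver, only struct clk_ops, struct clk_init_data and clk_register()
come from <linux/clk-provider.h>:

#include <linux/clk-provider.h>
#include <linux/err.h>
#include <linux/slab.h>

static int example_enable(struct clk_hw *hw) { return 0; }
static void example_disable(struct clk_hw *hw) { }

/* the ops table can be const because nothing modifies it at runtime */
static const struct clk_ops example_ops = {
	.enable  = example_enable,
	.disable = example_disable,
};

/* registration helpers can take "const struct clk_ops *" because
 * clk_init_data.ops is itself declared const
 */
static struct clk *example_register(struct device *dev, const char *name,
				    const struct clk_ops *ops)
{
	struct clk_init_data init = { .name = name, .ops = ops };
	struct clk_hw *hw;

	hw = kzalloc(sizeof(*hw), GFP_KERNEL);
	if (!hw)
		return ERR_PTR(-ENOMEM);
	hw->init = &init;
	return clk_register(dev, hw);
}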

Arvind Yadav (3):
  [PATCH 1/3] clk: ux500: prcmu: constify clk_ops.
  [PATCH 2/3] clk: ux500: sysctrl: constify clk_ops.
  [PATCH 3/3] clk: ux500: prcc: constify clk_ops.

 drivers/clk/ux500/clk-prcc.c|  6 +++---
 drivers/clk/ux500/clk-prcmu.c   | 14 +++---
 drivers/clk/ux500/clk-sysctrl.c |  8 
 3 files changed, 14 insertions(+), 14 deletions(-)

-- 
1.9.1



[patch V3 10/44] x86/irq: Get rid of duplicated trace_x86_platform_ipi() code

2017-08-28 Thread Thomas Gleixner
Two NOP5s are really a good tradeoff vs. the unholy IDT switching mess,
which duplicates code all over the place.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Steven Rostedt (VMware) 
---
 arch/x86/include/asm/hw_irq.h |2 +-
 arch/x86/kernel/irq.c |   25 +
 2 files changed, 6 insertions(+), 21 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -48,7 +48,6 @@ extern asmlinkage void call_function_sin
 
 #ifdef CONFIG_TRACING
 /* Interrupt handlers registered during init_IRQ */
-extern void trace_x86_platform_ipi(void);
 extern void trace_error_interrupt(void);
 extern void trace_irq_work_interrupt(void);
 extern void trace_spurious_interrupt(void);
@@ -58,6 +57,7 @@ extern void trace_threshold_interrupt(vo
 extern void trace_deferred_error_interrupt(void);
 extern void trace_call_function_interrupt(void);
 extern void trace_call_function_single_interrupt(void);
+#define trace_x86_platform_ipi x86_platform_ipi
 #define trace_apic_timer_interrupt apic_timer_interrupt
 #define trace_irq_move_cleanup_interrupt  irq_move_cleanup_interrupt
 #define trace_reboot_interrupt  reboot_interrupt
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -262,20 +262,16 @@ u64 arch_irq_stat(void)
 /*
  * Handler for X86_PLATFORM_IPI_VECTOR.
  */
-void __smp_x86_platform_ipi(void)
-{
-   inc_irq_stat(x86_platform_ipis);
-
-   if (x86_platform_ipi_callback)
-   x86_platform_ipi_callback();
-}
-
 __visible void __irq_entry smp_x86_platform_ipi(struct pt_regs *regs)
 {
struct pt_regs *old_regs = set_irq_regs(regs);
 
entering_ack_irq();
-   __smp_x86_platform_ipi();
+   trace_x86_platform_ipi_entry(X86_PLATFORM_IPI_VECTOR);
+   inc_irq_stat(x86_platform_ipis);
+   if (x86_platform_ipi_callback)
+   x86_platform_ipi_callback();
+   trace_x86_platform_ipi_exit(X86_PLATFORM_IPI_VECTOR);
exiting_irq();
set_irq_regs(old_regs);
 }
@@ -334,17 +330,6 @@ EXPORT_SYMBOL_GPL(kvm_set_posted_intr_wa
 }
 #endif
 
-__visible void __irq_entry smp_trace_x86_platform_ipi(struct pt_regs *regs)
-{
-   struct pt_regs *old_regs = set_irq_regs(regs);
-
-   entering_ack_irq();
-   trace_x86_platform_ipi_entry(X86_PLATFORM_IPI_VECTOR);
-   __smp_x86_platform_ipi();
-   trace_x86_platform_ipi_exit(X86_PLATFORM_IPI_VECTOR);
-   exiting_irq();
-   set_irq_regs(old_regs);
-}
 
 #ifdef CONFIG_HOTPLUG_CPU
 




[PATCH 3/3] clk: ux500: prcc: constify clk_ops.

2017-08-28 Thread Arvind Yadav
clk_ops are not supposed to change at runtime. All functions
working with clk_ops provided by <linux/clk-provider.h> work
with const clk_ops, so mark the non-const clk_ops as const.

Here, Function "clk_reg_prcc" is used to initialized clk_init_data.
clk_init_data is working with const clk_ops. So make clk_reg_prcc
non-const clk_ops argument as const.

Signed-off-by: Arvind Yadav 
---
 drivers/clk/ux500/clk-prcc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/ux500/clk-prcc.c b/drivers/clk/ux500/clk-prcc.c
index 0e95076..f505927 100644
--- a/drivers/clk/ux500/clk-prcc.c
+++ b/drivers/clk/ux500/clk-prcc.c
@@ -79,13 +79,13 @@ static int clk_prcc_is_enabled(struct clk_hw *hw)
return clk->is_enabled;
 }
 
-static struct clk_ops clk_prcc_pclk_ops = {
+static const struct clk_ops clk_prcc_pclk_ops = {
.enable = clk_prcc_pclk_enable,
.disable = clk_prcc_pclk_disable,
.is_enabled = clk_prcc_is_enabled,
 };
 
-static struct clk_ops clk_prcc_kclk_ops = {
+static const struct clk_ops clk_prcc_kclk_ops = {
.enable = clk_prcc_kclk_enable,
.disable = clk_prcc_kclk_disable,
.is_enabled = clk_prcc_is_enabled,
@@ -96,7 +96,7 @@ static struct clk *clk_reg_prcc(const char *name,
resource_size_t phy_base,
u32 cg_sel,
unsigned long flags,
-   struct clk_ops *clk_prcc_ops)
+   const struct clk_ops *clk_prcc_ops)
 {
struct clk_prcc *clk;
struct clk_init_data clk_prcc_init;
-- 
1.9.1



[PATCH 2/3] clk: ux500: sysctrl: constify clk_ops.

2017-08-28 Thread Arvind Yadav
clk_ops are not supposed to change at runtime. All functions
working with clk_ops provided by <linux/clk-provider.h> work
with const clk_ops, so mark the non-const clk_ops as const.

Here, Function "clk_reg_sysctrl" is used to initialized clk_init_data.
clk_init_data is working with const clk_ops. So make clk_reg_sysctrl
non-const clk_ops argument as const.

Signed-off-by: Arvind Yadav 
---
 drivers/clk/ux500/clk-sysctrl.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/ux500/clk-sysctrl.c b/drivers/clk/ux500/clk-sysctrl.c
index 266ddea..8a4e93c 100644
--- a/drivers/clk/ux500/clk-sysctrl.c
+++ b/drivers/clk/ux500/clk-sysctrl.c
@@ -98,18 +98,18 @@ static u8 clk_sysctrl_get_parent(struct clk_hw *hw)
return clk->parent_index;
 }
 
-static struct clk_ops clk_sysctrl_gate_ops = {
+static const struct clk_ops clk_sysctrl_gate_ops = {
.prepare = clk_sysctrl_prepare,
.unprepare = clk_sysctrl_unprepare,
 };
 
-static struct clk_ops clk_sysctrl_gate_fixed_rate_ops = {
+static const struct clk_ops clk_sysctrl_gate_fixed_rate_ops = {
.prepare = clk_sysctrl_prepare,
.unprepare = clk_sysctrl_unprepare,
.recalc_rate = clk_sysctrl_recalc_rate,
 };
 
-static struct clk_ops clk_sysctrl_set_parent_ops = {
+static const struct clk_ops clk_sysctrl_set_parent_ops = {
.set_parent = clk_sysctrl_set_parent,
.get_parent = clk_sysctrl_get_parent,
 };
@@ -124,7 +124,7 @@ static struct clk *clk_reg_sysctrl(struct device *dev,
unsigned long rate,
unsigned long enable_delay_us,
unsigned long flags,
-   struct clk_ops *clk_sysctrl_ops)
+   const struct clk_ops *clk_sysctrl_ops)
 {
struct clk_sysctrl *clk;
struct clk_init_data clk_sysctrl_init;
-- 
1.9.1



Re: [PATCH v4 00/10] make L2's kvm-clock stable, get rid of pvclock_gtod_copy in KVM

2017-08-28 Thread Denis Plotnikov



On 24.08.2017 11:00, Paolo Bonzini wrote:

On 23/08/2017 18:02, Paolo Bonzini wrote:


More duct tape would have been just:

-   if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
+   mode = READ_ONCE(pvclock_gtod_data.clock.vclock_mode);
+   if (mode != VCLOCK_TSC &&
+   (mode != VCLOCK_PVCLOCK || !pvclock_nested_virt_magic()))
return false;

-   return do_realtime(ts, cycle_now) == VCLOCK_TSC;
+   switch (mode) {
+   case VCLOCK_TSC:
+   return do_realtime_tsc(ts, cycle_now);
+   case VCLOCK_PVCLOCK:
+   return do_realtime_pvclock(ts, cycle_now);
+   }

Nested virtualization does need a clocksource change notifier on top,
but we can cross that bridge later.  Maybe Denis can post just those
patches to begin with.


For what it's worth, this is all that's needed (with patches 1-2-3-4-5-7)
to support kvmclock on top of Hyper-V clock.  It's trivial.

Even if we could add paravirtualization magic to KVM live migration, we
certainly couldn't do that for other hypervisors.

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 5b882cc0c0e9..3bab935b021a 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -46,10 +46,24 @@ static u64 read_hv_clock_tsc(struct clocksource *arg)
return current_tick;
  }
  
+static bool read_hv_clock_tsc_with_stamp(struct clocksource *arg,

+u64 *cycles, u64 *cycles_stamp)
+{
+   *cycles = __hv_read_tsc_page(tsc_pg, cycles_stamp);
+
+   if (*cycles == U64_MAX) {
+   *cycles = rdmsrl(HV_X64_MSR_TIME_REF_COUNT);
+   return false;
+   }
+
+   return true;
+}
+
  static struct clocksource hyperv_cs_tsc = {
.name   = "hyperv_clocksource_tsc_page",
.rating = 400,
.read   = read_hv_clock_tsc,
+   .read_with_stamp = read_hv_clock_tsc_with_stamp,
.mask   = CLOCKSOURCE_MASK(64),
.flags  = CLOCK_SOURCE_IS_CONTINUOUS,
  };
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 2b58c8c1eeaa..5aff66e9fff7 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -176,9 +176,9 @@ void hyperv_cleanup(void);
  #endif
  #ifdef CONFIG_HYPERV_TSCPAGE
  struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
-static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
+static inline u64 __hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg, 
u64 *cur_tsc)
  {
-   u64 scale, offset, cur_tsc;
+   u64 scale, offset;
u32 sequence;
  
  	/*

@@ -209,7 +209,7 @@ static inline u64 hv_read_tsc_page(const struct 
ms_hyperv_tsc_page *tsc_pg)
  
  		scale = READ_ONCE(tsc_pg->tsc_scale);

offset = READ_ONCE(tsc_pg->tsc_offset);
-   cur_tsc = rdtsc_ordered();
+   *cur_tsc = rdtsc_ordered();
  
  		/*

 * Make sure we read sequence after we read all other values
@@ -219,9 +219,14 @@ static inline u64 hv_read_tsc_page(const struct 
ms_hyperv_tsc_page *tsc_pg)
  
  	} while (READ_ONCE(tsc_pg->tsc_sequence) != sequence);
  
-	return mul_u64_u64_shr(cur_tsc, scale, 64) + offset;

+   return mul_u64_u64_shr(*cur_tsc, scale, 64) + offset;
  }
  
+static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)

+{
+   u64 cur_tsc;
+   return __hv_read_tsc_page(tsc_pg, &cur_tsc);
+}
  #else
  static inline struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
  {


Denis, could you try redoing patch 7 to use the pvclock_gtod_notifier
instead of the new one you're adding, and only send that first part?  I
think it's a worthwhile cleanup anyway, so let's start with that.

Paolo


Ok, I'll do that

Denis


Re: [patch 10/41] x86/apic: Remove the duplicated tracing versions of interrupts

2017-08-28 Thread Peter Zijlstra
On Fri, Aug 25, 2017 at 11:49:47AM -0400, Steven Rostedt wrote:
> On Fri, 25 Aug 2017 12:31:13 +0200
> Thomas Gleixner  wrote:
> 
> > The error and the spurious interrupt are really rare events and not at all
> > so performance sensitive that two NOP5s can not be tolerated when tracing
> > is disabled.
> 
> Just a note. I'm sure if we disassembled it, it may be a little more
> work done than just two NOPs, as parameter passing to the tracepoints
> sometimes leak out of the static jump block. It's moot on this patch,
> but other irqs with fast paths may need to be looked at.

Is that something we can fix with the trace macros?

They have a general shape of:

#define trace_foo(args...)
if (static_branch_unlikely(&foo_enabled)) {
__trace_foo(args...);
}

Right? And I suppose I see why the compiler would want to sometimes lift
stuff out of the branch block, but we'd really like it not to do that.
Would putting a barrier() in front of __trace_foo() help?
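
For illustration only (placeholder names, not the real tracepoint internals),
a compile-level sketch of what "a barrier() in front of __trace_foo()" could
look like:

#include <linux/jump_label.h>
#include <linux/compiler.h>

DEFINE_STATIC_KEY_FALSE(foo_enabled);		/* placeholder static key */
extern void __trace_foo(unsigned long arg);	/* placeholder slow path */

#define trace_foo(arg)						\
do {								\
	if (static_branch_unlikely(&foo_enabled)) {		\
		barrier();	/* discourage hoisting arg setup */ \
		__trace_foo(arg);				\
	}							\
} while (0)

Whether this actually stops the compiler from lifting the argument
computation out of the unlikely block is exactly the open question above.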


linux-next: build failure after merge of the scsi tree

2017-08-28 Thread Stephen Rothwell
Hi James,

After merging the scsi tree, today's linux-next build (sparc defconfig)
failed like this:

drivers/scsi/qlogicpti.c:1285:27: error: initialization from incompatible 
pointer type [-Werror=incompatible-pointer-types]
  .eh_host_reset_handler = qlogicpti_reset,
   ^

Caused by commit

  af167bc42d86 ("scsi: qlogicpti: move bus reset to host reset")

I have reverted that commit for today.
-- 
Cheers,
Stephen Rothwell


Re: [PATCH 2/5] mmc: sdhci: Add quirk to indicate MMC_RSP_136 has CRC

2017-08-28 Thread Adrian Hunter
On 21/08/17 10:41, Kishon Vijay Abraham I wrote:
> TI's implementation of sdhci controller used in DRA7 SoC's has
> CRC in responses with length 136 bits. Add quirk to indicate
> the controller has CRC in MMC_RSP_136. If this quirk is
> set sdhci library shouldn't shift the response present in
> SDHCI_RESPONSE register.
> 
> Signed-off-by: Kishon Vijay Abraham I 

Acked-by: Adrian Hunter 

> ---
>  drivers/mmc/host/sdhci.c | 3 +++
>  drivers/mmc/host/sdhci.h | 2 ++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index ba639b7851cb..9c8d7428df3c 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -1182,6 +1182,9 @@ static void sdhci_read_rsp_136(struct sdhci_host *host, 
> struct mmc_command *cmd)
>   cmd->resp[i] = sdhci_readl(host, reg);
>   }
>  
> + if (host->quirks2 & SDHCI_QUIRK2_RSP_136_HAS_CRC)
> + return;
> +
>   /* CRC is stripped so we need to do some shifting */
>   for (i = 0; i < 4; i++) {
>   cmd->resp[i] <<= 8;
> diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
> index 399edc681623..54bc444c317f 100644
> --- a/drivers/mmc/host/sdhci.h
> +++ b/drivers/mmc/host/sdhci.h
> @@ -435,6 +435,8 @@ struct sdhci_host {
>  #define SDHCI_QUIRK2_ACMD23_BROKEN   (1<<14)
>  /* Broken Clock divider zero in controller */
>  #define SDHCI_QUIRK2_CLOCK_DIV_ZERO_BROKEN   (1<<15)
> +/* Controller has CRC in 136 bit Command Response */
> +#define SDHCI_QUIRK2_RSP_136_HAS_CRC (1<<16)
>  
>   int irq;/* Device IRQ */
>   void __iomem *ioaddr;   /* Mapped address */
> 



[PATCH 4.9 16/84] ipv6: repair fib6 tree in failure case

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Wei Wang 


[ Upstream commit 348a4002729ccab8b888b38cbc099efa2f2a2036 ]

In fib6_add(), it is possible that fib6_add_1() picks an intermediate
node and sets the node's fn->leaf to NULL in order to add this new
route. However, if fib6_add_rt2node() fails to add the new
route for some reason, fn->leaf will be left as NULL and could
potentially cause crash when fn->leaf is accessed in fib6_locate().
This patch makes sure fib6_repair_tree() is called to properly repair
fn->leaf in the above failure case.

Here is the syzkaller reported general protection fault in fib6_locate:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault:  [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 40937 Comm: syz-executor3 Not tainted
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
task: 8801d7d64100 ti: 8801d01a task.ti: 8801d01a
RIP: 0010:[]  [] __ipv6_prefix_equal64_half 
include/net/ipv6.h:475 [inline]
RIP: 0010:[]  [] ipv6_prefix_equal 
include/net/ipv6.h:492 [inline]
RIP: 0010:[]  [] fib6_locate_1 
net/ipv6/ip6_fib.c:1210 [inline]
RIP: 0010:[]  [] fib6_locate+0x281/0x3c0 
net/ipv6/ip6_fib.c:1233
RSP: 0018:8801d01a36a8  EFLAGS: 00010202
RAX: 0020 RBX: 8801bc790e00 RCX: c90002983000
RDX: 1219 RSI: 8801d01a37a0 RDI: 0100
RBP: 8801d01a36f0 R08: 00ff R09: 
R10: 0003 R11:  R12: 0001
R13: dc00 R14: 8801d01a37a0 R15: 
FS:  7f6afd68c700() GS:8801db40() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 004c6340 CR3: ba41f000 CR4: 001426f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Stack:
 8801d01a37a8 8801d01a3780 ed003a0346f5 000c82a23ea0
 8800b7bd7700 8801d01a3780 8800b6a1c940 82a23ea0
 8801d01a3920 8801d01a3748 82a223d6 8801d7d64988
Call Trace:
 [] ip6_route_del+0x106/0x570 net/ipv6/route.c:2109
 [] inet6_rtm_delroute+0xfd/0x100 net/ipv6/route.c:3075
 [] rtnetlink_rcv_msg+0x549/0x7a0 net/core/rtnetlink.c:3450
 [] netlink_rcv_skb+0x141/0x370 net/netlink/af_netlink.c:2281
 [] rtnetlink_rcv+0x2f/0x40 net/core/rtnetlink.c:3456
 [] netlink_unicast_kernel net/netlink/af_netlink.c:1206 
[inline]
 [] netlink_unicast+0x518/0x750 net/netlink/af_netlink.c:1232
 [] netlink_sendmsg+0x8ce/0xc30 net/netlink/af_netlink.c:1778
 [] sock_sendmsg_nosec net/socket.c:609 [inline]
 [] sock_sendmsg+0xcf/0x110 net/socket.c:619
 [] sock_write_iter+0x222/0x3a0 net/socket.c:834
 [] new_sync_write+0x1dd/0x2b0 fs/read_write.c:478
 [] __vfs_write+0xe4/0x110 fs/read_write.c:491
 [] vfs_write+0x178/0x4b0 fs/read_write.c:538
 [] SYSC_write fs/read_write.c:585 [inline]
 [] SyS_write+0xd9/0x1b0 fs/read_write.c:577
 [] entry_SYSCALL_64_fastpath+0x12/0x17

Note: there is no "Fixes" tag as this seems to be a bug introduced
very early.

Signed-off-by: Wei Wang 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/ipv6/ip6_fib.c |   22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1001,7 +1001,7 @@ int fib6_add(struct fib6_node *root, str
/* Create subtree root node */
sfn = node_alloc();
if (!sfn)
-   goto st_failure;
+   goto failure;
 
sfn->leaf = info->nl_net->ipv6.ip6_null_entry;

	atomic_inc(&info->nl_net->ipv6.ip6_null_entry->rt6i_ref);
@@ -1017,12 +1017,12 @@ int fib6_add(struct fib6_node *root, str
 
if (IS_ERR(sn)) {
/* If it is failed, discard just allocated
-  root, and then (in st_failure) stale node
+  root, and then (in failure) stale node
   in main tree.
 */
node_free(sfn);
err = PTR_ERR(sn);
-   goto st_failure;
+   goto failure;
}
 
/* Now link new subtree to main tree */
@@ -1036,7 +1036,7 @@ int fib6_add(struct fib6_node *root, str
 
if (IS_ERR(sn)) {
err = PTR_ERR(sn);
-   goto st_failure;
+ 

[PATCH 4.9 21/84] net_sched: fix order of queue length updates in qdisc_replace()

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Konstantin Khlebnikov 


[ Upstream commit 68a66d149a8c78ec6720f268597302883e48e9fa ]

It is important to call qdisc_tree_reduce_backlog() after changing the queue
length. The parent qdisc should deactivate the class in ->qlen_notify(), called
from qdisc_tree_reduce_backlog(), but this happens only if qdisc->q.qlen is zero.

Missed class deactivations lead to crashes/warnings when picking packets from
an empty qdisc and corrupt its state when the class is reactivated later.

Signed-off-by: Konstantin Khlebnikov 
Fixes: 86a7996cc8a0 ("net_sched: introduce qdisc_replace() helper")
Acked-by: Cong Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 include/net/sch_generic.h |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -768,8 +768,11 @@ static inline struct Qdisc *qdisc_replac
old = *pold;
*pold = new;
if (old != NULL) {
-   qdisc_tree_reduce_backlog(old, old->q.qlen, 
old->qstats.backlog);
+   unsigned int qlen = old->q.qlen;
+   unsigned int backlog = old->qstats.backlog;
+
qdisc_reset(old);
+   qdisc_tree_reduce_backlog(old, qlen, backlog);
}
sch_tree_unlock(sch);
 




[PATCH 4.9 39/84] i2c: designware: Fix system suspend

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Ulf Hansson 

commit a23318feeff662c8d25d21623daebdd2e55ec221 upstream.

The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
during system suspend"), may suggest to the PM core to try out the so
called direct_complete path for system sleep. In this path, the PM core
treats a runtime suspended device as it's already in a proper low power
state for system sleep, which makes it skip calling the system sleep
callbacks for the device, except for the ->prepare() and the ->complete()
callbacks.

However, the PM core may unset the direct_complete flag for a parent
device in case its child devices are being system suspended before it. In this
scenario, the PM core invokes the system sleep callbacks no matter whether the
device is runtime suspended or not.

Particularly in cases of an existing i2c slave device, the above path is
triggered, which breaks the assumption that the i2c device is always
runtime resumed whenever the dw_i2c_plat_suspend() is being called.

More precisely, dw_i2c_plat_suspend() calls clk_core_disable() and
clk_core_unprepare(), for an already disabled/unprepared clock, leading to
a splat in the log about clocks calls being wrongly balanced and breaking
system sleep.

To still allow the direct_complete path in cases when it's possible, but
also to keep the fix simple, let's runtime resume the i2c device in the
->suspend() callback, before continuing to put the device into low power
state.

Note, in cases when the i2c device is attached to the ACPI PM domain, this
problem doesn't occur, because ACPI's ->suspend() callback, assigned to
acpi_subsys_suspend(), already calls pm_runtime_resume() for the device.

It should also be noted that this change does not fix commit 8503ff166504
("i2c: designware: Avoid unnecessary resuming during system suspend"),
because for the non-ACPI case the system sleep support was already broken
prior to that point.

Signed-off-by: Ulf Hansson 
Acked-by: Rafael J. Wysocki 
Tested-by: John Stultz 
Tested-by: Jarkko Nikula 
Acked-by: Jarkko Nikula 
Reviewed-by: Mika Westerberg 
Signed-off-by: Wolfram Sang 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/i2c/busses/i2c-designware-platdrv.c |   14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

--- a/drivers/i2c/busses/i2c-designware-platdrv.c
+++ b/drivers/i2c/busses/i2c-designware-platdrv.c
@@ -319,7 +319,7 @@ static void dw_i2c_plat_complete(struct
 #endif
 
 #ifdef CONFIG_PM
-static int dw_i2c_plat_suspend(struct device *dev)
+static int dw_i2c_plat_runtime_suspend(struct device *dev)
 {
struct platform_device *pdev = to_platform_device(dev);
struct dw_i2c_dev *i_dev = platform_get_drvdata(pdev);
@@ -343,11 +343,21 @@ static int dw_i2c_plat_resume(struct dev
return 0;
 }
 
+#ifdef CONFIG_PM_SLEEP
+static int dw_i2c_plat_suspend(struct device *dev)
+{
+   pm_runtime_resume(dev);
+   return dw_i2c_plat_runtime_suspend(dev);
+}
+#endif
+
 static const struct dev_pm_ops dw_i2c_dev_pm_ops = {
.prepare = dw_i2c_plat_prepare,
.complete = dw_i2c_plat_complete,
SET_SYSTEM_SLEEP_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume)
-   SET_RUNTIME_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume, NULL)
+   SET_RUNTIME_PM_OPS(dw_i2c_plat_runtime_suspend,
+  dw_i2c_plat_resume,
+  NULL)
 };
 
 #define DW_I2C_DEV_PMOPS (&dw_i2c_dev_pm_ops)




[PATCH 4.9 44/84] drm/atomic: If the atomic check fails, return its value first

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Maarten Lankhorst 

commit a0ffc51e20e90e0c1c2491de2b4b03f48b6caaba upstream.

The last part of drm_atomic_check_only is testing whether we need to
fail with -EINVAL when modeset is not allowed, but forgets to return
the value when atomic_check() fails first.

This results in -EDEADLK being replaced by -EINVAL, and the sanity
check in drm_modeset_drop_locks kicks in:

[  308.531734] [ cut here ]
[  308.531791] WARNING: CPU: 0 PID: 1886 at 
drivers/gpu/drm/drm_modeset_lock.c:217 drm_modeset_drop_locks+0x33/0xc0 [drm]
[  308.531828] Modules linked in:
[  308.532050] CPU: 0 PID: 1886 Comm: kms_atomic Tainted: G U  W 
4.13.0-rc5-patser+ #5225
[  308.532082] Hardware name: NUC5i7RYB, BIOS RYBDWi35.86A.0246.2015.0309.1355 
03/09/2015
[  308.532124] task: 8800cd9dae00 task.stack: 8800ca3b8000
[  308.532168] RIP: 0010:drm_modeset_drop_locks+0x33/0xc0 [drm]
[  308.532189] RSP: 0018:8800ca3bf980 EFLAGS: 00010282
[  308.532211] RAX: dc00 RBX: 8800ca3bfaf8 RCX: 13a171e6
[  308.532235] RDX: 110019477f69 RSI: a8ba4fa0 RDI: 8800ca3bfb48
[  308.532258] RBP: 8800ca3bf998 R08:  R09: 0003
[  308.532281] R10: 79dbe066 R11: f760b34b R12: 0001
[  308.532304] R13: dc00 R14: ffea R15: 880096889680
[  308.532328] FS:  7ff00959cec0() GS:8800d4e0() 
knlGS:
[  308.532359] CS:  0010 DS:  ES:  CR0: 80050033
[  308.532380] CR2: 0008 CR3: ca2e3000 CR4: 003406f0
[  308.532402] Call Trace:
[  308.532440]  drm_mode_atomic_ioctl+0x19fa/0x1c00 [drm]
[  308.532488]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
[  308.532565]  ? avc_has_extended_perms+0xc39/0xff0
[  308.532593]  ? lock_downgrade+0x610/0x610
[  308.532640]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
[  308.532680]  drm_ioctl_kernel+0x154/0x1a0 [drm]
[  308.532755]  drm_ioctl+0x624/0x8f0 [drm]
[  308.532858]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
[  308.532976]  ? drm_getunique+0x210/0x210 [drm]
[  308.533061]  do_vfs_ioctl+0xd92/0xe40
[  308.533121]  ? ioctl_preallocate+0x1b0/0x1b0
[  308.533160]  ? selinux_capable+0x20/0x20
[  308.533191]  ? do_fcntl+0x1b1/0xbf0
[  308.533219]  ? kasan_slab_free+0xa2/0xb0
[  308.533249]  ? f_getown+0x4b/0xa0
[  308.533278]  ? putname+0xcf/0xe0
[  308.533309]  ? security_file_ioctl+0x57/0x90
[  308.533342]  SyS_ioctl+0x4e/0x80
[  308.533374]  entry_SYSCALL_64_fastpath+0x18/0xad
[  308.533405] RIP: 0033:0x7ff00779e4d7
[  308.533431] RSP: 002b:7fff66a043d8 EFLAGS: 0246 ORIG_RAX: 
0010
[  308.533481] RAX: ffda RBX: 00e7c7ca5910 RCX: 7ff00779e4d7
[  308.533560] RDX: 7fff66a04430 RSI: c03864bc RDI: 0003
[  308.533608] RBP: 7ff007a5fb00 R08: 00e7c7ca4620 R09: 00e7c7ca5e60
[  308.533647] R10: 0001 R11: 0246 R12: 0070
[  308.533685] R13:  R14:  R15: 00e7c7ca5930
[  308.533770] Code: ff df 55 48 89 e5 41 55 41 54 53 48 89 fb 48 83 c7
50 48 89 fa 48 c1 ea 03 80 3c 02 00 74 05 e8 94 d4 16 e7 48 83 7b 50 00
74 02 <0f> ff 4c 8d 6b 58 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1
[  308.534086] ---[ end trace 77f11e53b1df44ad ]---

Solve this by adding the missing return.

This is also a bugfix, because we could end up rejecting updates with
-EINVAL because of an early -EDEADLK, while if atomic_check had run to
completion it might have downgraded the modeset to a fastset.

Signed-off-by: Maarten Lankhorst 
Testcase: kms_atomic
Link: 
https://patchwork.freedesktop.org/patch/msgid/20170815095706.23624-1-maarten.lankho...@linux.intel.com
Fixes: d34f20d6e2f2 ("drm: Atomic modeset ioctl")
Reviewed-by: Daniel Vetter 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/gpu/drm/drm_atomic.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -1386,6 +1386,9 @@ int drm_atomic_check_only(struct drm_ato
if (config->funcs->atomic_check)
ret = config->funcs->atomic_check(state->dev, state);
 
+   if (ret)
+   return ret;
+
if (!state->allow_modeset) {
for_each_crtc_in_state(state, crtc, crtc_state, i) {
if (drm_atomic_crtc_needs_modeset(crtc_state)) {
@@ -1396,7 +1399,7 @@ int drm_atomic_check_only(struct drm_ato
}
}
 
-   return ret;
+   return 0;
 }
 EXPORT_SYMBOL(drm_atomic_check_only);
 




[PATCH 4.9 62/84] Revert "leds: handle suspend/resume in heartbeat trigger"

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Zhang Bo 

commit 436c4c45b5b9562b59cedbb51b7343ab4a6dd8cc upstream.

This reverts commit 5ab92a7cb82c66bf30685583a38a18538e3807db.

The system cannot enter suspend mode because of the heartbeat LED trigger.
In autosleep_wq, the try_to_suspend function tries to enter suspend mode
periodically. It reads wakeup_count, then calls the pm_notifier chain
callbacks and freezes processes.
heartbeat_pm_notifier is called and it calls led_trigger_unregister to
change the trigger of the LED device to none. This sends a uevent message
and the wakeup source count changes. Since wakeup_count has changed,
suspend aborts.

Fixes: 5ab92a7cb82c ("leds: handle suspend/resume in heartbeat trigger")
Signed-off-by: Zhang Bo 
Acked-by: Pavel Machek 
Reviewed-by: Linus Walleij 
Signed-off-by: Jacek Anaszewski 
Cc: Geert Uytterhoeven 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/leds/trigger/ledtrig-heartbeat.c |   31 ---
 1 file changed, 31 deletions(-)

--- a/drivers/leds/trigger/ledtrig-heartbeat.c
+++ b/drivers/leds/trigger/ledtrig-heartbeat.c
@@ -19,7 +19,6 @@
 #include 
 #include 
 #include 
-#include 
 #include "../leds.h"
 
 static int panic_heartbeats;
@@ -155,30 +154,6 @@ static struct led_trigger heartbeat_led_
.deactivate = heartbeat_trig_deactivate,
 };
 
-static int heartbeat_pm_notifier(struct notifier_block *nb,
-unsigned long pm_event, void *unused)
-{
-   int rc;
-
-   switch (pm_event) {
-   case PM_SUSPEND_PREPARE:
-   case PM_HIBERNATION_PREPARE:
-   case PM_RESTORE_PREPARE:
-   led_trigger_unregister(&heartbeat_led_trigger);
-   break;
-   case PM_POST_SUSPEND:
-   case PM_POST_HIBERNATION:
-   case PM_POST_RESTORE:
-   rc = led_trigger_register(&heartbeat_led_trigger);
-   if (rc)
-   pr_err("could not re-register heartbeat trigger\n");
-   break;
-   default:
-   break;
-   }
-   return NOTIFY_DONE;
-}
-
 static int heartbeat_reboot_notifier(struct notifier_block *nb,
 unsigned long code, void *unused)
 {
@@ -193,10 +168,6 @@ static int heartbeat_panic_notifier(stru
return NOTIFY_DONE;
 }
 
-static struct notifier_block heartbeat_pm_nb = {
-   .notifier_call = heartbeat_pm_notifier,
-};
-
 static struct notifier_block heartbeat_reboot_nb = {
.notifier_call = heartbeat_reboot_notifier,
 };
@@ -213,14 +184,12 @@ static int __init heartbeat_trig_init(vo
	atomic_notifier_chain_register(&panic_notifier_list,
				       &heartbeat_panic_nb);
	register_reboot_notifier(&heartbeat_reboot_nb);
-   register_pm_notifier(&heartbeat_pm_nb);
}
return rc;
 }
 
 static void __exit heartbeat_trig_exit(void)
 {
-   unregister_pm_notifier(&heartbeat_pm_nb);
	unregister_reboot_notifier(&heartbeat_reboot_nb);
	atomic_notifier_chain_unregister(&panic_notifier_list,
					 &heartbeat_panic_nb);




[PATCH 4.4 50/53] ntb_transport: fix qp count bug

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Logan Gunthorpe 

commit cb827ee6ccc3e480f0d9c0e8e53eef55be5b0414 upstream.

In cases where there are more mw's than spads/2-2, the mw count gets
reduced to match the limitation. ntb_transport also tries to ensure that
there are fewer qps than mws but uses the full mw count instead of
the reduced one. When this happens, the math in
'ntb_transport_setup_qp_mw' will get confused and result in a kernel
paging request bug.
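
(Illustrative numbers only: with, say, 10 scratchpad registers the usable
memory window count is 10/2 - 2 = 3, so a device exposing 4 MWs gets
nt->mw_count reduced to 3, while qp_count was still being capped by the
raw count of 4 and could index a memory window that does not exist.)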

This patch fixes the bug by reducing qp_count to the reduced mw count
instead of the full mw count.

Signed-off-by: Logan Gunthorpe 
Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe 
Signed-off-by: Jon Mason 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/ntb/ntb_transport.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -1065,8 +1065,8 @@ static int ntb_transport_probe(struct nt
qp_count = ilog2(qp_bitmap);
if (max_num_clients && max_num_clients < qp_count)
qp_count = max_num_clients;
-   else if (mw_count < qp_count)
-   qp_count = mw_count;
+   else if (nt->mw_count < qp_count)
+   qp_count = nt->mw_count;
 
qp_bitmap &= BIT_ULL(qp_count) - 1;
 




[PATCH 4.4 47/53] ASoC: rsnd: Add missing initialization of ADG req_rate

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Geert Uytterhoeven 

commit 8b27418f300cafbdbbb8cfa9c29d398ed34d6723 upstream.

If the "clock-frequency" DT property is not found, req_rate is used
uninitialized, and the "audio_clkout" clock will be created with an
arbitrary clock rate.

This uninitialized kernel stack data may leak to userspace through
/sys/kernel/debug/clk/clk_summary, cfr. the value in the "rate" column:

   clock                      enable_cnt  prepare_cnt        rate   accuracy   phase

 audio_clkout                          0            0  4001836240          0       0

Signed-off-by: Geert Uytterhoeven 
Acked-by: Kuninori Morimoto 
Signed-off-by: Mark Brown 
Signed-off-by: Thong Ho 
Signed-off-by: Nhan Nguyen 
Signed-off-by: Greg Kroah-Hartman 

---
 sound/soc/sh/rcar/adg.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/soc/sh/rcar/adg.c
+++ b/sound/soc/sh/rcar/adg.c
@@ -437,7 +437,7 @@ static void rsnd_adg_get_clkout(struct r
struct device *dev = rsnd_priv_to_dev(priv);
struct device_node *np = dev->of_node;
u32 ckr, rbgx, rbga, rbgb;
-   u32 rate, req_rate, div;
+   u32 rate, req_rate = 0, div;
uint32_t count = 0;
unsigned long req_48kHz_rate, req_441kHz_rate;
int i;




[PATCH 4.4 31/53] cifs: Fix df output for users with quota limits

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Sachin Prabhu 

commit 42bec214d8bd432be6d32a1acb0a9079ecd4d142 upstream.

The df for a SMB2 share triggers a GetInfo call for
FS_FULL_SIZE_INFORMATION. The values returned are used to populate
struct statfs.

The problem is that none of the information returned by the call
contains the total blocks available on the filesystem. Instead we use
the blocks available to the user ie. quota limitation when filling out
statfs.f_blocks. The information returned does contain Actual free units
on the filesystem and is used to populate statfs.f_bfree. For users with
quota enabled, it can lead to situations where the total free space
reported is more than the total blocks on the system ending up with df
reports like the following

 # df -h /mnt/a
Filesystem Size  Used Avail Use% Mounted on
//192.168.22.10/a  2.5G -2.3G  2.5G    - /mnt/a
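
(For illustration: df derives Used as f_blocks - f_bfree, so with f_blocks
holding the quota-limited 2.5G while f_bfree holds the filesystem-wide free
space, roughly 4.8G in this example, Used comes out as the negative -2.3G
shown above.)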

To fix this problem, we instead populate statfs.f_bfree with the same
value as statfs.f_bavail, i.e. CallerAvailableAllocationUnits. This is
similar to what is already done in the code for cifs, and df now reports
the quota information for the user used to mount the share.

 # df --si /mnt/a
Filesystem Size  Used Avail Use% Mounted on
//192.168.22.10/a  2.7G  101M  2.6G   4% /mnt/a

Signed-off-by: Sachin Prabhu 
Signed-off-by: Pierguido Lambri 
Signed-off-by: Steve French 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/cifs/smb2pdu.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -2768,8 +2768,8 @@ copy_fs_info_to_kstatfs(struct smb2_fs_f
kst->f_bsize = le32_to_cpu(pfs_inf->BytesPerSector) *
  le32_to_cpu(pfs_inf->SectorsPerAllocationUnit);
kst->f_blocks = le64_to_cpu(pfs_inf->TotalAllocationUnits);
-   kst->f_bfree  = le64_to_cpu(pfs_inf->ActualAvailableAllocationUnits);
-   kst->f_bavail = le64_to_cpu(pfs_inf->CallerAvailableAllocationUnits);
+   kst->f_bfree  = kst->f_bavail =
+   le64_to_cpu(pfs_inf->CallerAvailableAllocationUnits);
return;
 }
 




[PATCH 4.4 28/53] drm: rcar-du: Fix display timing controller parameter

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Koji Matsuoka 

commit 9cdced8a39c04cf798ddb2a27cb5952f7d39f633 upstream.

There is a bug in the setting of the DES (Display Enable Signal)
register. The current setting causes a one-dot shift to the left. Per
the H/W specification, the DES register should be set to the specified
value minus one. This patch corrects it.

Signed-off-by: Koji Matsuoka 
Signed-off-by: Laurent Pinchart 
Signed-off-by: Thong Ho 
Signed-off-by: Nhan Nguyen 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/gpu/drm/rcar-du/rcar_du_crtc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -171,7 +171,7 @@ static void rcar_du_crtc_set_display_tim
mode->crtc_vsync_start - 1);
rcar_du_crtc_write(rcrtc, VCR,  mode->crtc_vtotal - 1);
 
-   rcar_du_crtc_write(rcrtc, DESR,  mode->htotal - mode->hsync_start);
+   rcar_du_crtc_write(rcrtc, DESR,  mode->htotal - mode->hsync_start - 1);
rcar_du_crtc_write(rcrtc, DEWR,  mode->hdisplay);
 }
 




[PATCH 4.4 32/53] cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Ronnie Sahlberg 

commit d3edede29f74d335f81d95a4588f5f136a9f7dcf upstream.

Add checking for the path component length and verify it is <= the maximum
that the server advertizes via FileFsAttributeInformation.

With this patch cifs.ko will now return ENAMETOOLONG instead of ENOENT
when users try to access an overlong path.

To test this, try to cd into a (non-existing) directory on a CIFS share
that has a too long name:
cd /mnt/aaa...

and it now should show a good error message from the shell:
bash: cd: /mnt/...aa: File name too long

rh bz 1153996

Signed-off-by: Ronnie Sahlberg 
Signed-off-by: Steve French 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/cifs/dir.c |   18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -183,15 +183,20 @@ cifs_bp_rename_retry:
 }
 
 /*
+ * Don't allow path components longer than the server max.
  * Don't allow the separator character in a path component.
  * The VFS will not allow "/", but "\" is allowed by posix.
  */
 static int
-check_name(struct dentry *direntry)
+check_name(struct dentry *direntry, struct cifs_tcon *tcon)
 {
struct cifs_sb_info *cifs_sb = CIFS_SB(direntry->d_sb);
int i;
 
+   if (unlikely(direntry->d_name.len >
+tcon->fsAttrInfo.MaxPathNameComponentLength))
+   return -ENAMETOOLONG;
+
if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_POSIX_PATHS)) {
for (i = 0; i < direntry->d_name.len; i++) {
if (direntry->d_name.name[i] == '\\') {
@@ -489,10 +494,6 @@ cifs_atomic_open(struct inode *inode, st
return finish_no_open(file, res);
}
 
-   rc = check_name(direntry);
-   if (rc)
-   return rc;
-
xid = get_xid();
 
cifs_dbg(FYI, "parent inode = 0x%p name is: %pd and dentry = 0x%p\n",
@@ -505,6 +506,11 @@ cifs_atomic_open(struct inode *inode, st
}
 
tcon = tlink_tcon(tlink);
+
+   rc = check_name(direntry, tcon);
+   if (rc)
+   goto out_free_xid;
+
server = tcon->ses->server;
 
if (server->ops->new_lease_key)
@@ -765,7 +771,7 @@ cifs_lookup(struct inode *parent_dir_ino
}
pTcon = tlink_tcon(tlink);
 
-   rc = check_name(direntry);
+   rc = check_name(direntry, pTcon);
if (rc)
goto lookup_out;
 




[PATCH 4.4 30/53] tracing: Fix freeing of filter in create_filter() when set_str is false

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Steven Rostedt (VMware) 

commit 8b0db1a5bdfcee0dbfa89607672598ae203c9045 upstream.

Performing the following task with kmemleak enabled:

 # cd /sys/kernel/tracing/events/irq/irq_handler_entry/
 # echo 'enable_event:kmem:kmalloc:3 if irq >' > trigger
 # echo 'enable_event:kmem:kmalloc:3 if irq > 31' > trigger
 # echo scan > /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
unreferenced object 0x8800b9290308 (size 32):
  comm "bash", pid 1114, jiffies 4294848451 (age 141.139s)
  hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
[] kmemleak_alloc+0x4a/0xa0
[] kmem_cache_alloc_trace+0x158/0x290
[] create_filter_start.constprop.28+0x99/0x940
[] create_filter+0xa9/0x160
[] create_event_filter+0xc/0x10
[] set_trigger_filter+0xe5/0x210
[] event_enable_trigger_func+0x324/0x490
[] event_trigger_write+0x1a2/0x260
[] __vfs_write+0xd7/0x380
[] vfs_write+0x101/0x260
[] SyS_write+0xab/0x130
[] entry_SYSCALL_64_fastpath+0x1f/0xbe
[] 0x

The function create_filter() is passed a 'filterp' pointer that gets
allocated, and if "set_str" is true, it is up to the caller to free it, even
on error. The problem is that the pointer is not freed by create_filter()
when set_str is false. This is a bug, and it is not up to the caller to free
the filter on error if it doesn't care about the string.

Link: 
http://lkml.kernel.org/r/1502705898-27571-2-git-send-email-ch...@redhat.com

Fixes: 38b78eb85 ("tracing: Factorize filter creation")
Reported-by: Chunyu Hu 
Tested-by: Chunyu Hu 
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/trace/trace_events_filter.c |4 
 1 file changed, 4 insertions(+)

--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1979,6 +1979,10 @@ static int create_filter(struct trace_ev
if (err && set_str)
append_filter_err(ps, filter);
}
+   if (err && !set_str) {
+   free_event_filter(filter);
+   filter = NULL;
+   }
create_filter_finish(ps);
 
*filterp = filter;




[PATCH 4.4 33/53] nfsd: Limit end of page list when decoding NFSv4 WRITE

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Chuck Lever 

commit fc788f64f1f3eb31e87d4f53bcf1ab76590d5838 upstream.

When processing an NFSv4 WRITE operation, argp->end should never
point past the end of the data in the final page of the page list.
Otherwise, nfsd4_decode_compound can walk into uninitialized memory.

More critical, nfsd4_decode_write is failing to increment argp->pagelen
when it increments argp->pagelist.  This can cause later xdr decoders
to assume more data is available than really is, which can cause server
crashes on malformed requests.

Signed-off-by: Chuck Lever 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/nfsd/nfs4xdr.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -129,7 +129,7 @@ static void next_decode_page(struct nfsd
argp->p = page_address(argp->pagelist[0]);
argp->pagelist++;
if (argp->pagelen < PAGE_SIZE) {
-   argp->end = argp->p + (argp->pagelen>>2);
+   argp->end = argp->p + XDR_QUADLEN(argp->pagelen);
argp->pagelen = 0;
} else {
argp->end = argp->p + (PAGE_SIZE>>2);
@@ -1246,9 +1246,7 @@ nfsd4_decode_write(struct nfsd4_compound
argp->pagelen -= pages * PAGE_SIZE;
len -= pages * PAGE_SIZE;
 
-   argp->p = (__be32 *)page_address(argp->pagelist[0]);
-   argp->pagelist++;
-   argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);
+   next_decode_page(argp);
}
argp->p += XDR_QUADLEN(len);
 




[PATCH 4.4 29/53] drm: rcar-du: Fix H/V sync signal polarity configuration

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Koji Matsuoka 

commit fd1adef3bff0663c5ac31b45bc4a05fafd43d19b upstream.

The VSL and HSL bits in the DSMR register set the corresponding
horizontal and vertical sync signal polarity to active high. The code
got it the wrong way around, fix it.

Signed-off-by: Koji Matsuoka 
Signed-off-by: Laurent Pinchart 
Signed-off-by: Thong Ho 
Signed-off-by: Nhan Nguyen 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/gpu/drm/rcar-du/rcar_du_crtc.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -148,8 +148,8 @@ static void rcar_du_crtc_set_display_tim
rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? OTAR2 : OTAR, 0);
 
/* Signal polarities */
-   value = ((mode->flags & DRM_MODE_FLAG_PVSYNC) ? 0 : DSMR_VSL)
- | ((mode->flags & DRM_MODE_FLAG_PHSYNC) ? 0 : DSMR_HSL)
+   value = ((mode->flags & DRM_MODE_FLAG_PVSYNC) ? DSMR_VSL : 0)
+ | ((mode->flags & DRM_MODE_FLAG_PHSYNC) ? DSMR_HSL : 0)
  | DSMR_DIPM_DE | DSMR_CSPM;
rcar_du_crtc_write(rcrtc, DSMR, value);
 




Re: [PATCH] net: sunrpc: svcsock: fix NULL-pointer exception

2017-08-28 Thread Vadim Lomovtsev
On Fri, Aug 25, 2017 at 06:01:28PM -0400, J. Bruce Fields wrote:
> On Fri, Aug 18, 2017 at 06:00:47AM -0400, Vadim Lomovtsev wrote:
> > While running nfs/connectathon tests kernel NULL-pointer exception
> > has been observed due to races in svcsock.c.
> > 
> > Race is appear when kernel accepts connection by kernel_accept
> > (which creates new socket) and start queuing ingress packets
> > to new socket. This happanes in ksoftirq context which concurrently
> > on a differnt core while new socket setup is not done yet.
> > 
> > The fix is to re-order socket user data init sequence, add NULL-ptr
> > check before callback call along with barriers to prevent kernel crash.
> > 
> > Test results: nfs/connectathon reports '0' failed tests for about 200+ 
> > iterations.
> 
> By the way, is there anything special about your setup that allows you
> to reproduce this?  There's nothing special about connectathon tests, so
> I'm just wondering why we haven't had a lot of reports of this.

From what I have now - nothing special in the test setup and/or configuration.
I believe it is because of the high number of CPUs running. It was found on a
32-core CPU, while simply invoking the test with the "make run" command.

WBR,
Vadim

> 
> --b.
> 
> > 
> > Crash log:
> > ---<-snip->---
> > [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual 
> > address 
> > [ 6708.647093] pgd = 094e
> > [ 6708.650497] [] *pgd=01090003, *pud=01090003, 
> > *pmd=01080003, *pte=
> > [ 6708.660761] Internal error: Oops: 8605 [#1] SMP
> > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log 
> > nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay 
> > xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 
> > xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag 
> > udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm 
> > libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp 
> > scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core 
> > nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack 
> > vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg 
> > thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd 
> > grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper 
> > syscopyarea sysfillrect sysimgblt fb_sys_fops
> > [ 6708.736446]  ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder 
> > mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
> > stap_3c300909c5b3f46dcacd49aab3334af_87021]
> > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW  OE   
> > 4.11.0-4.el7.aarch64 #1
> > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 
> > 2017
> > [ 6708.767910] task: 810006842e80 task.stack: 81000689c000
> > [ 6708.773822] PC is at 0x0
> > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc]
> > [ 6708.781866] pc : [<>] lr : [] pstate: 
> > 6145
> > [ 6708.789248] sp : 810ffbad3900
> > [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58
> > [ 6708.797853] x27:  x26: 81000bbe1e00
> > [ 6708.803156] x25: 0020 x24: 800f7410bf28
> > [ 6708.808458] x23: 08c63000 x22: 08c63000
> > [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00
> > [ 6708.819063] x19: 810012412400 x18: d82a9df2
> > [ 6708.824365] x17:  x16: 
> > [ 6708.829667] x15:  x14: 0001
> > [ 6708.834969] x13:  x12: 722e736f622e676e
> > [ 6708.840271] x11: f814dd99 x10: 
> > [ 6708.845573] x9 : 737468722500 x8 : 
> > [ 6708.850875] x7 :  x6 : 
> > [ 6708.856177] x5 : 0028 x4 : 
> > [ 6708.861479] x3 :  x2 : e500
> > [ 6708.866781] x1 :  x0 : 81000bbe1e00
> > [ 6708.872084]
> > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0x81000689c000)
> > [ 6708.880341] Stack: (0x810ffbad3900 to 0x8100068a)
> > [ 6708.886075] Call trace:
> > [ 6708.888513] Exception stack(0x810ffbad3710 to 0x810ffbad3840)
> > [ 6708.894942] 3700:   810012412400 
> > 0001
> > [ 6708.902759] 3720: 810ffbad3900  6145 
> > 800f7930
> > [ 6708.910577] 3740: 09274d00 03ea 0015 
> > 08c63000
> > [ 6708.918395] 3760: 810ffbad3830 800f7930 004d 
> > 
> > [ 6708.926212] 3780: 810ffbad3890 080f88dc 800f7930 
> > 004d
> > [ 6708.934030] 37a0: 800f7930093c 08c63000  
> > 0140
> > [ 6708.941848] 37c0: 08c2c000 

Re: [PATCH net-next v2 05/14] net: mvpp2: do not force the link mode

2017-08-28 Thread Marcin Wojtas
Hi Antoine,

2017-08-28 8:55 GMT+02:00 Antoine Tenart :
> Hi Russell,
>
> On Fri, Aug 25, 2017 at 11:43:13PM +0100, Russell King - ARM Linux wrote:
>> On Fri, Aug 25, 2017 at 04:48:12PM +0200, Antoine Tenart wrote:
>> > The link mode (speed, duplex) was forced based on what the phylib
>> > returns. This should not be the case, and only forced by ethtool
>> > functions manually. This patch removes the link mode enforcement from
>> > the phylib link_event callback.
>>
>> So how does RGMII work (which has no in-band signalling between the PHY
>> and MAC)?
>>
>> phylib expects the network driver to configure it according to the PHY
>> state at link_event time - I think you need to explain more why you
>> think that this is not necessary.
>
> Good catch, this won't work properly with RGMII. This could be done
> out-of-band according to the spec, but that would use PHY polling and we
> do not want that (the same concern was raised by Andrew on another
> patch).
>
> I'll keep this mode enforcement for RGMII then.
>

Can you be 100% sure that when using SGMII with PHY's (like Marvell
Alaska 88E1xxx series), is in-band link information always available?
I'd be very cautious with such assumption and use in-band management
only when set in the DT, like mvneta. I think phylib can properly
do its work when an MDIO connection is provided on the board.

Did you check the change also on A375?

Best regards,
Marcin


[PATCH 4.4 11/53] tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Neal Cardwell 


[ Upstream commit cdbeb633ca71a02b7b63bfeb94994bf4e1a0b894 ]

In some situations tcp_send_loss_probe() can realize that it's unable
to send a loss probe (TLP), and falls back to calling tcp_rearm_rto()
to schedule an RTO timer. In such cases, sometimes tcp_rearm_rto()
realizes that the RTO was eligible to fire immediately or at some
point in the past (delta_us <= 0). Previously in such cases
tcp_rearm_rto() was scheduling such "overdue" RTOs to happen at now +
icsk_rto, which caused needless delays of hundreds of milliseconds
(and non-linear behavior that made reproducible testing
difficult). This commit changes the logic to schedule "overdue" RTOs
ASAP, rather than at now + icsk_rto.

Fixes: 6ba8a3b19e76 ("tcp: Tail loss probe (TLP)")
Suggested-by: Yuchung Cheng 
Signed-off-by: Neal Cardwell 
Signed-off-by: Yuchung Cheng 
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/ipv4/tcp_input.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3028,8 +3028,7 @@ void tcp_rearm_rto(struct sock *sk)
/* delta may not be positive if the socket is locked
 * when the retrans timer fires and is rescheduled.
 */
-   if (delta > 0)
-   rto = delta;
+   rto = max(delta, 1);
}
inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, rto,
  TCP_RTO_MAX);
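
For illustration, a minimal user-space sketch of the rearm logic described
above (hypothetical names; the real logic lives in tcp_rearm_rto()):

#include <stdio.h>

/*
 * Sketch only: given the nominal RTO and the time already elapsed since
 * the earliest outstanding packet, return the timeout to arm.  If the RTO
 * is already overdue (delta <= 0), arm the timer for the minimum delay
 * instead of a full extra RTO, mirroring the "fire ASAP" behaviour.
 */
static long rto_to_arm(long icsk_rto, long elapsed)
{
	long delta = icsk_rto - elapsed;

	return delta > 0 ? delta : 1;	/* 1 tick: as soon as possible */
}

int main(void)
{
	printf("%ld\n", rto_to_arm(200, 50));	/* 150: normal rearm */
	printf("%ld\n", rto_to_arm(200, 350));	/* 1: overdue, fire ASAP */
	return 0;
}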




[PATCH 4.4 13/53] net: sched: fix NULL pointer dereference when action calls some targets

2017-08-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Xin Long 


[ Upstream commit 4f8a881acc9d1adaf1e552349a0b1df28933a04c ]

As we know in some target's checkentry it may dereference par.entryinfo
to check entry stuff inside. But when sched action calls xt_check_target,
par.entryinfo is set with NULL. It would cause kernel panic when calling
some targets.

It can be reproduce with:
  # tc qd add dev eth1 ingress handle ffff:
  # tc filter add dev eth1 parent ffff: u32 match u32 0 0 action xt \
-j ECN --ecn-tcp-remove

It could also crash kernel when using target CLUSTERIP or TPROXY.

By now there's no proper value for par.entryinfo in ipt_init_target,
but it can not be set with NULL. This patch is to void all these
panics by setting it with an ipt_entry obj with all members = 0.

Note that this issue has been there since the very beginning.

Signed-off-by: Xin Long 
Acked-by: Pablo Neira Ayuso 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/sched/act_ipt.c |2 ++
 1 file changed, 2 insertions(+)

--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -34,6 +34,7 @@ static int ipt_init_target(struct xt_ent
 {
struct xt_tgchk_param par;
struct xt_target *target;
+   struct ipt_entry e = {};
int ret = 0;
 
target = xt_request_find_target(AF_INET, t->u.user.name,
@@ -44,6 +45,7 @@ static int ipt_init_target(struct xt_ent
t->u.kernel.target = target;
	memset(&par, 0, sizeof(par));
par.table = table;
+   par.entryinfo = &e;
par.target= target;
par.targinfo  = t->data;
par.hook_mask = hook;
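
To illustrate the problem class outside of netfilter (hypothetical names, not
the real xt_target API): a callee that unconditionally dereferences a
parameter must be handed at least a zeroed dummy object, which is what the
on-stack ipt_entry above provides.

#include <stdio.h>
#include <string.h>

struct entryinfo { unsigned int flags; };
struct check_param { const struct entryinfo *entryinfo; };

/* stands in for a target's checkentry() that trusts entryinfo != NULL */
static int checkentry(const struct check_param *par)
{
	return par->entryinfo->flags ? -1 : 0;	/* would crash on NULL */
}

int main(void)
{
	struct entryinfo e;
	struct check_param par;

	memset(&e, 0, sizeof(e));	/* zeroed stand-in object */
	memset(&par, 0, sizeof(par));
	par.entryinfo = &e;		/* never NULL */
	printf("%d\n", checkentry(&par));
	return 0;
}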




[PATCH] perf script: Add script orientation output support for monitored events

2017-08-28 Thread Wind Yu
Introduce a new option to print trace output to files named by the
monitored events and update perf-script documentation accordingly.

Shown below is output of perf script command with the newly introduced
option.

$perf record -e cycles -e context-switches -ag -- sleep 10
$perf script -O
$ls /
cycles.stacks context-switches.stacks

Without the orientation option, drawing flamegraphs for different events
is really hard. You can only monitor one event at a time for perf record.
Using this option, we can get the trace output files named by the monitored
events, and could draw flamegraphs according to the event's name.

Signed-off-by: Wind Yu 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
---
 tools/perf/builtin-script.c | 469 +---
 tools/perf/util/tool.h  |   1 +
 2 files changed, 269 insertions(+), 201 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 378f76c..45a121b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -57,6 +57,7 @@
 static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 static struct perf_stat_config stat_config;
 static int max_blocks;
+static FILE*orientation_file;
 
 unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;
 
@@ -485,8 +486,8 @@ static int perf_session__check_output_opt(struct 
perf_session *session)
return 0;
 }
 
-static void print_sample_iregs(struct perf_sample *sample,
- struct perf_event_attr *attr)
+static void fprint_sample_iregs(struct perf_sample *sample,
+ struct perf_event_attr *attr, FILE *fp)
 {
struct regs_dump *regs = >intr_regs;
uint64_t mask = attr->sample_regs_intr;
@@ -497,13 +498,13 @@ static void print_sample_iregs(struct perf_sample *sample,
 
for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) {
u64 val = regs->regs[i++];
-   printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
+   fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
}
 }
 
-static void print_sample_start(struct perf_sample *sample,
+static void fprint_sample_start(struct perf_sample *sample,
   struct thread *thread,
-  struct perf_evsel *evsel)
+  struct perf_evsel *evsel, FILE *fp)
 {
struct perf_event_attr *attr = >attr;
unsigned long secs;
@@ -511,25 +512,25 @@ static void print_sample_start(struct perf_sample *sample,
 
if (PRINT_FIELD(COMM)) {
if (latency_format)
-   printf("%8.8s ", thread__comm_str(thread));
+   fprintf(fp, "%8.8s ", thread__comm_str(thread));
else if (PRINT_FIELD(IP) && symbol_conf.use_callchain)
-   printf("%s ", thread__comm_str(thread));
+   fprintf(fp, "%s ", thread__comm_str(thread));
else
-   printf("%16s ", thread__comm_str(thread));
+   fprintf(fp, "%16s ", thread__comm_str(thread));
}
 
if (PRINT_FIELD(PID) && PRINT_FIELD(TID))
-   printf("%5d/%-5d ", sample->pid, sample->tid);
+   fprintf(fp, "%5d/%-5d ", sample->pid, sample->tid);
else if (PRINT_FIELD(PID))
-   printf("%5d ", sample->pid);
+   fprintf(fp, "%5d ", sample->pid);
else if (PRINT_FIELD(TID))
-   printf("%5d ", sample->tid);
+   fprintf(fp, "%5d ", sample->tid);
 
if (PRINT_FIELD(CPU)) {
if (latency_format)
-   printf("%3d ", sample->cpu);
+   fprintf(fp, "%3d ", sample->cpu);
else
-   printf("[%03d] ", sample->cpu);
+   fprintf(fp, "[%03d] ", sample->cpu);
}
 
if (PRINT_FIELD(TIME)) {
@@ -538,11 +539,11 @@ static void print_sample_start(struct perf_sample *sample,
nsecs -= secs * NSEC_PER_SEC;
 
if (nanosecs)
-   printf("%5lu.%09llu: ", secs, nsecs);
+   fprintf(fp, "%5lu.%09llu: ", secs, nsecs);
else {
char sample_time[32];
timestamp__scnprintf_usec(sample->time, sample_time, 
sizeof(sample_time));
-   printf("%12s: ", sample_time);
+   fprintf(fp, "%12s: ", sample_time);
}
}
 }
@@ -556,9 +557,10 @@ static void print_sample_start(struct perf_sample *sample,
return br->flags.predicted ? 'P' : 'M';
 }
 
-static void print_sample_brstack(struct perf_sample *sample,
+static void 

Re: [PATCH] staging: gdm724x: Rename variable for consistency

2017-08-28 Thread Greg KH
On Thu, Aug 24, 2017 at 07:18:53PM -0400, Nick Fox wrote:
> Rename dftEpsId variable to dft_eps_ID to be consistent with other
> variables in the source file.
> 
> Signed-off-by: Nick Fox 
> ---
>  drivers/staging/gdm724x/hci_packet.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Always test-build your patches; not doing so is rude to those of us you
expect to apply your patch :(

greg k-h


Re: [RFC v1] sched/fair: search a task from the tail of the queue

2017-08-28 Thread Peter Zijlstra
On Fri, Aug 25, 2017 at 12:11:31AM +0200, Uladzislau Rezki (Sony) wrote:
> From: Uladzislau Rezki 
> 
> As a first step this patch makes cfs_tasks list as MRU one.
> It means, that when a next task is picked to run on physical
> CPU it is moved to the front of the list.
> 
> Therefore, the cfs_tasks list is more or less sorted (except woken
> tasks) starting from recently given CPU time tasks toward tasks
> with max wait time in a run-queue, i.e. MRU list.
> 
> Second, as part of the load balance operation, this approach
> starts detach_tasks()/detach_one_task() from the tail of the
> queue instead of the head, giving some advantages:
> 
> - tends to pick a task with highest wait time;
> - tasks located in the tail are less likely cache-hot,
>   therefore the can_migrate_task() decision is higher.
> 
> hackbench illustrates slightly better performance. For example
> doing 1000 samples and 40 groups on i5-3320M CPU, it shows below
> figures:
> 
> default: 0.644 avg
> patched: 0.637 avg
> 
> Signed-off-by: Uladzislau Rezki (Sony) 
> ---
>  kernel/sched/fair.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c77e4b1d51c0..cda281c6bb29 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6357,7 +6357,7 @@ pick_next_task_fair(struct rq *rq, struct task_struct 
> *prev, struct rq_flags *rf
>   if (hrtick_enabled(rq))
>   hrtick_start_fair(rq, p);
>  
> - return p;
> + goto done;
>  simple:
>   cfs_rq = >cfs;
>  #endif
> @@ -6378,6 +6378,14 @@ pick_next_task_fair(struct rq *rq, struct task_struct 
> *prev, struct rq_flags *rf
>   if (hrtick_enabled(rq))
>   hrtick_start_fair(rq, p);
>  
> +done: __maybe_unused
> + /*
> +  * Move the next running task to the front of
> +  * the list, so our cfs_tasks list becomes MRU
> +  * one.
> +  */
> + list_move(>group_node, >cfs_tasks);
> +
>   return p;
>  
>  idle:

Could you also run something like:

$ taskset 1 perf bench sched pipe

to make sure the added list_move() doesn't hurt, I'm not sure group_node
and cfs_tasks are in cachelines we already touch for that operation.

And if you can see that list_move() hurt in "perf annotate", try moving
those members around to lines that we already need anyway.
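
For readers skimming the thread, a minimal user-space sketch of the MRU
bookkeeping the quoted changelog describes (plain doubly-linked list, not the
kernel's list.h; names are made up):

#include <stdio.h>

struct task {
	const char *name;
	struct task *prev, *next;	/* node in the per-rq task list */
};

/* head.next is the most recently run task, head.prev the longest waiter */
static struct task head = { "head", &head, &head };

static void unlink_task(struct task *t)
{
	t->prev->next = t->next;
	t->next->prev = t->prev;
}

static void move_to_front(struct task *t)	/* "task picked to run" */
{
	unlink_task(t);
	t->next = head.next;
	t->prev = &head;
	head.next->prev = t;
	head.next = t;
}

int main(void)
{
	struct task a = { "A", &a, &a }, b = { "B", &b, &b };

	move_to_front(&b);	/* initial insertion */
	move_to_front(&a);
	move_to_front(&a);	/* picked again: stays at the head */

	/* load balancing would scan from head.prev, the coldest task */
	printf("MRU: %s, LRU: %s\n", head.next->name, head.prev->name);
	return 0;
}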


[PATCH 4.9 83/84] ACPI: EC: Fix regression related to wrong ECDT initialization order

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Lv Zheng 

commit 98529b9272e06a7767034fb8a32e43cdecda240a upstream.

Commit 2a5708409e4e (ACPI / EC: Fix a gap that ECDT EC cannot handle
EC events) introduced acpi_ec_ecdt_start(), but that function is
invoked before acpi_ec_query_init(), which is too early.  This causes
the kernel to crash if an EC event occurs after boot, when ec_query_wq
is not valid:

 BUG: unable to handle kernel NULL pointer dereference at 0102
 ...
 Workqueue: events acpi_ec_event_handler
 task: 9f539790dac0 task.stack: b437c0e1
 RIP: 0010:__queue_work+0x32/0x430

Normally, the DSDT EC should always be valid, so acpi_ec_ecdt_start()
is actually a no-op in the majority of cases.  However, commit
c712bb58d827 (ACPI / EC: Add support to skip boot stage DSDT probe)
caused the probing of the DSDT EC as the "boot EC" to be skipped when
the ECDT EC is valid and uncovered the bug.

Fix this issue by invoking acpi_ec_ecdt_start() after acpi_ec_query_init()
in acpi_ec_init().

Link: https://jira01.devtools.intel.com/browse/LCK-4348
Fixes: 2a5708409e4e (ACPI / EC: Fix a gap that ECDT EC cannot handle EC events)
Fixes: c712bb58d827 (ACPI / EC: Add support to skip boot stage DSDT probe)
Reported-by: Wang Wendy 
Tested-by: Feng Chenzhou 
Signed-off-by: Lv Zheng 
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/acpi/ec.c   |   17 +++--
 drivers/acpi/internal.h |1 -
 drivers/acpi/scan.c |1 -
 3 files changed, 7 insertions(+), 12 deletions(-)

--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -1728,7 +1728,7 @@ error:
  * functioning ECDT EC first in order to handle the events.
  * https://bugzilla.kernel.org/show_bug.cgi?id=115021
  */
-int __init acpi_ec_ecdt_start(void)
+static int __init acpi_ec_ecdt_start(void)
 {
acpi_handle handle;
 
@@ -1959,20 +1959,17 @@ static inline void acpi_ec_query_exit(vo
 int __init acpi_ec_init(void)
 {
int result;
+   int ecdt_fail, dsdt_fail;
 
/* register workqueue for _Qxx evaluations */
result = acpi_ec_query_init();
if (result)
-   goto err_exit;
-   /* Now register the driver for the EC */
-   result = acpi_bus_register_driver(&acpi_ec_driver);
-   if (result)
-   goto err_exit;
+   return result;
 
-err_exit:
-   if (result)
-   acpi_ec_query_exit();
-   return result;
+   /* Drivers must be started after acpi_ec_query_init() */
+   ecdt_fail = acpi_ec_ecdt_start();
+   dsdt_fail = acpi_bus_register_driver(&acpi_ec_driver);
+   return ecdt_fail && dsdt_fail ? -ENODEV : 0;
 }
 
 /* EC driver currently not unloadable */
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -185,7 +185,6 @@ typedef int (*acpi_ec_query_func) (void
 int acpi_ec_init(void);
 int acpi_ec_ecdt_probe(void);
 int acpi_ec_dsdt_probe(void);
-int acpi_ec_ecdt_start(void);
 void acpi_ec_block_transactions(void);
 void acpi_ec_unblock_transactions(void);
 int acpi_ec_add_query_handler(struct acpi_ec *ec, u8 query_bit,
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -2051,7 +2051,6 @@ int __init acpi_scan_init(void)
 
acpi_gpe_apply_masked_gpes();
acpi_update_all_gpes();
-   acpi_ec_ecdt_start();
 
acpi_scan_initialized = true;
 




[PATCH 4.9 55/84] ftrace: Check for null ret_stack on profile function graph entry function

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Steven Rostedt (VMware) 

commit a8f0f9e49956a74718874b800251455680085600 upstream.

There's a small race between function graph shutting down and the calling of the
registered function graph entry callback. The callback must not reference
the task's ret_stack without first checking that it is not NULL. Note, when
a ret_stack is allocated for a task, it stays allocated until the task exits.
The problem here is that function_graph is shut down, and a new task was
created, which doesn't have its ret_stack allocated. But since some of the
functions are still being traced, the callbacks can still be called.

The normal function_graph code handles this, but starting with commit
8861dd303c ("ftrace: Access ret_stack->subtime only in the function
profiler") the profiler code references the ret_stack on function entry, but
doesn't check if it is NULL first.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196611

Fixes: 8861dd303c ("ftrace: Access ret_stack->subtime only in the function 
profiler")
Reported-by: lilyd...@gmail.com
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/trace/ftrace.c |4 
 1 file changed, 4 insertions(+)

--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -876,6 +876,10 @@ static int profile_graph_entry(struct ft
 
function_profile_call(trace->func, 0, NULL, NULL);
 
+   /* If function graph is shutting down, ret_stack can be NULL */
+   if (!current->ret_stack)
+   return 0;
+
if (index >= 0 && index < FTRACE_RETFUNC_DEPTH)
current->ret_stack[index].subtime = 0;
 




[PATCH 4.9 82/84] ACPI / APEI: Add missing synchronize_rcu() on NOTIFY_SCI removal

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: James Morse 

commit 7d64f82cceb21e6d95db312d284f5f195e120154 upstream.

When removing a GHES device notified by SCI, list_del_rcu() is used, so
ghes_remove() should call synchronize_rcu() before it goes on to call
kfree(ghes); otherwise concurrent RCU readers may still hold this list
entry after it has been freed.

Signed-off-by: James Morse 
Reviewed-by: "Huang, Ying" 
Fixes: 81e88fdc432a (ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI 
notification type support)
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/acpi/apei/ghes.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1072,6 +1072,7 @@ static int ghes_remove(struct platform_d
	if (list_empty(&ghes_sci))
		unregister_acpi_hed_notifier(&ghes_notifier_sci);
mutex_unlock(_list_mutex);
+   synchronize_rcu();
break;
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_remove(ghes);
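
For context, a generic sketch of the publish/retire pattern the fix restores
(kernel-style pseudo-module, not the actual GHES code, and not runnable
stand-alone): an element unlinked with list_del_rcu() can still be reached by
concurrent readers until a grace period has elapsed, so synchronize_rcu() must
separate the unlink from the kfree().

#include <linux/list.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct item {
	int val;
	struct list_head node;
};

static LIST_HEAD(items);
static DEFINE_SPINLOCK(items_lock);

/* reader side: may run concurrently with removal */
static bool lookup(int val)
{
	struct item *it;
	bool found = false;

	rcu_read_lock();
	list_for_each_entry_rcu(it, &items, node) {
		if (it->val == val) {
			found = true;
			break;
		}
	}
	rcu_read_unlock();
	return found;
}

/* writer side: unlink, wait for readers, then free */
static void remove_item(struct item *it)
{
	spin_lock(&items_lock);
	list_del_rcu(&it->node);
	spin_unlock(&items_lock);

	synchronize_rcu();	/* readers that saw the entry are now done */
	kfree(it);		/* safe: no reader can still hold 'it' */
}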




[PATCH 4.9 78/84] ntb: no sleep in ntb_async_tx_submit

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Allen Hubbe 

commit 88931ec3dc11e7dbceb3b0df455693873b508fbe upstream.

Do not sleep in ntb_async_tx_submit, which could deadlock.
This reverts commit "8c874cc140d667f84ae4642bb5b5e0d6396d2ca4"

Fixes: 8c874cc140d6 ("NTB: Address out of DMA descriptor issue with NTB")
Reported-by: Jia-Ju Bai 
Signed-off-by: Allen Hubbe 
Acked-by: Dave Jiang 
Signed-off-by: Jon Mason 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/ntb/ntb_transport.c |   50 ++--
 1 file changed, 7 insertions(+), 43 deletions(-)

--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -176,14 +176,12 @@ struct ntb_transport_qp {
u64 rx_err_ver;
u64 rx_memcpy;
u64 rx_async;
-   u64 dma_rx_prep_err;
u64 tx_bytes;
u64 tx_pkts;
u64 tx_ring_full;
u64 tx_err_no_buf;
u64 tx_memcpy;
u64 tx_async;
-   u64 dma_tx_prep_err;
 };
 
 struct ntb_transport_mw {
@@ -256,8 +254,6 @@ enum {
 #define QP_TO_MW(nt, qp)   ((qp) % nt->mw_count)
 #define NTB_QP_DEF_NUM_ENTRIES 100
 #define NTB_LINK_DOWN_TIMEOUT  10
-#define DMA_RETRIES20
-#define DMA_OUT_RESOURCE_TOmsecs_to_jiffies(50)
 
 static void ntb_transport_rxc_db(unsigned long data);
 static const struct ntb_ctx_ops ntb_transport_ops;
@@ -518,12 +514,6 @@ static ssize_t debugfs_read(struct file
out_offset += snprintf(buf + out_offset, out_count - out_offset,
   "free tx - \t%u\n",
   ntb_transport_tx_free_entry(qp));
-   out_offset += snprintf(buf + out_offset, out_count - out_offset,
-  "DMA tx prep err - \t%llu\n",
-  qp->dma_tx_prep_err);
-   out_offset += snprintf(buf + out_offset, out_count - out_offset,
-  "DMA rx prep err - \t%llu\n",
-  qp->dma_rx_prep_err);
 
out_offset += snprintf(buf + out_offset, out_count - out_offset,
   "\n");
@@ -770,8 +760,6 @@ static void ntb_qp_link_down_reset(struc
qp->tx_err_no_buf = 0;
qp->tx_memcpy = 0;
qp->tx_async = 0;
-   qp->dma_tx_prep_err = 0;
-   qp->dma_rx_prep_err = 0;
 }
 
 static void ntb_qp_link_cleanup(struct ntb_transport_qp *qp)
@@ -1314,7 +1302,6 @@ static int ntb_async_rx_submit(struct nt
struct dmaengine_unmap_data *unmap;
dma_cookie_t cookie;
void *buf = entry->buf;
-   int retries = 0;
 
len = entry->len;
device = chan->device;
@@ -1343,22 +1330,11 @@ static int ntb_async_rx_submit(struct nt
 
unmap->from_cnt = 1;
 
-   for (retries = 0; retries < DMA_RETRIES; retries++) {
-   txd = device->device_prep_dma_memcpy(chan,
-unmap->addr[1],
-unmap->addr[0], len,
-DMA_PREP_INTERRUPT);
-   if (txd)
-   break;
-
-   set_current_state(TASK_INTERRUPTIBLE);
-   schedule_timeout(DMA_OUT_RESOURCE_TO);
-   }
-
-   if (!txd) {
-   qp->dma_rx_prep_err++;
+   txd = device->device_prep_dma_memcpy(chan, unmap->addr[1],
+unmap->addr[0], len,
+DMA_PREP_INTERRUPT);
+   if (!txd)
goto err_get_unmap;
-   }
 
txd->callback_result = ntb_rx_copy_callback;
txd->callback_param = entry;
@@ -1603,7 +1579,6 @@ static int ntb_async_tx_submit(struct nt
struct dmaengine_unmap_data *unmap;
dma_addr_t dest;
dma_cookie_t cookie;
-   int retries = 0;
 
device = chan->device;
dest = qp->tx_mw_phys + qp->tx_max_frame * entry->tx_index;
@@ -1625,21 +1600,10 @@ static int ntb_async_tx_submit(struct nt
 
unmap->to_cnt = 1;
 
-   for (retries = 0; retries < DMA_RETRIES; retries++) {
-   txd = device->device_prep_dma_memcpy(chan, dest,
-unmap->addr[0], len,
-DMA_PREP_INTERRUPT);
-   if (txd)
-   break;
-
-   set_current_state(TASK_INTERRUPTIBLE);
-   schedule_timeout(DMA_OUT_RESOURCE_TO);
-   }
-
-   if (!txd) {
-   qp->dma_tx_prep_err++;
+   txd = device->device_prep_dma_memcpy(chan, dest, unmap->addr[0], len,
+DMA_PREP_INTERRUPT);
+   if (!txd)
goto err_get_unmap;
-   }
 
txd->callback_result = 

[PATCH 4.9 81/84] ACPI: ioapic: Clear on-stack resource before using it

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Joerg Roedel 

commit e3d5092b6756b9e0b08f94bbeafcc7afe19f0996 upstream.

The on-stack resource-window 'win' in setup_res() is not
properly initialized. This causes the pointers in the
embedded 'struct resource' to contain stale addresses.

These pointers (in my case the ->child pointer) later get
propagated to the global iomem_resources list, causing a #GP
exception when the list is traversed in
iomem_map_sanity_check().

Fixes: c183619b63ec (x86/irq, ACPI: Implement ACPI driver to support IOAPIC 
hotplug)
Signed-off-by: Joerg Roedel 
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/acpi/ioapic.c |6 ++
 1 file changed, 6 insertions(+)

--- a/drivers/acpi/ioapic.c
+++ b/drivers/acpi/ioapic.c
@@ -45,6 +45,12 @@ static acpi_status setup_res(struct acpi
struct resource *res = data;
struct resource_win win;
 
+   /*
+* We might assign this to 'res' later, make sure all pointers are
+* cleared before the resource is added to the global list
+*/
+   memset(, 0, sizeof(win));
+
res->flags = 0;
if (acpi_dev_filter_resource_type(acpi_res, IORESOURCE_MEM))
return AE_OK;
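
The underlying hazard is easy to demonstrate outside the kernel: an on-stack
struct containing pointers holds whatever the previous stack frame left
behind, so any path that publishes it (here into the global iomem resource
tree) must clear it first. A small hedged user-space sketch (names invented):

#include <stdio.h>
#include <string.h>

struct res {
	const char *name;
	struct res *child;	/* stale garbage here would later be walked */
};

static struct res global;	/* stands in for the global resource list */

static void publish(const struct res *r)
{
	global = *r;		/* copies whatever 'child' happens to contain */
}

int main(void)
{
	struct res win;			/* uninitialized stack memory */

	memset(&win, 0, sizeof(win));	/* the fix: clear before filling in */
	win.name = "window";
	publish(&win);
	printf("child=%p\n", (void *)global.child);	/* NULL, safe */
	return 0;
}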




[PATCH 4.9 51/84] kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Nicholas Piggin 

commit cb87481ee89dbd6609e227afbf64900fb4e5c930 upstream.

The .data and .bss sections were modified in the generic linker script to
pull in sections named .data.<C identifier>, which are generated by gcc with
-ffunction-sections and -fdata-sections options.

The problem with this pattern is it can also match section names that Linux
defines explicitly, e.g., .data.unlikely. This can cause Linux sections to
get moved into the wrong place.

The way to avoid this is to use ".." separators for explicit section names
(the dot character is valid in a section name but not a C identifier).
However currently there are sections which don't follow this rule, so for
now just disable the wild card by default.

Example: http://marc.info/?l=linux-arm-kernel=150106824024221=2

Fixes: b67067f1176df ("kbuild: allow archs to select link dead code/data 
elimination")
Signed-off-by: Nicholas Piggin 
Signed-off-by: Masahiro Yamada 
Signed-off-by: Greg Kroah-Hartman 

---
 include/asm-generic/vmlinux.lds.h |   38 ++
 1 file changed, 26 insertions(+), 12 deletions(-)

--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -60,6 +60,22 @@
 #define ALIGN_FUNCTION()  . = ALIGN(8)
 
 /*
+ * LD_DEAD_CODE_DATA_ELIMINATION option enables -fdata-sections, which
+ * generates .data.identifier sections, which need to be pulled in with
+ * .data. We don't want to pull in .data..other sections, which Linux
+ * has defined. Same for text and bss.
+ */
+#ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
+#define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
+#define DATA_MAIN .data .data.[0-9a-zA-Z_]*
+#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]*
+#else
+#define TEXT_MAIN .text
+#define DATA_MAIN .data
+#define BSS_MAIN .bss
+#endif
+
+/*
  * Align to a 32 byte boundary equal to the
  * alignment gcc 4.5 uses for a struct
  */
@@ -198,12 +214,9 @@
 
 /*
  * .data section
- * LD_DEAD_CODE_DATA_ELIMINATION option enables -fdata-sections generates
- * .data.identifier which needs to be pulled in with .data, but don't want to
- * pull in .data..stuff which has its own requirements. Same for bss.
  */
 #define DATA_DATA  \
-   *(.data .data.[0-9a-zA-Z_]*)\
+   *(DATA_MAIN)\
*(.ref.data)\
*(.data..shared_aligned) /* percpu related */   \
MEM_KEEP(init.data) \
@@ -436,16 +449,17 @@
VMLINUX_SYMBOL(__security_initcall_end) = .;\
}
 
-/* .text section. Map to function alignment to avoid address changes
+/*
+ * .text section. Map to function alignment to avoid address changes
  * during second ld run in second ld pass when generating System.map
- * LD_DEAD_CODE_DATA_ELIMINATION option enables -ffunction-sections generates
- * .text.identifier which needs to be pulled in with .text , but some
- * architectures define .text.foo which is not intended to be pulled in here.
- * Those enabling LD_DEAD_CODE_DATA_ELIMINATION must ensure they don't have
- * conflicting section names, and must pull in .text.[0-9a-zA-Z_]* */
+ *
+ * TEXT_MAIN here will match .text.fixup and .text.unlikely if dead
+ * code elimination is enabled, so these sections should be converted
+ * to use ".." first.
+ */
 #define TEXT_TEXT  \
ALIGN_FUNCTION();   \
-   *(.text.hot .text .text.fixup .text.unlikely)   \
+   *(.text.hot TEXT_MAIN .text.fixup .text.unlikely)   \
*(.ref.text)\
MEM_KEEP(init.text) \
MEM_KEEP(exit.text) \
@@ -613,7 +627,7 @@
BSS_FIRST_SECTIONS  \
*(.bss..page_aligned)   \
*(.dynbss)  \
-   *(.bss .bss.[0-9a-zA-Z_]*)  \
+   *(BSS_MAIN) \
*(COMMON)   \
}
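
To make the naming rule concrete, a small hedged C example (file and section
names invented for illustration): with -fdata-sections the compiler emits
.data.<symbol> sections that the DATA_MAIN wildcard must match, while
kernel-defined sections use the ".." separator so the wildcard cannot swallow
them.

/* build sketch: gcc -c -ffunction-sections -fdata-sections example.c
 * inspect with:  objdump -h example.o
 */
static int counter = 42;	/* -> .data.counter, matched by DATA_MAIN */

/* explicitly named section: the ".." keeps it out of the
 * .data.[0-9a-zA-Z_]* wildcard so the linker script can place it
 * separately (section name made up for this example)
 */
static int special __attribute__((section(".data..special"))) = 1;

int get(void)			/* -> .text.get with -ffunction-sections */
{
	return counter + special;
}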
 




[PATCH 4.9 18/84] net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Huy Nguyen 


[ Upstream commit ca3d89a3ebe79367bd41b6b8ba37664478ae2dba ]

enable_4k_uar module parameter was added in patch cited below to
address the backward compatibility issue in SRIOV when the VM has
system's PAGE_SIZE uar implementation and the Hypervisor has 4k uar
implementation.

The above compatibility issue does not exist in the non SRIOV case.
In this patch, we always enable 4k uar implementation if SRIOV
is not enabled on mlx4's supported cards.

Fixes: 76e39ccf9c36 ("net/mlx4_core: Fix backward compatibility on VFs")
Signed-off-by: Huy Nguyen 
Reviewed-by: Daniel Jurgens 
Signed-off-by: Saeed Mahameed 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/mellanox/mlx4/main.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -430,7 +430,7 @@ static int mlx4_dev_cap(struct mlx4_dev
/* Virtual PCI function needs to determine UAR page size from
 * firmware. Only master PCI function can set the uar page size
 */
-   if (enable_4k_uar)
+   if (enable_4k_uar || !dev->persist->num_vfs)
dev->uar_page_shift = DEFAULT_UAR_PAGE_SHIFT;
else
dev->uar_page_shift = PAGE_SHIFT;
@@ -2269,7 +2269,7 @@ static int mlx4_init_hca(struct mlx4_dev
 
dev->caps.max_fmr_maps = (1 << (32 - 
ilog2(dev->caps.num_mpts))) - 1;
 
-   if (enable_4k_uar) {
+   if (enable_4k_uar || !dev->persist->num_vfs) {
init_hca.log_uar_sz = ilog2(dev->caps.num_uars) +
PAGE_SHIFT - 
DEFAULT_UAR_PAGE_SHIFT;
init_hca.uar_page_sz = DEFAULT_UAR_PAGE_SHIFT - 12;




[PATCH 4.9 19/84] irda: do not leak initialized list.dev to userspace

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Colin Ian King 


[ Upstream commit b024d949a3c24255a7ef1a470420eb478949aa4c ]

list.dev has not been initialized and so the copy_to_user is copying
data from the stack back to user space which is a potential
information leak. Fix this ensuring all of list is initialized to
zero.

Detected by CoverityScan, CID#1357894 ("Uninitialized scalar variable")

Signed-off-by: Colin Ian King 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/irda/af_irda.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -2223,7 +2223,7 @@ static int irda_getsockopt(struct socket
 {
struct sock *sk = sock->sk;
struct irda_sock *self = irda_sk(sk);
-   struct irda_device_list list;
+   struct irda_device_list list = { 0 };
struct irda_device_info *discoveries;
struct irda_ias_set *   ias_opt;/* IAS get/query params */
struct ias_object * ias_obj;/* Object in IAS */
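
The bug class is worth a stand-alone illustration (hedged, user-space stand-in
for copy_to_user; names invented): any field of a stack structure that is not
explicitly written before being copied out carries old stack contents.

#include <stdio.h>

struct device_list {
	unsigned int len;
	unsigned int dev;	/* never assigned by the fill path */
};

static void fill(struct device_list *list)
{
	list->len = 3;		/* only len is set; dev is left as-is */
}

int main(void)
{
	struct device_list leaky;		/* uninitialized stack memory */
	struct device_list safe = { 0 };	/* the fix: zero everything */

	fill(&leaky);
	fill(&safe);
	/* copying 'leaky' to user space would expose whatever 'dev' held */
	printf("safe.dev = %u\n", safe.dev);	/* always 0 */
	return 0;
}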




Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-28 Thread Peter Zijlstra
On Mon, Aug 28, 2017 at 01:19:21PM +0800, Huang, Ying wrote:
> > What do you think about this version?
> >
> 
> Ping.

Thanks, yes that got lost in the inbox :-(

I'll queue it, thanks !


[PATCH 4.9 10/84] ptr_ring: use kmalloc_array()

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Eric Dumazet 


[ Upstream commit 81fbfe8adaf38d4f5a98c19bebfd41c5d6acaee8 ]

As found by syzkaller, malicious users can set whatever tx_queue_len
on a tun device and eventually crash the kernel.

Let's remove the ALIGN(XXX, SMP_CACHE_BYTES) thing since a small
ring buffer is not fast anyway.

Fixes: 2e0ab8ca83c1 ("ptr_ring: array based FIFO for pointers")
Signed-off-by: Eric Dumazet 
Reported-by: Dmitry Vyukov 
Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/ptr_ring.h  |9 +
 include/linux/skb_array.h |3 ++-
 2 files changed, 7 insertions(+), 5 deletions(-)

--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -340,9 +340,9 @@ static inline void *ptr_ring_consume_bh(
__PTR_RING_PEEK_CALL_v; \
 })
 
-static inline void **__ptr_ring_init_queue_alloc(int size, gfp_t gfp)
+static inline void **__ptr_ring_init_queue_alloc(unsigned int size, gfp_t gfp)
 {
-   return kzalloc(ALIGN(size * sizeof(void *), SMP_CACHE_BYTES), gfp);
+   return kcalloc(size, sizeof(void *), gfp);
 }
 
 static inline int ptr_ring_init(struct ptr_ring *r, int size, gfp_t gfp)
@@ -417,7 +417,8 @@ static inline int ptr_ring_resize(struct
  * In particular if you consume ring in interrupt or BH context, you must
  * disable interrupts/BH when doing so.
  */
-static inline int ptr_ring_resize_multiple(struct ptr_ring **rings, int nrings,
+static inline int ptr_ring_resize_multiple(struct ptr_ring **rings,
+  unsigned int nrings,
   int size,
   gfp_t gfp, void (*destroy)(void *))
 {
@@ -425,7 +426,7 @@ static inline int ptr_ring_resize_multip
void ***queues;
int i;
 
-   queues = kmalloc(nrings * sizeof *queues, gfp);
+   queues = kmalloc_array(nrings, sizeof(*queues), gfp);
if (!queues)
goto noqueues;
 
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -162,7 +162,8 @@ static inline int skb_array_resize(struc
 }
 
 static inline int skb_array_resize_multiple(struct skb_array **rings,
-   int nrings, int size, gfp_t gfp)
+   int nrings, unsigned int size,
+   gfp_t gfp)
 {
BUILD_BUG_ON(offsetof(struct skb_array, ring));
return ptr_ring_resize_multiple((struct ptr_ring **)rings,
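
The reason kcalloc()/kmalloc_array() matter here is the multiplication: with
an attacker-controlled element count, size * sizeof(void *) can wrap around
and yield a tiny allocation. A hedged user-space sketch of the overflow check
those helpers perform (not the kernel implementation):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* roughly what kcalloc/kmalloc_array guard against: n * size wrapping */
static void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;			/* would overflow: refuse */
	return malloc(n * size);
}

int main(void)
{
	/* huge element count, e.g. an unchecked tx_queue_len from userspace */
	size_t n = SIZE_MAX / 4;
	void *ok;

	if (!alloc_array(n, sizeof(void *)))
		puts("rejected: n * size would overflow");

	ok = alloc_array(16, sizeof(void *));
	free(ok);
	return 0;
}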




[PATCH 4.9 13/84] sctp: fully initialize the IPv6 address in sctp_v6_to_addr()

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Alexander Potapenko 


[ Upstream commit 15339e441ec46fbc3bf3486bb1ae4845b0f1bb8d ]

KMSAN reported use of uninitialized sctp_addr->v4.sin_addr.s_addr and
sctp_addr->v6.sin6_scope_id in sctp_v6_cmp_addr() (see below).
Make sure all fields of an IPv6 address are initialized, which
guarantees that the IPv4 fields are also initialized.

==
 BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
 net/sctp/ipv6.c:517
 CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
 01/01/2011
 Call Trace:
  dump_stack+0x172/0x1c0 lib/dump_stack.c:42
  is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
  kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
  native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
  arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
  arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
  __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
  sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
  sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
  sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
  sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
  inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg net/socket.c:643 [inline]
  SYSC_sendto+0x608/0x710 net/socket.c:1696
  SyS_sendto+0x8a/0xb0 net/socket.c:1664
  entry_SYSCALL_64_fastpath+0x13/0x94
 RIP: 0033:0x44b479
 RSP: 002b:7f6213f21c08 EFLAGS: 0286 ORIG_RAX: 002c
 RAX: ffda RBX: 2000 RCX: 0044b479
 RDX: 0041 RSI: 20edd000 RDI: 0006
 RBP: 007080a8 R08: 20b85fe4 R09: 001c
 R10: 00040005 R11: 0286 R12: 
 R13: 3760 R14: 006e5820 R15: 00ff8000
 origin description: dst_saddr@sctp_v6_get_dst
 local variable created at:
  sk_fullsock include/net/sock.h:2321 [inline]
  inet6_sk include/linux/ipv6.h:309 [inline]
  sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
==
 BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
 net/sctp/ipv6.c:517
 CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
 01/01/2011
 Call Trace:
  dump_stack+0x172/0x1c0 lib/dump_stack.c:42
  is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
  kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
  native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
  arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
  arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
  __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
  sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
  sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
  sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
  sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
  inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg net/socket.c:643 [inline]
  SYSC_sendto+0x608/0x710 net/socket.c:1696
  SyS_sendto+0x8a/0xb0 net/socket.c:1664
  entry_SYSCALL_64_fastpath+0x13/0x94
 RIP: 0033:0x44b479
 RSP: 002b:7f6213f21c08 EFLAGS: 0286 ORIG_RAX: 002c
 RAX: ffda RBX: 2000 RCX: 0044b479
 RDX: 0041 RSI: 20edd000 RDI: 0006
 RBP: 007080a8 R08: 20b85fe4 R09: 001c
 R10: 00040005 R11: 0286 R12: 
 R13: 3760 R14: 006e5820 R15: 00ff8000
 origin description: dst_saddr@sctp_v6_get_dst
 local variable created at:
  sk_fullsock include/net/sock.h:2321 [inline]
  inet6_sk include/linux/ipv6.h:309 [inline]
  sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
==

Signed-off-by: Alexander Potapenko 
Reviewed-by: Xin Long 
Acked-by: Marcelo Ricardo Leitner 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/sctp/ipv6.c |2 ++
 1 file changed, 2 insertions(+)

--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -512,7 +512,9 @@ static void sctp_v6_to_addr(union sctp_a
 {
addr->sa.sa_family = AF_INET6;
addr->v6.sin6_port = port;
+   

[PATCH 4.9 41/84] fork: fix incorrect fput of ->exe_file causing use-after-free

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Eric Biggers 

commit 2b7e8665b4ff51c034c55df3cff76518d1a9ee3a upstream.

Commit 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for
write killable") made it possible to kill a forking task while it is
waiting to acquire its ->mmap_sem for write, in dup_mmap().

However, it was overlooked that this introduced an new error path before
a reference is taken on the mm_struct's ->exe_file.  Since the
->exe_file of the new mm_struct was already set to the old ->exe_file by
the memcpy() in dup_mm(), it was possible for the mmput() in the error
path of dup_mm() to drop a reference to ->exe_file which was never
taken.

This caused the struct file to later be freed prematurely.

Fix it by updating mm_init() to NULL out the ->exe_file, in the same
place it clears other things like the list of mmaps.

This bug was found by syzkaller.  It can be reproduced using the
following C program:

#define _GNU_SOURCE
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

static void *mmap_thread(void *_arg)
{
for (;;) {
mmap(NULL, 0x100, PROT_READ,
 MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
}
}

static void *fork_thread(void *_arg)
{
usleep(rand() % 1);
fork();
}

int main(void)
{
fork();
fork();
fork();
for (;;) {
if (fork() == 0) {
pthread_t t;

pthread_create(, NULL, mmap_thread, NULL);
pthread_create(, NULL, fork_thread, NULL);
usleep(rand() % 1);
syscall(__NR_exit_group, 0);
}
wait(NULL);
}
}

No special kernel config options are needed.  It usually causes a NULL
pointer dereference in __remove_shared_vm_struct() during exit, or in
dup_mmap() (which is usually inlined into copy_process()) during fork.
Both are due to a vm_area_struct's ->vm_file being used after it's
already been freed.

Google Bug Id: 64772007

Link: http://lkml.kernel.org/r/20170823211408.31198-1-ebigge...@gmail.com
Fixes: 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for write 
killable")
Signed-off-by: Eric Biggers 
Tested-by: Mark Rutland 
Acked-by: Michal Hocko 
Cc: Dmitry Vyukov 
Cc: Ingo Molnar 
Cc: Konstantin Khlebnikov 
Cc: Oleg Nesterov 
Cc: Peter Zijlstra 
Cc: Vlastimil Babka 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/fork.c |1 +
 1 file changed, 1 insertion(+)

--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -766,6 +766,7 @@ static struct mm_struct *mm_init(struct
mm_init_cpumask(mm);
mm_init_aio(mm);
mm_init_owner(mm, p);
+   RCU_INIT_POINTER(mm->exe_file, NULL);
mmu_notifier_mm_init(mm);
clear_tlb_flush_pending(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS




[PATCH 4.9 37/84] ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Alexey Brodkin 

commit 7d79cee2c6540ea64dd917a14e2fd63d4ac3d3c0 upstream.

It is necessary to explicitly set both SLC_AUX_RGN_START1 and SLC_AUX_RGN_END1
which hold the MSB bits of the physical address of the region start
and end, otherwise the SLC region operation is executed in an unpredictable manner.

Without this patch, SLC flushes on HSDK (IOC disabled) were taking
seconds.

Reported-by: Vladimir Kondratiev 
Signed-off-by: Alexey Brodkin 
Signed-off-by: Vineet Gupta 
[vgupta: PAR40 regs only written if PAE40 exist]
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arc/include/asm/cache.h |2 ++
 arch/arc/mm/cache.c  |   13 +++--
 2 files changed, 13 insertions(+), 2 deletions(-)

--- a/arch/arc/include/asm/cache.h
+++ b/arch/arc/include/asm/cache.h
@@ -89,7 +89,9 @@ extern unsigned long perip_base, perip_e
 #define ARC_REG_SLC_FLUSH  0x904
 #define ARC_REG_SLC_INVALIDATE 0x905
 #define ARC_REG_SLC_RGN_START  0x914
+#define ARC_REG_SLC_RGN_START1 0x915
 #define ARC_REG_SLC_RGN_END0x916
+#define ARC_REG_SLC_RGN_END1   0x917
 
 /* Bit val in SLC_CONTROL */
 #define SLC_CTRL_IM0x040
--- a/arch/arc/mm/cache.c
+++ b/arch/arc/mm/cache.c
@@ -562,6 +562,7 @@ noinline void slc_op(phys_addr_t paddr,
static DEFINE_SPINLOCK(lock);
unsigned long flags;
unsigned int ctrl;
+   phys_addr_t end;
 
spin_lock_irqsave(, flags);
 
@@ -591,8 +592,16 @@ noinline void slc_op(phys_addr_t paddr,
 * END needs to be setup before START (latter triggers the operation)
 * END can't be same as START, so add (l2_line_sz - 1) to sz
 */
-   write_aux_reg(ARC_REG_SLC_RGN_END, (paddr + sz + l2_line_sz - 1));
-   write_aux_reg(ARC_REG_SLC_RGN_START, paddr);
+   end = paddr + sz + l2_line_sz - 1;
+   if (is_pae40_enabled())
+   write_aux_reg(ARC_REG_SLC_RGN_END1, upper_32_bits(end));
+
+   write_aux_reg(ARC_REG_SLC_RGN_END, lower_32_bits(end));
+
+   if (is_pae40_enabled())
+   write_aux_reg(ARC_REG_SLC_RGN_START1, upper_32_bits(paddr));
+
+   write_aux_reg(ARC_REG_SLC_RGN_START, lower_32_bits(paddr));
 
while (read_aux_reg(ARC_REG_SLC_CTRL) & SLC_CTRL_BUSY);
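
For readers unfamiliar with the helpers: upper_32_bits()/lower_32_bits()
simply split a wide (here up to 40-bit, PAE40) physical address so that both
halves of the region registers get programmed. A minimal user-space
equivalent (macros re-declared here purely for illustration):

#include <stdint.h>
#include <stdio.h>

#define lower_32_bits(v)	((uint32_t)((v) & 0xffffffffu))
#define upper_32_bits(v)	((uint32_t)(((uint64_t)(v)) >> 32))

int main(void)
{
	uint64_t paddr = 0x12abcd0000ull;	/* a 40-bit physical address */

	/* END1/START1 take the MSBs, END/START the LSBs; END is set up first */
	printf("RGN_END1/START1 <- 0x%x\n", upper_32_bits(paddr));
	printf("RGN_END/START   <- 0x%x\n", lower_32_bits(paddr));
	return 0;
}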
 




[PATCH 4.9 36/84] ALSA: firewire: fix NULL pointer dereference when releasing uninitialized data of iso-resource

2017-08-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Takashi Sakamoto 

commit 0c264af7be2013266c5b4c644f3f366399ee490a upstream.

When calling 'iso_resource_free()' for uninitialized data, this function
causes a NULL pointer dereference via its 'unit' member. This occurs when
unplugging audio and music units on IEEE 1394 bus at failure of card
registration.

This commit fixes the bug. The bug exists since kernel v4.5.

Fixes: 324540c4e05c ('ALSA: fireface: postpone sound card registration') at 
v4.12
Fixes: 8865a31e0fd8 ('ALSA: firewire-motu: postpone sound card registration') 
at v4.12
Fixes: b610386c8afb ('ALSA: firewire-tascam: deleyed registration of sound 
card') at v4.7
Fixes: 86c8dd7f4da3 ('ALSA: firewire-digi00x: delayed registration of sound 
card') at v4.7
Fixes: 6c29230e2a5f ('ALSA: oxfw: delayed registration of sound card') at v4.7
Fixes: 7d3c1d5901aa ('ALSA: fireworks: delayed registration of sound card') at 
v4.7
Fixes: 04a2c73c97eb ('ALSA: bebob: delayed registration of sound card') at v4.7
Fixes: b59fb1900b4f ('ALSA: dice: postpone card registration') at v4.5
Signed-off-by: Takashi Sakamoto 
Signed-off-by: Takashi Iwai 
Signed-off-by: Greg Kroah-Hartman 

---
 sound/firewire/iso-resources.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/sound/firewire/iso-resources.c
+++ b/sound/firewire/iso-resources.c
@@ -210,9 +210,14 @@ EXPORT_SYMBOL(fw_iso_resources_update);
  */
 void fw_iso_resources_free(struct fw_iso_resources *r)
 {
-   struct fw_card *card = fw_parent_device(r->unit)->card;
+   struct fw_card *card;
int bandwidth, channel;
 
+   /* Not initialized. */
+   if (r->unit == NULL)
+   return;
+   card = fw_parent_device(r->unit)->card;
+
	mutex_lock(&r->mutex);
 
if (r->allocated) {




[patch V3 23/44] x86/percpu: Use static initializer for GDT entry

2017-08-28 Thread Thomas Gleixner
The IDT cleanup is about to remove pack_descriptor(). The GDT setup for the
percpu storage can be achieved with the static initializer as well. Replace
it.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/setup_percpu.c |9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -155,13 +155,10 @@ static void __init pcpup_populate_pte(un
 static inline void setup_percpu_segment(int cpu)
 {
 #ifdef CONFIG_X86_32
-   struct desc_struct gdt;
+   struct desc_struct d = GDT_ENTRY_INIT(0x8092, per_cpu_offset(cpu),
+ 0xF);
 
-   pack_descriptor(&gdt, per_cpu_offset(cpu), 0xF,
-   0x2 | DESCTYPE_S, 0x8);
-   gdt.s = 1;
-   write_gdt_entry(get_cpu_gdt_rw(cpu),
-   GDT_ENTRY_PERCPU, &gdt, DESCTYPE_S);
+   write_gdt_entry(get_cpu_gdt_rw(cpu), GDT_ENTRY_PERCPU, &d, DESCTYPE_S);
 #endif
 }
 




[patch V3 24/44] x86/fpu: Use bitfield accessors for desc_struct

2017-08-28 Thread Thomas Gleixner
desc_struct is a union of u32 fields and bitfields. The access to the u32
fields is done with magic macros.

Convert it to use the bitfields and replace the macro magic with parseable
inline functions.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/math-emu/fpu_entry.c   |   11 -
 arch/x86/math-emu/fpu_system.h  |   48 ++--
 arch/x86/math-emu/get_address.c |   17 +++---
 3 files changed, 51 insertions(+), 25 deletions(-)

--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -147,7 +147,7 @@ void math_emulate(struct math_emu_info *
}
 
code_descriptor = FPU_get_ldt_descriptor(FPU_CS);
-   if (SEG_D_SIZE(code_descriptor)) {
+   if (code_descriptor.d) {
/* The above test may be wrong, the book is not clear */
/* Segmented 32 bit protected mode */
addr_modes.default_mode = SEG32;
@@ -155,11 +155,10 @@ void math_emulate(struct math_emu_info *
/* 16 bit protected mode */
addr_modes.default_mode = PM16;
}
-   FPU_EIP += code_base = SEG_BASE_ADDR(code_descriptor);
-   code_limit = code_base
-   + (SEG_LIMIT(code_descriptor) +
-  1) * SEG_GRANULARITY(code_descriptor)
-   - 1;
+   FPU_EIP += code_base = seg_get_base(&code_descriptor);
+   code_limit = seg_get_limit(&code_descriptor) + 1;
+   code_limit *= seg_get_granularity(&code_descriptor);
+   code_limit += code_base - 1;
	if (code_limit < code_base)
		code_limit = 0xffffffff;
}
--- a/arch/x86/math-emu/fpu_system.h
+++ b/arch/x86/math-emu/fpu_system.h
@@ -34,17 +34,43 @@ static inline struct desc_struct FPU_get
return ret;
 }
 
-#define SEG_D_SIZE(x)  ((x).b & (3 << 21))
-#define SEG_G_BIT(x)   ((x).b & (1 << 23))
-#define SEG_GRANULARITY(x) (((x).b & (1 << 23)) ? 4096 : 1)
-#define SEG_286_MODE(x)((x).b & ( 0xff00 | 0xf | (1 << 
23)))
-#define SEG_BASE_ADDR(s)   (((s).b & 0xff00) \
-| (((s).b & 0xff) << 16) | ((s).a >> 16))
-#define SEG_LIMIT(s)   (((s).b & 0xff) | ((s).a & 0x))
-#define SEG_EXECUTE_ONLY(s)(((s).b & ((1 << 11) | (1 << 9))) == (1 << 11))
-#define SEG_WRITE_PERM(s)  (((s).b & ((1 << 11) | (1 << 9))) == (1 << 9))
-#define SEG_EXPAND_DOWN(s) (((s).b & ((1 << 11) | (1 << 10))) \
-== (1 << 10))
+#define SEG_TYPE_WRITABLE  (1U << 1)
+#define SEG_TYPE_EXPANDS_DOWN  (1U << 2)
+#define SEG_TYPE_EXECUTE   (1U << 3)
+#define SEG_TYPE_EXPAND_MASK   (SEG_TYPE_EXPANDS_DOWN | SEG_TYPE_EXECUTE)
+#define SEG_TYPE_EXECUTE_MASK  (SEG_TYPE_WRITABLE | SEG_TYPE_EXECUTE)
+
+static inline unsigned long seg_get_base(struct desc_struct *d)
+{
+   unsigned long base = (unsigned long)d->base2 << 24;
+
+   return base | ((unsigned long)d->base1 << 16) | d->base0;
+}
+
+static inline unsigned long seg_get_limit(struct desc_struct *d)
+{
+   return ((unsigned long)d->limit << 16) | d->limit0;
+}
+
+static inline unsigned long seg_get_granularity(struct desc_struct *d)
+{
+   return d->g ? 4096 : 1;
+}
+
+static inline bool seg_expands_down(struct desc_struct *d)
+{
+   return (d->type & SEG_TYPE_EXPAND_MASK) == SEG_TYPE_EXPANDS_DOWN;
+}
+
+static inline bool seg_execute_only(struct desc_struct *d)
+{
+   return (d->type & SEG_TYPE_EXECUTE_MASK) == SEG_TYPE_EXECUTE;
+}
+
+static inline bool seg_writable(struct desc_struct *d)
+{
+   return (d->type & SEG_TYPE_EXECUTE_MASK) == SEG_TYPE_WRITABLE;
+}
 
 #define I387   (>thread.fpu.state)
 #define FPU_info   (I387->soft.info)
--- a/arch/x86/math-emu/get_address.c
+++ b/arch/x86/math-emu/get_address.c
@@ -159,17 +159,18 @@ static long pm_address(u_char FPU_modrm,
}
 
descriptor = FPU_get_ldt_descriptor(addr->selector);
-   base_address = SEG_BASE_ADDR(descriptor);
+   base_address = seg_get_base(&descriptor);
	address = base_address + offset;
-   limit = base_address
-   + (SEG_LIMIT(descriptor) + 1) * SEG_GRANULARITY(descriptor) - 1;
+   limit = seg_get_limit(&descriptor) + 1;
+   limit *= seg_get_granularity(&descriptor);
+   limit += base_address - 1;
	if (limit < base_address)
		limit = 0xffffffff;

-   if (SEG_EXPAND_DOWN(descriptor)) {
-   if (SEG_G_BIT(descriptor))
+   if (seg_expands_down(&descriptor)) {
+   if (descriptor.g) {
		seg_top = 0xffffffff;
-   else {
+   } else {
		seg_top = base_address + (1 << 20);
		if (seg_top < base_address)
			seg_top = 0xffffffff;
@@ -182,8 +183,8 @@ static long 
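
A stand-alone illustration of the conversion (hypothetical 32-bit descriptor
layout, not the real desc_struct): the same field can be extracted either with
mask-and-shift macros on a raw word or through named bitfields wrapped in
small inline helpers, which is the style this patch switches the FPU
emulation code to.

#include <stdint.h>
#include <stdio.h>

/* old style: raw word plus a magic mask */
#define SEG_G_BIT(w)	((w) & (1u << 23))

/* new style: named bitfields plus a readable accessor */
struct seg_desc {
	uint32_t limit : 16;
	uint32_t base0 : 7;	/* layout is illustrative only */
	uint32_t g     : 1;
	uint32_t pad   : 8;
};

static inline unsigned long seg_granularity(const struct seg_desc *d)
{
	return d->g ? 4096 : 1;
}

int main(void)
{
	uint32_t raw = 1u << 23;
	struct seg_desc d = { .limit = 0xffff, .g = 1 };

	printf("macro:  %s\n", SEG_G_BIT(raw) ? "4K granularity" : "byte");
	printf("helper: %lu\n", seg_granularity(&d));
	return 0;
}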

[patch V3 41/44] x86/idt: Remove unused functions/inlines

2017-08-28 Thread Thomas Gleixner
The IDT related inlines are no longer used. Remove them.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/desc.h |   36 
 1 file changed, 36 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -390,16 +390,6 @@ static inline void set_desc_limit(struct
desc->limit1 = (limit >> 16) & 0xf;
 }
 
-#ifdef CONFIG_X86_64
-static inline void set_nmi_gate(int gate, void *addr)
-{
-   gate_desc s;
-
-   pack_gate(&s, GATE_INTERRUPT, (unsigned long)addr, 0, 0, __KERNEL_CS);
-   write_idt_entry(debug_idt_table, gate, &s);
-}
-#endif
-
 static inline void _set_gate(int gate, unsigned type, const void *addr,
 unsigned dpl, unsigned ist, unsigned seg)
 {
@@ -437,32 +427,6 @@ static inline void alloc_system_vector(i
set_intr_gate(n, addr); \
} while (0)
 
-/*
- * This routine sets up an interrupt gate at directory privilege level 3.
- */
-static inline void set_system_intr_gate(unsigned int n, void *addr)
-{
-   BUG_ON((unsigned)n > 0xFF);
-   _set_gate(n, GATE_INTERRUPT, addr, 0x3, 0, __KERNEL_CS);
-}
-
-static inline void set_task_gate(unsigned int n, unsigned int gdt_entry)
-{
-   BUG_ON((unsigned)n > 0xFF);
-   _set_gate(n, GATE_TASK, (void *)0, 0, 0, (gdt_entry<<3));
-}
-
-static inline void set_intr_gate_ist(int n, void *addr, unsigned ist)
-{
-   BUG_ON((unsigned)n > 0xFF);
-   _set_gate(n, GATE_INTERRUPT, addr, 0, ist, __KERNEL_CS);
-}
-
-static inline void set_system_intr_gate_ist(int n, void *addr, unsigned ist)
-{
-   BUG_ON((unsigned)n > 0xFF);
-   _set_gate(n, GATE_INTERRUPT, addr, 0x3, ist, __KERNEL_CS);
-}
 
 #ifdef CONFIG_X86_64
 DECLARE_PER_CPU(u32, debug_idt_ctr);




[patch V3 42/44] x86/idt: Deinline setup functions

2017-08-28 Thread Thomas Gleixner
Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/desc.h |   37 ++---
 arch/x86/kernel/idt.c   |   43 ++-
 2 files changed, 36 insertions(+), 44 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -390,44 +390,11 @@ static inline void set_desc_limit(struct
desc->limit1 = (limit >> 16) & 0xf;
 }
 
-static inline void _set_gate(int gate, unsigned type, const void *addr,
-unsigned dpl, unsigned ist, unsigned seg)
-{
-   gate_desc s;
-
-   pack_gate(&s, type, (unsigned long)addr, dpl, ist, seg);
-   /*
-* does not need to be atomic because it is only done once at
-* setup time
-*/
-   write_idt_entry(idt_table, gate, &s);
-}
-
-static inline void set_intr_gate(unsigned int n, const void *addr)
-{
-   BUG_ON(n > 0xFF);
-   _set_gate(n, GATE_INTERRUPT, addr, 0, 0, __KERNEL_CS);
-}
+void set_intr_gate(unsigned int n, const void *addr);
+void alloc_intr_gate(unsigned int n, const void *addr);
 
 extern unsigned long used_vectors[];
 
-static inline void alloc_system_vector(int vector)
-{
-   BUG_ON(vector < FIRST_SYSTEM_VECTOR);
-   if (!test_bit(vector, used_vectors)) {
-   set_bit(vector, used_vectors);
-   } else {
-   BUG();
-   }
-}
-
-#define alloc_intr_gate(n, addr)   \
-   do {\
-   alloc_system_vector(n); \
-   set_intr_gate(n, addr); \
-   } while (0)
-
-
 #ifdef CONFIG_X86_64
 DECLARE_PER_CPU(u32, debug_idt_ctr);
 static inline bool is_debug_idt_enabled(void)
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -212,15 +212,16 @@ static inline void idt_init_desc(gate_de
 #endif
 }
 
-static __init void
-idt_setup_from_table(gate_desc *idt, const struct idt_data *t, int size)
+static void
+idt_setup_from_table(gate_desc *idt, const struct idt_data *t, int size, bool 
sys)
 {
gate_desc desc;
 
for (; size > 0; t++, size--) {
		idt_init_desc(&desc, t);
-   set_bit(t->vector, used_vectors);
		write_idt_entry(idt, t->vector, &desc);
+   if (sys)
+   set_bit(t->vector, used_vectors);
}
 }
 
@@ -233,7 +234,8 @@ idt_setup_from_table(gate_desc *idt, con
  */
 void __init idt_setup_early_traps(void)
 {
-   idt_setup_from_table(idt_table, early_idts, ARRAY_SIZE(early_idts));
+   idt_setup_from_table(idt_table, early_idts, ARRAY_SIZE(early_idts),
+true);
	load_idt(&idt_descr);
 }
 
@@ -242,7 +244,7 @@ void __init idt_setup_early_traps(void)
  */
 void __init idt_setup_traps(void)
 {
-   idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts));
+   idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts), true);
 }
 
 #ifdef CONFIG_X86_64
@@ -259,7 +261,7 @@ void __init idt_setup_traps(void)
 void __init idt_setup_early_pf(void)
 {
idt_setup_from_table(idt_table, early_pf_idts,
-ARRAY_SIZE(early_pf_idts));
+ARRAY_SIZE(early_pf_idts), true);
 }
 
 /**
@@ -267,7 +269,7 @@ void __init idt_setup_early_pf(void)
  */
 void __init idt_setup_ist_traps(void)
 {
-   idt_setup_from_table(idt_table, ist_idts, ARRAY_SIZE(ist_idts));
+   idt_setup_from_table(idt_table, ist_idts, ARRAY_SIZE(ist_idts), true);
 }
 
 /**
@@ -277,7 +279,7 @@ void __init idt_setup_debugidt_traps(voi
 {
	memcpy(&debug_idt_table, &idt_table, IDT_ENTRIES * 16);
 
-   idt_setup_from_table(debug_idt_table, dbg_idts, ARRAY_SIZE(dbg_idts));
+   idt_setup_from_table(debug_idt_table, dbg_idts, ARRAY_SIZE(dbg_idts), 
false);
 }
 #endif
 
@@ -289,7 +291,7 @@ void __init idt_setup_apic_and_irq_gates
int i = FIRST_EXTERNAL_VECTOR;
void *entry;
 
-   idt_setup_from_table(idt_table, apic_idts, ARRAY_SIZE(apic_idts));
+   idt_setup_from_table(idt_table, apic_idts, ARRAY_SIZE(apic_idts), true);
 
for_each_clear_bit_from(i, used_vectors, FIRST_SYSTEM_VECTOR) {
entry = irq_entries_start + 8 * (i - FIRST_EXTERNAL_VECTOR);
@@ -333,3 +335,26 @@ void idt_invalidate(void *addr)
 
	load_idt(&idt);
 }
+
+void set_intr_gate(unsigned int n, const void *addr)
+{
+   struct idt_data data;
+
+   BUG_ON(n > 0xFF);
+
+   memset(&data, 0, sizeof(data));
+   data.vector = n;
+   data.addr   = addr;
+   data.segment= __KERNEL_CS;
+   data.bits.type  = GATE_INTERRUPT;
+   data.bits.p = 1;
+
+   idt_setup_from_table(idt_table, &data, 1, false);
+}
+
+void alloc_intr_gate(unsigned int n, const void *addr)
+{
+   BUG_ON(test_bit(n, used_vectors) || n < FIRST_SYSTEM_VECTOR);
+   set_bit(n, used_vectors);
+   set_intr_gate(n, addr);
+}




[patch V3 44/44] x86/idt: Hide set_intr_gate()

2017-08-28 Thread Thomas Gleixner
set_intr_gate() is an internal function of the IDT code. The only user left
is the KVM code which replaces the pagefault handler eventually.

Provide an explicit update_intr_gate() function and make set_intr_gate()
static. While at it replace the magic number 14 in the KVM code with the
proper trap define.

Signed-off-by: Thomas Gleixner 
Acked-by: Paolo Bonzini 
---
 arch/x86/include/asm/desc.h |2 +-
 arch/x86/kernel/idt.c   |   33 -
 arch/x86/kernel/kvm.c   |2 +-
 3 files changed, 22 insertions(+), 15 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -390,7 +390,7 @@ static inline void set_desc_limit(struct
desc->limit1 = (limit >> 16) & 0xf;
 }
 
-void set_intr_gate(unsigned int n, const void *addr);
+void update_intr_gate(unsigned int n, const void *addr);
 void alloc_intr_gate(unsigned int n, const void *addr);
 
 extern unsigned long used_vectors[];
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -225,6 +225,22 @@ idt_setup_from_table(gate_desc *idt, con
}
 }
 
+static void set_intr_gate(unsigned int n, const void *addr)
+{
+   struct idt_data data;
+
+   BUG_ON(n > 0xFF);
+
+   memset(&data, 0, sizeof(data));
+   data.vector = n;
+   data.addr   = addr;
+   data.segment= __KERNEL_CS;
+   data.bits.type  = GATE_INTERRUPT;
+   data.bits.p = 1;
+
+   idt_setup_from_table(idt_table, &data, 1, false);
+}
+
 /**
  * idt_setup_early_traps - Initialize the idt table with early traps
  *
@@ -336,20 +352,11 @@ void idt_invalidate(void *addr)
	load_idt(&idt);
 }
 
-void set_intr_gate(unsigned int n, const void *addr)
+void __init update_intr_gate(unsigned int n, const void *addr)
 {
-   struct idt_data data;
-
-   BUG_ON(n > 0xFF);
-
-   memset(&data, 0, sizeof(data));
-   data.vector = n;
-   data.addr   = addr;
-   data.segment= __KERNEL_CS;
-   data.bits.type  = GATE_INTERRUPT;
-   data.bits.p = 1;
-
-   idt_setup_from_table(idt_table, &data, 1, false);
+   if (WARN_ON_ONCE(!test_bit(n, used_vectors)))
+   return;
+   set_intr_gate(n, addr);
 }
 
 void alloc_intr_gate(unsigned int n, const void *addr)
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -455,7 +455,7 @@ static int kvm_cpu_down_prepare(unsigned
 
 static void __init kvm_apf_trap_init(void)
 {
-   set_intr_gate(14, async_page_fault);
+   update_intr_gate(X86_TRAP_PF, async_page_fault);
 }
 
 void __init kvm_guest_init(void)




[patch V3 38/44] x86/idt: Move regular trap init to tables

2017-08-28 Thread Thomas Gleixner
Initialize the regular traps with a table.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/desc.h |1 
 arch/x86/kernel/idt.c   |   51 
 arch/x86/kernel/traps.c |   41 ---
 3 files changed, 53 insertions(+), 40 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -506,6 +506,7 @@ static inline void load_current_idt(void
 
 extern void idt_setup_early_handler(void);
 extern void idt_setup_early_traps(void);
+extern void idt_setup_traps(void);
 
 #ifdef CONFIG_X86_64
 extern void idt_setup_early_pf(void);
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -60,6 +60,49 @@ static const __initdata struct idt_data
 #endif
 };
 
+/*
+ * The default IDT entries which are set up in trap_init() before
+ * cpu_init() is invoked. Interrupt stacks cannot be used at that point and
+ * the traps which use them are reinitialized with IST after cpu_init() has
+ * set up TSS.
+ */
+static const __initdata struct idt_data def_idts[] = {
+   INTG(X86_TRAP_DE,   divide_error),
+   INTG(X86_TRAP_NMI,  nmi),
+   INTG(X86_TRAP_BR,   bounds),
+   INTG(X86_TRAP_UD,   invalid_op),
+   INTG(X86_TRAP_NM,   device_not_available),
+   INTG(X86_TRAP_OLD_MF,   coprocessor_segment_overrun),
+   INTG(X86_TRAP_TS,   invalid_TSS),
+   INTG(X86_TRAP_NP,   segment_not_present),
+   INTG(X86_TRAP_SS,   stack_segment),
+   INTG(X86_TRAP_GP,   general_protection),
+   INTG(X86_TRAP_SPURIOUS, spurious_interrupt_bug),
+   INTG(X86_TRAP_MF,   coprocessor_error),
+   INTG(X86_TRAP_AC,   alignment_check),
+   INTG(X86_TRAP_XF,   simd_coprocessor_error),
+
+#ifdef CONFIG_X86_32
+   TSKG(X86_TRAP_DF,   GDT_ENTRY_DOUBLEFAULT_TSS),
+#else
+   INTG(X86_TRAP_DF,   double_fault),
+#endif
+   INTG(X86_TRAP_DB,   debug),
+   INTG(X86_TRAP_NMI,  nmi),
+   INTG(X86_TRAP_BP,   int3),
+
+#ifdef CONFIG_X86_MCE
+   INTG(X86_TRAP_MC,   &machine_check),
+#endif
+
+   SYSG(X86_TRAP_OF,   overflow),
+#if defined(CONFIG_IA32_EMULATION)
+   SYSG(IA32_SYSCALL_VECTOR,   entry_INT80_compat),
+#elif defined(CONFIG_X86_32)
+   SYSG(IA32_SYSCALL_VECTOR,   entry_INT80_32),
+#endif
+};
+
 #ifdef CONFIG_X86_64
 /*
  * Early traps running on the DEFAULT_STACK because the other interrupt
@@ -154,6 +197,14 @@ void __init idt_setup_early_traps(void)
load_idt(&idt_descr);
 }
 
+/**
+ * idt_setup_traps - Initialize the idt table with default traps
+ */
+void __init idt_setup_traps(void)
+{
+   idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts));
+}
+
 #ifdef CONFIG_X86_64
 /**
  * idt_setup_early_pf - Initialize the idt table with early pagefault handler
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -925,46 +925,7 @@ dotraplinkage void do_iret_error(struct
 
 void __init trap_init(void)
 {
-   int i;
-
-   set_intr_gate(X86_TRAP_DE, divide_error);
-   set_intr_gate_ist(X86_TRAP_NMI, &nmi, NMI_STACK);
-   /* int4 can be called from all */
-   set_system_intr_gate(X86_TRAP_OF, &overflow);
-   set_intr_gate(X86_TRAP_BR, bounds);
-   set_intr_gate(X86_TRAP_UD, invalid_op);
-   set_intr_gate(X86_TRAP_NM, device_not_available);
-#ifdef CONFIG_X86_32
-   set_task_gate(X86_TRAP_DF, GDT_ENTRY_DOUBLEFAULT_TSS);
-#else
-   set_intr_gate_ist(X86_TRAP_DF, &double_fault, DOUBLEFAULT_STACK);
-#endif
-   set_intr_gate(X86_TRAP_OLD_MF, coprocessor_segment_overrun);
-   set_intr_gate(X86_TRAP_TS, invalid_TSS);
-   set_intr_gate(X86_TRAP_NP, segment_not_present);
-   set_intr_gate(X86_TRAP_SS, stack_segment);
-   set_intr_gate(X86_TRAP_GP, general_protection);
-   set_intr_gate(X86_TRAP_SPURIOUS, spurious_interrupt_bug);
-   set_intr_gate(X86_TRAP_MF, coprocessor_error);
-   set_intr_gate(X86_TRAP_AC, alignment_check);
-#ifdef CONFIG_X86_MCE
-   set_intr_gate_ist(X86_TRAP_MC, &machine_check, MCE_STACK);
-#endif
-   set_intr_gate(X86_TRAP_XF, simd_coprocessor_error);
-
-   /* Reserve all the builtin and the syscall vector: */
-   for (i = 0; i < FIRST_EXTERNAL_VECTOR; i++)
-   set_bit(i, used_vectors);
-
-#ifdef CONFIG_IA32_EMULATION
-   set_system_intr_gate(IA32_SYSCALL_VECTOR, entry_INT80_compat);
-   set_bit(IA32_SYSCALL_VECTOR, used_vectors);
-#endif
-
-#ifdef CONFIG_X86_32
-   set_system_intr_gate(IA32_SYSCALL_VECTOR, entry_INT80_32);
-   set_bit(IA32_SYSCALL_VECTOR, used_vectors);
-#endif
+   idt_setup_traps();
 
/*
 * Set the IDT descriptor to a fixed read-only location, so that the
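
[ Note: the INTG()/SYSG()/TSKG() initializers used in the def_idts table
  above are defined by an earlier patch in this series which is not
  reproduced here. For orientation, they expand roughly as follows; this is
  a sketch based on the upstream result of the series, not the literal V3
  patch text:

	struct idt_data {
		unsigned int	vector;
		unsigned int	segment;
		struct idt_bits	bits;
		const void	*addr;
	};

	#define G(_vector, _addr, _ist, _type, _dpl, _segment)	\
		{						\
			.vector		= _vector,		\
			.bits.ist	= _ist,			\
			.bits.type	= _type,		\
			.bits.dpl	= _dpl,			\
			.bits.p		= 1,			\
			.addr		= _addr,		\
			.segment	= _segment,		\
		}

	/* Interrupt gate: DPL0, no interrupt stack */
	#define INTG(_vector, _addr)	\
		G(_vector, _addr, DEFAULT_STACK, GATE_INTERRUPT, DPL0, __KERNEL_CS)

	/* System interrupt gate: like INTG, but callable from user space (DPL3) */
	#define SYSG(_vector, _addr)	\
		G(_vector, _addr, DEFAULT_STACK, GATE_INTERRUPT, DPL3, __KERNEL_CS)

	/* Task gate: only used for the 32bit double fault TSS */
	#define TSKG(_vector, _gdt)	\
		G(_vector, NULL, DEFAULT_STACK, GATE_TASK, DPL0, _gdt << 3)

  idt_setup_from_table() then walks such an array and packs each idt_data
  entry into a real gate descriptor. ]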




[patch V3 18/44] x86/tracing: Disentangle pagefault and resched IPI tracing key

2017-08-28 Thread Thomas Gleixner
The pagefault and the resched IPI handler are the only ones where it is
worth to optimize the code further in case tracepoints are disabled. But it
makes no sense to have a single static key for both.

Separate the static keys so the facilities are handled separately.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/trace/common.h  |   15 ---
 arch/x86/include/asm/trace/exceptions.h  |6 --
 arch/x86/include/asm/trace/irq_vectors.h |   29 +++--
 arch/x86/kernel/smp.c|2 +-
 arch/x86/kernel/tracepoint.c |   27 ++-
 arch/x86/mm/fault.c  |2 +-
 6 files changed, 59 insertions(+), 22 deletions(-)

--- a/arch/x86/include/asm/trace/common.h
+++ b/arch/x86/include/asm/trace/common.h
@@ -1,15 +1,16 @@
 #ifndef _ASM_TRACE_COMMON_H
 #define _ASM_TRACE_COMMON_H
 
-extern int trace_irq_vector_regfunc(void);
-extern void trace_irq_vector_unregfunc(void);
-
 #ifdef CONFIG_TRACING
-DECLARE_STATIC_KEY_FALSE(trace_irqvectors_key);
-#define trace_irqvectors_enabled() \
-   static_branch_unlikely(&trace_irqvectors_key)
+DECLARE_STATIC_KEY_FALSE(trace_pagefault_key);
+#define trace_pagefault_enabled()  \
+   static_branch_unlikely(&trace_pagefault_key)
+DECLARE_STATIC_KEY_FALSE(trace_resched_ipi_key);
+#define trace_resched_ipi_enabled()\
+   static_branch_unlikely(&trace_resched_ipi_key)
 #else
-static inline bool trace_irqvectors_enabled(void) { return false; }
+static inline bool trace_pagefault_enabled(void) { return false; }
+static inline bool trace_resched_ipi_enabled(void) { return false; }
 #endif
 
 #endif
--- a/arch/x86/include/asm/trace/exceptions.h
+++ b/arch/x86/include/asm/trace/exceptions.h
@@ -7,6 +7,9 @@
 #include <linux/tracepoint.h>
 #include <asm/trace/common.h>
 
+extern int trace_pagefault_reg(void);
+extern void trace_pagefault_unreg(void);
+
 DECLARE_EVENT_CLASS(x86_exceptions,
 
TP_PROTO(unsigned long address, struct pt_regs *regs,
@@ -35,8 +38,7 @@ DEFINE_EVENT_FN(x86_exceptions, name,
TP_PROTO(unsigned long address, struct pt_regs *regs,   \
 unsigned long error_code), \
TP_ARGS(address, regs, error_code), \
-   trace_irq_vector_regfunc,   \
-   trace_irq_vector_unregfunc);
+   trace_pagefault_reg, trace_pagefault_unreg);
 
 DEFINE_PAGE_FAULT_EVENT(page_fault_user);
 DEFINE_PAGE_FAULT_EVENT(page_fault_kernel);
--- a/arch/x86/include/asm/trace/irq_vectors.h
+++ b/arch/x86/include/asm/trace/irq_vectors.h
@@ -7,6 +7,9 @@
 #include <linux/tracepoint.h>
 #include <asm/trace/common.h>
 
+extern int trace_resched_ipi_reg(void);
+extern void trace_resched_ipi_unreg(void);
+
 DECLARE_EVENT_CLASS(x86_irq_vector,
 
TP_PROTO(int vector),
@@ -26,15 +29,22 @@ DECLARE_EVENT_CLASS(x86_irq_vector,
 #define DEFINE_IRQ_VECTOR_EVENT(name)  \
 DEFINE_EVENT_FN(x86_irq_vector, name##_entry,  \
TP_PROTO(int vector),   \
+   TP_ARGS(vector), NULL, NULL);   \
+DEFINE_EVENT_FN(x86_irq_vector, name##_exit,   \
+   TP_PROTO(int vector),   \
+   TP_ARGS(vector), NULL, NULL);
+
+#define DEFINE_RESCHED_IPI_EVENT(name) \
+DEFINE_EVENT_FN(x86_irq_vector, name##_entry,  \
+   TP_PROTO(int vector),   \
TP_ARGS(vector),\
-   trace_irq_vector_regfunc,   \
-   trace_irq_vector_unregfunc);\
+   trace_resched_ipi_reg,  \
+   trace_resched_ipi_unreg);   \
 DEFINE_EVENT_FN(x86_irq_vector, name##_exit,   \
TP_PROTO(int vector),   \
TP_ARGS(vector),\
-   trace_irq_vector_regfunc,   \
-   trace_irq_vector_unregfunc);
-
+   trace_resched_ipi_reg,  \
+   trace_resched_ipi_unreg);
 
 /*
  * local_timer - called when entering/exiting a local timer interrupt
@@ -43,9 +53,16 @@ DEFINE_EVENT_FN(x86_irq_vector, name##_e
 DEFINE_IRQ_VECTOR_EVENT(local_timer);
 
 /*
+ * The ifdef is required because that tracepoint macro hell emits tracepoint
+ * code in files which include this header even if the tracepoint is not
+ * enabled. Brilliant stuff that.
+ */
+#ifdef CONFIG_SMP
+/*
  * reschedule - called when entering/exiting a reschedule vector handler
  */
-DEFINE_IRQ_VECTOR_EVENT(reschedule);
+DEFINE_RESCHED_IPI_EVENT(reschedule);
+#endif
 
 /*
  * spurious_apic - called when entering/exiting a spurious apic vector handler
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -262,7 +262,7 @@ static void native_stop_other_cpus(int w
ack_APIC_irq();
inc_irq_stat(irq_resched_count);
 
-   if (trace_irqvectors_enabled()) {
+   if (trace_resched_ipi_enabled()) {
/*
 * scheduler_ipi() might call irq_enter() as well, but
 * nested calls 

Re: [PATCH v4] x86/boot/KASLR: exclude EFI_BOOT_SERVICES_{CODE|DATA} from KASLR's choice

2017-08-28 Thread Baoquan He
Hi Naoya,

Thanks for this fix. I saw that NEC had reported a bug against RHEL
previously, and that bug can truly corrupt the OS; it is fixed by this patch.

This patch looks good to me, just one small concern; please see the inline
comment below.

On 08/24/17 at 07:33pm, Naoya Horiguchi wrote:
> KASLR chooses kernel location from E820_TYPE_RAM regions by walking over
> e820 entries now. E820_TYPE_RAM includes EFI_BOOT_SERVICES_CODE and
> EFI_BOOT_SERVICES_DATA, so those regions can be the target. According to
> UEFI spec, all memory regions marked as EfiBootServicesCode and
> EfiBootServicesData are available for free memory after the first call
> of ExitBootServices(). So such regions should be usable for kernel on
> spec basis.
> 
> In x86, however, we have some workaround for broken firmware, where we
> keep such regions reserved until SetVirtualAddressMap() is done.
> See the following code in should_map_region():
> 
>   static bool should_map_region(efi_memory_desc_t *md)
>   {
>   ...
>   /*
>* Map boot services regions as a workaround for buggy
>* firmware that accesses them even when they shouldn't.
>*
>* See efi_{reserve,free}_boot_services().
>*/
>   if (md->type == EFI_BOOT_SERVICES_CODE ||
>   md->type == EFI_BOOT_SERVICES_DATA)
>   return false;
> 
> This workaround suppressed a boot crash, but potential issues still
> remain because no one prevents the regions from overlapping with kernel
> image by KASLR.
> 
> So let's make sure that EFI_BOOT_SERVICES_{CODE|DATA} regions are never
> chosen as kernel memory for the workaround to work fine. Furthermore,
> we choose kernel address only from EFI_CONVENTIONAL_MEMORY because it's
> the only memory type we know to be free.

Here, I think it's better to present why EFI_CONVENTIONAL_MEMORY is the
only memory type we should choose. It is already clear to us why
EFI_BOOT_SERVICES_xxx is not good, but it might be worth saying something
about EFI_LOADER_xxx and why it's not OK to choose. Maybe one sentence to
mention it, taking pgd as an example as Matt said.

Thanks
Baoquan

> v3 -> v4:
> - update comment and patch description to mention why only
>   EFI_CONVENTIONAL_MEMORY is chosen.
> - use efi_early_memdesc_ptr()
> - I decided not to post cleanup patches (patch 2/2 in previous series)
>   because it's not necessary to fix the issue.
> 
> v2 -> v3:
> - skip EFI_LOADER_CODE and EFI_LOADER_DATA in region scan
> 
> v1 -> v2:
> - switch efi_mirror_found to local variable
> - insert break when EFI_MEMORY_MORE_RELIABLE found
> ---
>  arch/x86/boot/compressed/kaslr.c | 35 ++-
>  1 file changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git tip/x86/boot/arch/x86/boot/compressed/kaslr.c 
> tip/x86/boot_patched/arch/x86/boot/compressed/kaslr.c
> index 7de23bb..ba5e9e5 100644
> --- tip/x86/boot/arch/x86/boot/compressed/kaslr.c
> +++ tip/x86/boot_patched/arch/x86/boot/compressed/kaslr.c
> @@ -597,19 +597,36 @@ process_efi_entries(unsigned long minimum, unsigned 
> long image_size)
>   for (i = 0; i < nr_desc; i++) {
>   md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i);
>   if (md->attribute & EFI_MEMORY_MORE_RELIABLE) {
> - region.start = md->phys_addr;
> - region.size = md->num_pages << EFI_PAGE_SHIFT;
> - process_mem_region(&region, minimum, image_size);
>   efi_mirror_found = true;
> -
> - if (slot_area_index == MAX_SLOT_AREA) {
> - debug_putstr("Aborted EFI scan (slot_areas 
> full)!\n");
> - break;
> - }
> + break;
>   }
>   }
>  
> - return efi_mirror_found;
> + for (i = 0; i < nr_desc; i++) {
> + md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i);
> +
> + /*
> +  * According to spec, EFI_BOOT_SERVICES_{CODE|DATA} are also
> +  * available for kernel image, but we don't include them for
> +  * the workaround for buggy firmware.
> +  * Only EFI_CONVENTIONAL_MEMORY is guaranteed to be free.
> +  */
> + if (md->type != EFI_CONVENTIONAL_MEMORY)
> + continue;
> +
> + if (efi_mirror_found &&
> + !(md->attribute & EFI_MEMORY_MORE_RELIABLE))
> + continue;
> +
> + region.start = md->phys_addr;
> + region.size = md->num_pages << EFI_PAGE_SHIFT;
> + process_mem_region(&region, minimum, image_size);
> + if (slot_area_index == MAX_SLOT_AREA) {
> + debug_putstr("Aborted EFI scan (slot_areas full)!\n");
> + break;
> + }
> + }
> + return true;
>  }
>  #else
>  static inline bool

[patch V3 21/44] x86/tracing: Build tracepoints only when they are used

2017-08-28 Thread Thomas Gleixner
The tracepoint macro magic emits code for all tracepoints in an event header
file. That code stays around even if the tracepoint is not used at all. The
linker does not discard it.

Build the various irq_vector tracepoints dependent on the appropriate CONFIG
switches.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/trace/irq_vectors.h |   36 ---
 1 file changed, 24 insertions(+), 12 deletions(-)

--- a/arch/x86/include/asm/trace/irq_vectors.h
+++ b/arch/x86/include/asm/trace/irq_vectors.h
@@ -7,6 +7,8 @@
 #include <linux/tracepoint.h>
 #include <asm/trace/common.h>
 
+#ifdef CONFIG_X86_LOCAL_APIC
+
 extern int trace_resched_ipi_reg(void);
 extern void trace_resched_ipi_unreg(void);
 
@@ -53,18 +55,6 @@ DEFINE_EVENT_FN(x86_irq_vector, name##_e
 DEFINE_IRQ_VECTOR_EVENT(local_timer);
 
 /*
- * The ifdef is required because that tracepoint macro hell emits tracepoint
- * code in files which include this header even if the tracepoint is not
- * enabled. Brilliant stuff that.
- */
-#ifdef CONFIG_SMP
-/*
- * reschedule - called when entering/exiting a reschedule vector handler
- */
-DEFINE_RESCHED_IPI_EVENT(reschedule);
-#endif
-
-/*
  * spurious_apic - called when entering/exiting a spurious apic vector handler
  */
 DEFINE_IRQ_VECTOR_EVENT(spurious_apic);
@@ -80,6 +70,7 @@ DEFINE_IRQ_VECTOR_EVENT(error_apic);
  */
 DEFINE_IRQ_VECTOR_EVENT(x86_platform_ipi);
 
+#ifdef CONFIG_IRQ_WORK
 /*
  * irq_work - called when entering/exiting a irq work interrupt
  * vector handler
@@ -96,6 +87,18 @@ DEFINE_IRQ_VECTOR_EVENT(irq_work);
  *  4) goto 1
  */
 TRACE_EVENT_PERF_PERM(irq_work_exit, is_sampling_event(p_event) ? -EPERM : 0);
+#endif
+
+/*
+ * The ifdef is required because that tracepoint macro hell emits tracepoint
+ * code in files which include this header even if the tracepoint is not
+ * enabled. Brilliant stuff that.
+ */
+#ifdef CONFIG_SMP
+/*
+ * reschedule - called when entering/exiting a reschedule vector handler
+ */
+DEFINE_RESCHED_IPI_EVENT(reschedule);
 
 /*
  * call_function - called when entering/exiting a call function interrupt
@@ -108,24 +111,33 @@ DEFINE_IRQ_VECTOR_EVENT(call_function);
  * single interrupt vector handler
  */
 DEFINE_IRQ_VECTOR_EVENT(call_function_single);
+#endif
 
+#ifdef CONFIG_X86_MCE_THRESHOLD
 /*
  * threshold_apic - called when entering/exiting a threshold apic interrupt
  * vector handler
  */
 DEFINE_IRQ_VECTOR_EVENT(threshold_apic);
+#endif
 
+#ifdef CONFIG_X86_MCE_AMD
 /*
  * deferred_error_apic - called when entering/exiting a deferred apic interrupt
  * vector handler
  */
 DEFINE_IRQ_VECTOR_EVENT(deferred_error_apic);
+#endif
 
+#ifdef CONFIG_X86_THERMAL_VECTOR
 /*
  * thermal_apic - called when entering/exiting a thermal apic interrupt
  * vector handler
  */
 DEFINE_IRQ_VECTOR_EVENT(thermal_apic);
+#endif
+
+#endif /* CONFIG_X86_LOCAL_APIC */
 
 #undef TRACE_INCLUDE_PATH
 #define TRACE_INCLUDE_PATH .




[patch V3 22/44] x86/idt: Unify gate_struct handling for 32/64bit

2017-08-28 Thread Thomas Gleixner
The first 32bits of gate struct are the same for 32 and 64 bit. The 32bit
version uses desc_struct and no designated data structure, so we need
different accessors for 32 and 64 bit. Aside of that the macros which are
necessary to build the 32bit gate descriptor are horrible to read.

Unify the gate structs and switch all code fiddling with it over.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/boot/compressed/eboot.c |8 ++---
 arch/x86/include/asm/desc.h  |   45 +-
 arch/x86/include/asm/desc_defs.h |   57 +--
 arch/x86/kvm/vmx.c   |2 -
 arch/x86/xen/enlighten_pv.c  |   12 
 5 files changed, 67 insertions(+), 57 deletions(-)

--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -1058,7 +1058,7 @@ struct boot_params *efi_main(struct efi_
desc->s = DESC_TYPE_CODE_DATA;
desc->dpl = 0;
desc->p = 1;
-   desc->limit = 0xf;
+   desc->limit1 = 0xf;
desc->avl = 0;
desc->l = 0;
desc->d = SEG_OP_SIZE_32BIT;
@@ -1078,7 +1078,7 @@ struct boot_params *efi_main(struct efi_
desc->s = DESC_TYPE_CODE_DATA;
desc->dpl = 0;
desc->p = 1;
-   desc->limit = 0xf;
+   desc->limit1 = 0xf;
desc->avl = 0;
if (IS_ENABLED(CONFIG_X86_64)) {
desc->l = 1;
@@ -1099,7 +1099,7 @@ struct boot_params *efi_main(struct efi_
desc->s = DESC_TYPE_CODE_DATA;
desc->dpl = 0;
desc->p = 1;
-   desc->limit = 0xf;
+   desc->limit1 = 0xf;
desc->avl = 0;
desc->l = 0;
desc->d = SEG_OP_SIZE_32BIT;
@@ -1116,7 +1116,7 @@ struct boot_params *efi_main(struct efi_
desc->s = 0;
desc->dpl = 0;
desc->p = 1;
-   desc->limit = 0x0;
+   desc->limit1 = 0x0;
desc->avl = 0;
desc->l = 0;
desc->d = 0;
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -84,33 +84,25 @@ static inline phys_addr_t get_cpu_gdt_pa
return per_cpu_ptr_to_phys(get_cpu_gdt_rw(cpu));
 }
 
-#ifdef CONFIG_X86_64
-
 static inline void pack_gate(gate_desc *gate, unsigned type, unsigned long 
func,
 unsigned dpl, unsigned ist, unsigned seg)
 {
-   gate->offset_low= PTR_LOW(func);
+   gate->offset_low= (u16) func;
+   gate->bits.p= 1;
+   gate->bits.dpl  = dpl;
+   gate->bits.zero = 0;
+   gate->bits.type = type;
+   gate->offset_middle = (u16) (func >> 16);
+#ifdef CONFIG_X86_64
gate->segment   = __KERNEL_CS;
-   gate->ist   = ist;
-   gate->p = 1;
-   gate->dpl   = dpl;
-   gate->zero0 = 0;
-   gate->zero1 = 0;
-   gate->type  = type;
-   gate->offset_middle = PTR_MIDDLE(func);
-   gate->offset_high   = PTR_HIGH(func);
-}
-
+   gate->bits.ist  = ist;
+   gate->reserved  = 0;
+   gate->offset_high   = (u32) (func >> 32);
 #else
-static inline void pack_gate(gate_desc *gate, unsigned char type,
-unsigned long base, unsigned dpl, unsigned flags,
-unsigned short seg)
-{
-   gate->a = (seg << 16) | (base & 0xffff);
-   gate->b = (base & 0xffff0000) | (((0x80 | type | (dpl << 5)) & 0xff) << 8);
-}
-
+   gate->segment   = seg;
+   gate->bits.ist  = 0;
 #endif
+}
 
 static inline int desc_empty(const void *ptr)
 {
@@ -186,7 +178,8 @@ static inline void pack_descriptor(struc
 }
 
 
-static inline void set_tssldt_descriptor(void *d, unsigned long addr, unsigned 
type, unsigned size)
+static inline void set_tssldt_descriptor(void *d, unsigned long addr,
+unsigned type, unsigned size)
 {
 #ifdef CONFIG_X86_64
struct ldttss_desc64 *desc = d;
@@ -194,13 +187,13 @@ static inline void set_tssldt_descriptor
memset(desc, 0, sizeof(*desc));
 
desc->limit0= size & 0xFFFF;
-   desc->base0 = PTR_LOW(addr);
-   desc->base1 = PTR_MIDDLE(addr) & 0xFF;
+   desc->base0 = (u16) addr;
+   desc->base1 = (addr >> 16) & 0xFF;
desc->type  = type;
desc->p = 1;
desc->limit1= (size >> 16) & 0xF;
-   desc->base2 = (PTR_MIDDLE(addr) >> 8) & 0xFF;
-   desc->base3 = PTR_HIGH(addr);
+   desc->base2 = (addr >> 24) & 0xFF;
+   desc->base3 = (u32) (addr >> 32);
 #else
pack_descriptor((struct desc_struct *)d, addr, size, 0x80 | type, 0);
 #endif
--- a/arch/x86/include/asm/desc_defs.h
+++ 
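
[ Note: the desc_defs.h hunk is truncated at this point in the archive.
  For reference, the unified gate descriptor this patch introduces looks
  roughly like the following in the upstream result (a sketch, not the
  literal V3 patch text):

	struct idt_bits {
		u16		ist	: 3,
				zero	: 5,
				type	: 5,
				dpl	: 2,
				p	: 1;
	} __attribute__((packed));

	struct gate_struct {
		u16		offset_low;
		u16		segment;
		struct idt_bits	bits;
		u16		offset_middle;
	#ifdef CONFIG_X86_64
		u32		offset_high;
		u32		reserved;
	#endif
	} __attribute__((packed));

	typedef struct gate_struct gate_desc;

  The first 32 bits (offset_low/segment) and the bits word are identical on
  32 and 64 bit; only offset_high and the reserved word are 64bit-only,
  which is what allows the single pack_gate() implementation above. ]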

[patch V3 20/44] x86/irq_work: Make it depend on APIC

2017-08-28 Thread Thomas Gleixner
The irq work interrupt vector is only installed when CONFIG_X86_LOCAL_APIC is
enabled, but the interrupt handler is compiled in unconditionally.

Compile the cruft out when the APIC is disabled.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/irq_work.h |8 
 arch/x86/kernel/irq_work.c  |4 ++--
 2 files changed, 10 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/irq_work.h
+++ b/arch/x86/include/asm/irq_work.h
@@ -3,9 +3,17 @@
 
 #include <asm/cpufeature.h>
 
+#ifdef CONFIG_X86_LOCAL_APIC
 static inline bool arch_irq_work_has_interrupt(void)
 {
return boot_cpu_has(X86_FEATURE_APIC);
 }
+extern void arch_irq_work_raise(void);
+#else
+static inline bool arch_irq_work_has_interrupt(void)
+{
+   return false;
+}
+#endif
 
 #endif /* _ASM_IRQ_WORK_H */
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 
+#ifdef CONFIG_X86_LOCAL_APIC
 __visible void __irq_entry smp_irq_work_interrupt(struct pt_regs *regs)
 {
ipi_entering_ack_irq();
@@ -23,11 +24,10 @@
 
 void arch_irq_work_raise(void)
 {
-#ifdef CONFIG_X86_LOCAL_APIC
if (!arch_irq_work_has_interrupt())
return;
 
apic->send_IPI_self(IRQ_WORK_VECTOR);
apic_wait_icr_idle();
-#endif
 }
+#endif




[patch V3 29/44] x86/idt: Move 32bit idt_descr to C code

2017-08-28 Thread Thomas Gleixner
32bit has the idt_descr sitting in the low level assembly entry code. There
is no reason for that. Move it into the C file and use the 64bit version of
it.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/head_32.S |6 --
 arch/x86/kernel/idt.c |   10 +-
 2 files changed, 5 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -622,7 +622,6 @@ ENTRY(initial_stack)
 
.data
 .globl boot_gdt_descr
-.globl idt_descr
 
ALIGN
 # early boot GDT descriptor (must use 1:1 address mapping)
@@ -631,11 +630,6 @@ ENTRY(initial_stack)
.word __BOOT_DS+7
.long boot_gdt - __PAGE_OFFSET
 
-   .word 0 # 32-bit align idt_desc.address
-idt_descr:
-   .word IDT_ENTRIES*8-1   # idt contains 256 entries
-   .long idt_table
-
 # boot GDT descriptor (later on used by CPU#0):
.word 0 # 32 bit align gdt_desc.address
 ENTRY(early_gdt_descr)
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -10,15 +10,15 @@
 /* Must be page-aligned because the real IDT is used in a fixmap. */
 gate_desc idt_table[IDT_ENTRIES] __page_aligned_bss;
 
-#ifdef CONFIG_X86_64
-/* No need to be aligned, but done to keep all IDTs defined the same way. */
-gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
-
 struct desc_ptr idt_descr __ro_after_init = {
-   .size   = IDT_ENTRIES * 16 - 1,
+   .size   = (IDT_ENTRIES * 2 * sizeof(unsigned long)) - 1,
.address= (unsigned long) idt_table,
 };
 
+#ifdef CONFIG_X86_64
+/* No need to be aligned, but done to keep all IDTs defined the same way. */
+gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
+
 const struct desc_ptr debug_idt_descr = {
.size   = IDT_ENTRIES * 16 - 1,
.address= (unsigned long) debug_idt_table,




[patch V3 19/44] x86/ipi: Make platform IPI depend on APIC

2017-08-28 Thread Thomas Gleixner
The platform IPI vector is only installed when the local APIC is enabled. All
users of it depend on the local APIC anyway.

Make the related code conditional on CONFIG_X86_LOCAL_APIC.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/entry_arch.h |3 +--
 arch/x86/kernel/irq.c |   11 ++-
 2 files changed, 7 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -17,8 +17,6 @@ BUILD_INTERRUPT(irq_move_cleanup_interru
 BUILD_INTERRUPT(reboot_interrupt, REBOOT_VECTOR)
 #endif
 
-BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
-
 #ifdef CONFIG_HAVE_KVM
 BUILD_INTERRUPT(kvm_posted_intr_ipi, POSTED_INTR_VECTOR)
 BUILD_INTERRUPT(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR)
@@ -37,6 +35,7 @@ BUILD_INTERRUPT(kvm_posted_intr_nested_i
 BUILD_INTERRUPT(apic_timer_interrupt,LOCAL_TIMER_VECTOR)
 BUILD_INTERRUPT(error_interrupt,ERROR_APIC_VECTOR)
 BUILD_INTERRUPT(spurious_interrupt,SPURIOUS_APIC_VECTOR)
+BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
 
 #ifdef CONFIG_IRQ_WORK
 BUILD_INTERRUPT(irq_work_interrupt, IRQ_WORK_VECTOR)
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -29,9 +29,6 @@ EXPORT_PER_CPU_SYMBOL(irq_regs);
 
 atomic_t irq_err_count;
 
-/* Function pointer for generic interrupt vector handling */
-void (*x86_platform_ipi_callback)(void) = NULL;
-
 /*
  * 'what should we do if we get a hw irq event on an illegal vector'.
  * each architecture has to answer this themselves.
@@ -87,13 +84,13 @@ int arch_show_interrupts(struct seq_file
for_each_online_cpu(j)
seq_printf(p, "%10u ", irq_stats(j)->icr_read_retry_count);
seq_puts(p, "  APIC ICR read retries\n");
-#endif
if (x86_platform_ipi_callback) {
seq_printf(p, "%*s: ", prec, "PLT");
for_each_online_cpu(j)
seq_printf(p, "%10u ", irq_stats(j)->x86_platform_ipis);
seq_puts(p, "  Platform interrupts\n");
}
+#endif
 #ifdef CONFIG_SMP
seq_printf(p, "%*s: ", prec, "RES");
for_each_online_cpu(j)
@@ -183,9 +180,9 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
sum += irq_stats(cpu)->apic_perf_irqs;
sum += irq_stats(cpu)->apic_irq_work_irqs;
sum += irq_stats(cpu)->icr_read_retry_count;
-#endif
if (x86_platform_ipi_callback)
sum += irq_stats(cpu)->x86_platform_ipis;
+#endif
 #ifdef CONFIG_SMP
sum += irq_stats(cpu)->irq_resched_count;
sum += irq_stats(cpu)->irq_call_count;
@@ -259,6 +256,9 @@ u64 arch_irq_stat(void)
return 1;
 }
 
+#ifdef CONFIG_X86_LOCAL_APIC
+/* Function pointer for generic interrupt vector handling */
+void (*x86_platform_ipi_callback)(void) = NULL;
 /*
  * Handler for X86_PLATFORM_IPI_VECTOR.
  */
@@ -275,6 +275,7 @@ u64 arch_irq_stat(void)
exiting_irq();
set_irq_regs(old_regs);
 }
+#endif
 
 #ifdef CONFIG_HAVE_KVM
 static void dummy_handler(void) {}




[patch V3 05/44] x86/boot: Move EISA setup to a proper place

2017-08-28 Thread Thomas Gleixner
EISA has absolutely nothing to do with traps. The EISA bus detection does
not need to run in the very early boot. It's good enough to run it before
the EISA bus and drivers are initialized.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/eisa.c   |   18 ++
 arch/x86/kernel/traps.c  |   13 -
 3 files changed, 19 insertions(+), 13 deletions(-)

--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -111,6 +111,7 @@ obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= parav
 obj-$(CONFIG_PARAVIRT_CLOCK)   += pvclock.o
 obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o
 
+obj-$(CONFIG_EISA) += eisa.o
 obj-$(CONFIG_PCSPKR_PLATFORM)  += pcspeaker.o
 
 obj-$(CONFIG_X86_CHECK_BIOS_CORRUPTION) += check.o
--- /dev/null
+++ b/arch/x86/kernel/eisa.c
@@ -0,0 +1,18 @@
+/*
+ * EISA specific code
+ *
+ * This file is licensed under the GPL V2
+ */
+#include 
+#include 
+
+static __init int eisa_bus_probe(void)
+{
+   void __iomem *p = ioremap(0x0FFFD9, 4);
+
+   if (readl(p) == 'E' + ('I'<<8) + ('S'<<16) + ('A'<<24))
+   EISA_bus = 1;
+   iounmap(p);
+   return 0;
+}
+subsys_initcall(eisa_bus_probe);
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -38,11 +38,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_EISA
-#include <linux/ioport.h>
-#include <linux/eisa.h>
-#endif
-
 #if defined(CONFIG_EDAC)
 #include <linux/edac.h>
 #endif
@@ -969,14 +964,6 @@ void __init trap_init(void)
 {
int i;
 
-#ifdef CONFIG_EISA
-   void __iomem *p = early_ioremap(0x0FFFD9, 4);
-
-   if (readl(p) == 'E' + ('I'<<8) + ('S'<<16) + ('A'<<24))
-   EISA_bus = 1;
-   early_iounmap(p, 4);
-#endif
-
set_intr_gate(X86_TRAP_DE, divide_error);
set_intr_gate_ist(X86_TRAP_NMI, &nmi, NMI_STACK);
/* int4 can be called from all */




[patch V3 06/44] x86/tracing: Introduce a static key for exception tracing

2017-08-28 Thread Thomas Gleixner
Switching the IDT just for avoiding tracepoints creates a complete
impenetrable macro/inline/ifdef mess.

There is no point in avoiding tracepoints for most of the traps/exceptions.
For the more expensive tracepoints, like pagefaults, this can be handled with
an explicit static key.

Preparatory patch to remove the tracing idt.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/trace/common.h  |   15 +++
 arch/x86/include/asm/trace/exceptions.h  |4 +---
 arch/x86/include/asm/trace/irq_vectors.h |4 +---
 arch/x86/kernel/tracepoint.c |9 -
 4 files changed, 25 insertions(+), 7 deletions(-)

--- /dev/null
+++ b/arch/x86/include/asm/trace/common.h
@@ -0,0 +1,15 @@
+#ifndef _ASM_TRACE_COMMON_H
+#define _ASM_TRACE_COMMON_H
+
+extern int trace_irq_vector_regfunc(void);
+extern void trace_irq_vector_unregfunc(void);
+
+#ifdef CONFIG_TRACING
+DECLARE_STATIC_KEY_FALSE(trace_irqvectors_key);
+#define trace_irqvectors_enabled() \
+   static_branch_unlikely(&trace_irqvectors_key)
+#else
+static inline bool trace_irqvectors_enabled(void) { return false; }
+#endif
+
+#endif
--- a/arch/x86/include/asm/trace/exceptions.h
+++ b/arch/x86/include/asm/trace/exceptions.h
@@ -5,9 +5,7 @@
 #define _TRACE_PAGE_FAULT_H
 
 #include <linux/tracepoint.h>
-
-extern int trace_irq_vector_regfunc(void);
-extern void trace_irq_vector_unregfunc(void);
+#include <asm/trace/common.h>
 
 DECLARE_EVENT_CLASS(x86_exceptions,
 
--- a/arch/x86/include/asm/trace/irq_vectors.h
+++ b/arch/x86/include/asm/trace/irq_vectors.h
@@ -5,9 +5,7 @@
 #define _TRACE_IRQ_VECTORS_H
 
 #include <linux/tracepoint.h>
-
-extern int trace_irq_vector_regfunc(void);
-extern void trace_irq_vector_unregfunc(void);
+#include <asm/trace/common.h>
 
 DECLARE_EVENT_CLASS(x86_irq_vector,
 
--- a/arch/x86/kernel/tracepoint.c
+++ b/arch/x86/kernel/tracepoint.c
@@ -4,9 +4,11 @@
  * Copyright (C) 2013 Seiji Aguchi 
  *
  */
+#include 
+#include 
+
 #include 
 #include 
-#include 
 
 atomic_t trace_idt_ctr = ATOMIC_INIT(0);
 struct desc_ptr trace_idt_descr = { NR_VECTORS * 16 - 1,
@@ -15,6 +17,7 @@ struct desc_ptr trace_idt_descr = { NR_V
 /* No need to be aligned, but done to keep all IDTs defined the same way. */
 gate_desc trace_idt_table[NR_VECTORS] __page_aligned_bss;
 
+DEFINE_STATIC_KEY_FALSE(trace_irqvectors_key);
 static int trace_irq_vector_refcount;
 static DEFINE_MUTEX(irq_vector_mutex);
 
@@ -36,6 +39,8 @@ static void switch_idt(void *arg)
 
 int trace_irq_vector_regfunc(void)
 {
+   static_branch_inc(&trace_irqvectors_key);
+
mutex_lock(&irq_vector_mutex);
if (!trace_irq_vector_refcount) {
set_trace_idt_ctr(1);
@@ -49,6 +54,8 @@ int trace_irq_vector_regfunc(void)
 
 void trace_irq_vector_unregfunc(void)
 {
+   static_branch_dec(&trace_irqvectors_key);
+
mutex_lock(&irq_vector_mutex);
trace_irq_vector_refcount--;
if (!trace_irq_vector_refcount) {




[patch V3 04/44] x86/irq: Remove duplicated used_vectors definition

2017-08-28 Thread Thomas Gleixner
Remove the unparseable comment in the other place while at it.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/desc.h |1 -
 arch/x86/include/asm/irq.h  |3 ---
 2 files changed, 4 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -483,7 +483,6 @@ static inline void _set_gate(int gate, u
0, 0, __KERNEL_CS); \
} while (0)
 
-/* used_vectors is BITMAP for irq is not managed by percpu vector_irq */
 extern unsigned long used_vectors[];
 
 static inline void alloc_system_vector(int vector)
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -42,9 +42,6 @@ extern bool handle_irq(struct irq_desc *
 
 extern __visible unsigned int do_IRQ(struct pt_regs *regs);
 
-/* Interrupt vector management */
-extern DECLARE_BITMAP(used_vectors, NR_VECTORS);
-
 extern void init_ISA_irqs(void);
 
 #ifdef CONFIG_X86_LOCAL_APIC




[patch V3 09/44] x86/apic: Use this_cpu_ptr in local_timer_interrupt

2017-08-28 Thread Thomas Gleixner
Accessing the per cpu data via per_cpu(, smp_processor_id()) is
pointless. Use this_cpu_ptr() instead.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/apic/apic.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -988,8 +988,7 @@ void setup_secondary_APIC_clock(void)
  */
 static void local_apic_timer_interrupt(void)
 {
-   int cpu = smp_processor_id();
-   struct clock_event_device *evt = &per_cpu(lapic_events, cpu);
+   struct clock_event_device *evt = this_cpu_ptr(&lapic_events);
 
/*
 * Normally we should not be here till LAPIC has been initialized but
@@ -1003,7 +1002,8 @@ static void local_apic_timer_interrupt(v
 * spurious.
 */
if (!evt->event_handler) {
-   pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu);
+   pr_warning("Spurious LAPIC timer interrupt on cpu %d\n",
+  smp_processor_id());
/* Switch it off */
lapic_timer_shutdown(evt);
return;




[patch V3 01/44] x86/irq: Remove vector_used_by_percpu_irq()

2017-08-28 Thread Thomas Gleixner
Last user (lguest) is gone. Remove it.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/irq.h |1 -
 arch/x86/kernel/irq.c  |2 --
 arch/x86/kernel/irqinit.c  |   12 
 3 files changed, 15 deletions(-)

--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -44,7 +44,6 @@ extern __visible unsigned int do_IRQ(str
 
 /* Interrupt vector management */
 extern DECLARE_BITMAP(used_vectors, NR_VECTORS);
-extern int vector_used_by_percpu_irq(unsigned int vector);
 
 extern void init_ISA_irqs(void);
 
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -346,8 +346,6 @@ EXPORT_SYMBOL_GPL(kvm_set_posted_intr_wa
set_irq_regs(old_regs);
 }
 
-EXPORT_SYMBOL_GPL(vector_used_by_percpu_irq);
-
 #ifdef CONFIG_HOTPLUG_CPU
 
 /* These two declarations are only used in check_irq_vectors_for_cpu_disable()
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -55,18 +55,6 @@ DEFINE_PER_CPU(vector_irq_t, vector_irq)
[0 ... NR_VECTORS - 1] = VECTOR_UNUSED,
 };
 
-int vector_used_by_percpu_irq(unsigned int vector)
-{
-   int cpu;
-
-   for_each_online_cpu(cpu) {
-   if (!IS_ERR_OR_NULL(per_cpu(vector_irq, cpu)[vector]))
-   return 1;
-   }
-
-   return 0;
-}
-
 void __init init_ISA_irqs(void)
 {
struct irq_chip *chip = legacy_pic->chip;




[patch V3 00/44] x86: Cleanup IDT code

2017-08-28 Thread Thomas Gleixner
This is the 3rd iteration of the series. The previous version can be found
here:

  http://lkml.kernel.org/r/20170825214648.264521...@linutronix.de   

The series applies on top of

 git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/apic

and is available as a git branch from

 git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/apic

Changes since V2:

 - Addressed the review comments

 - Addressed the fallout reported by kbuild robot

 - Picked up acks and reviewed by tags

Thanks,

tglx


 arch/x86/boot/compressed/eboot.c |8 
 arch/x86/entry/entry_32.S|   20 -
 arch/x86/entry/entry_64.S|   26 --
 arch/x86/entry/vdso/vma.c|2 
 arch/x86/include/asm/desc.h  |  248 +++-
 arch/x86/include/asm/desc_defs.h |  122 +-
 arch/x86/include/asm/entry_arch.h|   17 -
 arch/x86/include/asm/hw_irq.h|   20 -
 arch/x86/include/asm/irq.h   |4 
 arch/x86/include/asm/irq_work.h  |8 
 arch/x86/include/asm/segment.h   |4 
 arch/x86/include/asm/trace/common.h  |   16 +
 arch/x86/include/asm/trace/exceptions.h  |8 
 arch/x86/include/asm/trace/irq_vectors.h |   51 +++-
 arch/x86/include/asm/traps.h |   10 
 arch/x86/include/asm/xen/hypercall.h |6 
 arch/x86/kernel/Makefile |3 
 arch/x86/kernel/apic/apic.c  |   70 +
 arch/x86/kernel/apic/vector.c|2 
 arch/x86/kernel/cpu/common.c |9 
 arch/x86/kernel/cpu/mcheck/mce_amd.c |   16 -
 arch/x86/kernel/cpu/mcheck/therm_throt.c |   20 -
 arch/x86/kernel/cpu/mcheck/threshold.c   |   16 -
 arch/x86/kernel/cpu/mshyperv.c   |9 
 arch/x86/kernel/eisa.c   |   18 +
 arch/x86/kernel/head32.c |4 
 arch/x86/kernel/head64.c |6 
 arch/x86/kernel/head_32.S|   42 ---
 arch/x86/kernel/idt.c|  367 +++
 arch/x86/kernel/irq.c|   40 +--
 arch/x86/kernel/irq_work.c   |   20 -
 arch/x86/kernel/irqinit.c|  102 
 arch/x86/kernel/kvm.c|4 
 arch/x86/kernel/machine_kexec_32.c   |   14 -
 arch/x86/kernel/reboot.c |4 
 arch/x86/kernel/setup.c  |4 
 arch/x86/kernel/setup_percpu.c   |9 
 arch/x86/kernel/smp.c|   81 +-
 arch/x86/kernel/tls.c|2 
 arch/x86/kernel/tracepoint.c |   57 +---
 arch/x86/kernel/traps.c  |  107 -
 arch/x86/kvm/vmx.c   |2 
 arch/x86/math-emu/fpu_entry.c|   11 
 arch/x86/math-emu/fpu_system.h   |   48 +++-
 arch/x86/math-emu/get_address.c  |   17 -
 arch/x86/mm/fault.c  |   49 +---
 arch/x86/xen/enlighten_pv.c  |   14 -
 drivers/xen/events/events_base.c |6 
 48 files changed, 760 insertions(+), 983 deletions(-)





[PATCH 1/3] clk: ux500: prcmu: constify clk_ops.

2017-08-28 Thread Arvind Yadav
clk_ops are not supposed to change at runtime. All functions
working with clk_ops provided by <linux/clk-provider.h> work
with const clk_ops. So mark the non-const clk_ops as const.

Here, the function "clk_reg_prcmu" is used to initialize clk_init_data.
clk_init_data works with const clk_ops, so make the clk_reg_prcmu
clk_ops argument const as well.

Signed-off-by: Arvind Yadav 
---
 drivers/clk/ux500/clk-prcmu.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/clk/ux500/clk-prcmu.c b/drivers/clk/ux500/clk-prcmu.c
index 7f34382..6e3e16b 100644
--- a/drivers/clk/ux500/clk-prcmu.c
+++ b/drivers/clk/ux500/clk-prcmu.c
@@ -186,7 +186,7 @@ static void clk_prcmu_opp_volt_unprepare(struct clk_hw *hw)
clk->is_prepared = 0;
 }
 
-static struct clk_ops clk_prcmu_scalable_ops = {
+static const struct clk_ops clk_prcmu_scalable_ops = {
.prepare = clk_prcmu_prepare,
.unprepare = clk_prcmu_unprepare,
.is_prepared = clk_prcmu_is_prepared,
@@ -198,7 +198,7 @@ static void clk_prcmu_opp_volt_unprepare(struct clk_hw *hw)
.set_rate = clk_prcmu_set_rate,
 };
 
-static struct clk_ops clk_prcmu_gate_ops = {
+static const struct clk_ops clk_prcmu_gate_ops = {
.prepare = clk_prcmu_prepare,
.unprepare = clk_prcmu_unprepare,
.is_prepared = clk_prcmu_is_prepared,
@@ -208,19 +208,19 @@ static void clk_prcmu_opp_volt_unprepare(struct clk_hw 
*hw)
.recalc_rate = clk_prcmu_recalc_rate,
 };
 
-static struct clk_ops clk_prcmu_scalable_rate_ops = {
+static const struct clk_ops clk_prcmu_scalable_rate_ops = {
.is_enabled = clk_prcmu_is_enabled,
.recalc_rate = clk_prcmu_recalc_rate,
.round_rate = clk_prcmu_round_rate,
.set_rate = clk_prcmu_set_rate,
 };
 
-static struct clk_ops clk_prcmu_rate_ops = {
+static const struct clk_ops clk_prcmu_rate_ops = {
.is_enabled = clk_prcmu_is_enabled,
.recalc_rate = clk_prcmu_recalc_rate,
 };
 
-static struct clk_ops clk_prcmu_opp_gate_ops = {
+static const struct clk_ops clk_prcmu_opp_gate_ops = {
.prepare = clk_prcmu_opp_prepare,
.unprepare = clk_prcmu_opp_unprepare,
.is_prepared = clk_prcmu_is_prepared,
@@ -230,7 +230,7 @@ static void clk_prcmu_opp_volt_unprepare(struct clk_hw *hw)
.recalc_rate = clk_prcmu_recalc_rate,
 };
 
-static struct clk_ops clk_prcmu_opp_volt_scalable_ops = {
+static const struct clk_ops clk_prcmu_opp_volt_scalable_ops = {
.prepare = clk_prcmu_opp_volt_prepare,
.unprepare = clk_prcmu_opp_volt_unprepare,
.is_prepared = clk_prcmu_is_prepared,
@@ -247,7 +247,7 @@ static struct clk *clk_reg_prcmu(const char *name,
 u8 cg_sel,
 unsigned long rate,
 unsigned long flags,
-struct clk_ops *clk_prcmu_ops)
+const struct clk_ops *clk_prcmu_ops)
 {
struct clk_prcmu *clk;
struct clk_init_data clk_prcmu_init;
-- 
1.9.1



Re: connector: Delete an error message for a failed memory allocation in cn_queue_alloc_callback_entry()

2017-08-28 Thread SF Markus Elfring
> Did coccinelle trip on the message

I suggest reconsidering this implementation detail in combination with a
function call like “kzalloc”.
A script for the semantic patch language can point out various update
candidates, based on a source code search pattern similar to “OOM_MESSAGE”
in the script “checkpatch.pl”.


> or the fact you weren't returning NULL?

How does this concern fit to my update suggestion?

Regards,
Markus


Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit

2017-08-28 Thread Nicholas Piggin
On Sun, 27 Aug 2017 22:17:55 -0700
Linus Torvalds  wrote:

> On Sun, Aug 27, 2017 at 6:29 PM, Nicholas Piggin  wrote:
> >
> > BTW. since you are looking at this stuff, one other small problem I remember
> > with exclusive waiters is that losing to a concurrent locker puts them to
> > the back of the queue. I think that could be fixed with some small change to
> > the wait loops (first add to tail, then retries add to head). Thoughts?  
> 
> No, not that way.
> 
> First off, it's oddly complicated, but more importantly, the real
> unfairness you lose to is not other things on the wait queue, but to
> other lockers that aren't on the wait-queue at all, but instead just
> come in and do a "test-and-set" without ever even going through the
> slow path.

Right, there is that unfairness *as well*. The requeue-to-tail logic
seems to make that worse and I thought it seemed like a simple way
to improve it.

> 
> So instead of playing queuing games, you'd need to just change the
> unlock sequence. Right now we basically do:
> 
>  - clear lock bit and atomically test if contended (and we play games
> with bit numbering to do that atomic test efficiently)
> 
>  - if contended, wake things up
> 
> and you'd change the logic to be
> 
>  - if contended, don't clear the lock bit at all, just transfer the
> lock ownership directly to the waiters by walking the wait list
> 
>  - clear the lock bit only once there are no more wait entries (either
> because there were no waiters at all, or because all the entries were
> just waiting for the lock to be released)
> 
> which is certainly doable with a couple of small extensions to the
> page wait key data structure.

Yeah that would be ideal. Conceptually trivial, I guess care has to
be taken with transferring the memory ordering with the lock. Could
be a good concept to apply elsewhere too.

> 
> But most of my clever schemes the last few days were abject failures,
> and honestly, it's late in the rc.
> 
> In fact, this late in the game I probably wouldn't even have committed
> the small cleanups I did if it wasn't for the fact that thinking of
> the whole WQ_FLAG_EXCLUSIVE bit made me find the bug.
> 
> So the cleanups were actually what got me to look at the problem in
> the first place, and then I went "I'm going to commit the cleanup, and
> then I can think about the bug I just found".
> 
> I'm just happy that the fix seems to be trivial. I was afraid I'd have
> to do something nastier (like have the EINTR case send another
> explicit wakeup to make up for the lost one, or some ugly hack like
> that).
> 
> It was only when I started looking at the history of that code, and I
> saw the old bit_lock code, and I went "Hmm. That has the _same_ bug -
> oh wait, no it doesn't!" that I realized that there was that simple
> fix.
> 
> You weren't cc'd on the earlier part of the discussion, you only got
> added when I realized what the history and simple fix was.

You're right, no such improvement would be appropriate for 4.14.

Thanks,
Nick


Re: [PATCH] staging: rtl8723bs: remove memset before memcpy

2017-08-28 Thread Himanshu Jha
On Mon, Aug 28, 2017 at 09:19:06AM +0300, Dan Carpenter wrote:
> On Mon, Aug 28, 2017 at 01:43:31AM +0530, Himanshu Jha wrote:
> > calling memcpy immediately after memset with the same region of memory
> > makes memset redundant.
> > 
> > Build successfully.
> > 
> 
> Thanks for the patch, it looks good.  You don't need to say that it
> builds successfully, because we already assume that's true.
> 
> > Signed-off-by: Himanshu Jha 
> > ---
> 
> Sometimes I put a comment here under the cut off line if I want people
> to know that I haven't tested a patch.
> 
> Anyway, don't resend the patch.  It's fine as-is (unless Greg
> complains) but it's just for future reference.

Thanks for the feedback and I will keep that in mind for future patches.
Himanshu Jha
> 
> regards,
> dan carpenter
> 


Re: mmotm 2017-08-25-15-50 uploaded

2017-08-28 Thread Michal Hocko
On Fri 25-08-17 16:50:26, Randy Dunlap wrote:
> On 08/25/17 15:50, a...@linux-foundation.org wrote:
> > The mm-of-the-moment snapshot 2017-08-25-15-50 has been uploaded to
> > 
> >http://www.ozlabs.org/~akpm/mmotm/
> > 
> > mmotm-readme.txt says
> > 
> > README for mm-of-the-moment:
> > 
> > http://www.ozlabs.org/~akpm/mmotm/
> > 
> > This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
> > more than once a week.
> 
> lots of this one (on x86_64, i386, or UML):
> 
> ../kernel/fork.c:818:2: error: implicit declaration of function 'hmm_mm_init' 
> [-Werror=implicit-function-declaration]
> ../kernel/fork.c:897:2: error: implicit declaration of function 
> 'hmm_mm_destroy' [-Werror=implicit-function-declaration]
> 
> from mm-hmm-heterogeneous-memory-management-hmm-for-short-v5.patch
> 
> Cc: Jérôme Glisse 

This one should address it
---
>From 31d551dbcb1b7987a4cd07767c1e2805849b7a26 Mon Sep 17 00:00:00 2001
From: Michal Hocko 
Date: Mon, 28 Aug 2017 09:41:39 +0200
Subject: [PATCH] 
 mm-hmm-struct-hmm-is-only-use-by-hmm-mirror-functionality-v2-fix

Compiler is complaining for allnoconfig

kernel/fork.c: In function 'mm_init':
kernel/fork.c:814:2: error: implicit declaration of function 'hmm_mm_init' 
[-Werror=implicit-function-declaration]
  hmm_mm_init(mm);
  ^
kernel/fork.c: In function '__mmdrop':
kernel/fork.c:893:2: error: implicit declaration of function 'hmm_mm_destroy' 
[-Werror=implicit-function-declaration]
  hmm_mm_destroy(mm);

Make sure that the hmm_mm_init/hmm_mm_destroy empty stubs are defined when
CONFIG_HMM is disabled.

Signed-off-by: Michal Hocko 
---
 include/linux/hmm.h | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 9583d9a15f9c..aeb94e682dda 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -508,11 +508,10 @@ static inline void hmm_mm_init(struct mm_struct *mm)
 {
mm->hmm = NULL;
 }
-#else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
+#endif
+
+#else /* IS_ENABLED(CONFIG_HMM) */
 static inline void hmm_mm_destroy(struct mm_struct *mm) {}
 static inline void hmm_mm_init(struct mm_struct *mm) {}
-#endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-
-
 #endif /* IS_ENABLED(CONFIG_HMM) */
 #endif /* LINUX_HMM_H */
-- 
2.13.2

-- 
Michal Hocko
SUSE Labs


[PATCH 4.12 17/99] sctp: fully initialize the IPv6 address in sctp_v6_to_addr()

2017-08-28 Thread Greg Kroah-Hartman
4.12-stable review patch.  If anyone has any objections, please let me know.

--

From: Alexander Potapenko 


[ Upstream commit 15339e441ec46fbc3bf3486bb1ae4845b0f1bb8d ]

KMSAN reported use of uninitialized sctp_addr->v4.sin_addr.s_addr and
sctp_addr->v6.sin6_scope_id in sctp_v6_cmp_addr() (see below).
Make sure all fields of an IPv6 address are initialized, which
guarantees that the IPv4 fields are also initialized.

==
 BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
 net/sctp/ipv6.c:517
 CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
 01/01/2011
 Call Trace:
  dump_stack+0x172/0x1c0 lib/dump_stack.c:42
  is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
  kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
  native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
  arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
  arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
  __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
  sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
  sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
  sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
  sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
  inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg net/socket.c:643 [inline]
  SYSC_sendto+0x608/0x710 net/socket.c:1696
  SyS_sendto+0x8a/0xb0 net/socket.c:1664
  entry_SYSCALL_64_fastpath+0x13/0x94
 RIP: 0033:0x44b479
 RSP: 002b:7f6213f21c08 EFLAGS: 0286 ORIG_RAX: 002c
 RAX: ffda RBX: 2000 RCX: 0044b479
 RDX: 0041 RSI: 20edd000 RDI: 0006
 RBP: 007080a8 R08: 20b85fe4 R09: 001c
 R10: 00040005 R11: 0286 R12: 
 R13: 3760 R14: 006e5820 R15: 00ff8000
 origin description: dst_saddr@sctp_v6_get_dst
 local variable created at:
  sk_fullsock include/net/sock.h:2321 [inline]
  inet6_sk include/linux/ipv6.h:309 [inline]
  sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
==
 BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
 net/sctp/ipv6.c:517
 CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
 01/01/2011
 Call Trace:
  dump_stack+0x172/0x1c0 lib/dump_stack.c:42
  is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
  kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
  native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
  arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
  arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
  __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
  sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
  sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
  sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
  sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
  inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg net/socket.c:643 [inline]
  SYSC_sendto+0x608/0x710 net/socket.c:1696
  SyS_sendto+0x8a/0xb0 net/socket.c:1664
  entry_SYSCALL_64_fastpath+0x13/0x94
 RIP: 0033:0x44b479
 RSP: 002b:7f6213f21c08 EFLAGS: 0286 ORIG_RAX: 002c
 RAX: ffda RBX: 2000 RCX: 0044b479
 RDX: 0041 RSI: 20edd000 RDI: 0006
 RBP: 007080a8 R08: 20b85fe4 R09: 001c
 R10: 00040005 R11: 0286 R12: 
 R13: 3760 R14: 006e5820 R15: 00ff8000
 origin description: dst_saddr@sctp_v6_get_dst
 local variable created at:
  sk_fullsock include/net/sock.h:2321 [inline]
  inet6_sk include/linux/ipv6.h:309 [inline]
  sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
==

Signed-off-by: Alexander Potapenko 
Reviewed-by: Xin Long 
Acked-by: Marcelo Ricardo Leitner 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/sctp/ipv6.c |2 ++
 1 file changed, 2 insertions(+)

--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -510,7 +510,9 @@ static void sctp_v6_to_addr(union sctp_a
 {
addr->sa.sa_family = AF_INET6;
addr->v6.sin6_port = port;
+   
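
[ The hunk is cut off above in this archive copy. In the upstream commit
  referenced earlier in this mail (15339e441ec4), the remainder of
  sctp_v6_to_addr() becomes, roughly:

	addr->v6.sin6_flowinfo = 0;
	addr->v6.sin6_addr = *saddr;
	addr->v6.sin6_scope_id = 0;

  i.e. sin6_flowinfo and sin6_scope_id are zeroed so that no field of the
  address union is left uninitialized. ]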

[PATCH 4.12 23/99] irda: do not leak initialized list.dev to userspace

2017-08-28 Thread Greg Kroah-Hartman
4.12-stable review patch.  If anyone has any objections, please let me know.

--

From: Colin Ian King 


[ Upstream commit b024d949a3c24255a7ef1a470420eb478949aa4c ]

list.dev has not been initialized and so the copy_to_user is copying
data from the stack back to user space which is a potential
information leak. Fix this ensuring all of list is initialized to
zero.

Detected by CoverityScan, CID#1357894 ("Uninitialized scalar variable")

Signed-off-by: Colin Ian King 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/irda/af_irda.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -2225,7 +2225,7 @@ static int irda_getsockopt(struct socket
 {
struct sock *sk = sock->sk;
struct irda_sock *self = irda_sk(sk);
-   struct irda_device_list list;
+   struct irda_device_list list = { 0 };
struct irda_device_info *discoveries;
struct irda_ias_set *   ias_opt;/* IAS get/query params */
struct ias_object * ias_obj;/* Object in IAS */




[PATCH 4.12 13/99] ptr_ring: use kmalloc_array()

2017-08-28 Thread Greg Kroah-Hartman
4.12-stable review patch.  If anyone has any objections, please let me know.

--

From: Eric Dumazet 


[ Upstream commit 81fbfe8adaf38d4f5a98c19bebfd41c5d6acaee8 ]

As found by syzkaller, malicious users can set whatever tx_queue_len
on a tun device and eventually crash the kernel.

Lets remove the ALIGN(XXX, SMP_CACHE_BYTES) thing since a small
ring buffer is not fast anyway.

Fixes: 2e0ab8ca83c1 ("ptr_ring: array based FIFO for pointers")
Signed-off-by: Eric Dumazet 
Reported-by: Dmitry Vyukov 
Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/ptr_ring.h  |9 +
 include/linux/skb_array.h |3 ++-
 2 files changed, 7 insertions(+), 5 deletions(-)

--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -371,9 +371,9 @@ static inline void *ptr_ring_consume_bh(
__PTR_RING_PEEK_CALL_v; \
 })
 
-static inline void **__ptr_ring_init_queue_alloc(int size, gfp_t gfp)
+static inline void **__ptr_ring_init_queue_alloc(unsigned int size, gfp_t gfp)
 {
-   return kzalloc(ALIGN(size * sizeof(void *), SMP_CACHE_BYTES), gfp);
+   return kcalloc(size, sizeof(void *), gfp);
 }
 
 static inline void __ptr_ring_set_size(struct ptr_ring *r, int size)
@@ -462,7 +462,8 @@ static inline int ptr_ring_resize(struct
  * In particular if you consume ring in interrupt or BH context, you must
  * disable interrupts/BH when doing so.
  */
-static inline int ptr_ring_resize_multiple(struct ptr_ring **rings, int nrings,
+static inline int ptr_ring_resize_multiple(struct ptr_ring **rings,
+  unsigned int nrings,
   int size,
   gfp_t gfp, void (*destroy)(void *))
 {
@@ -470,7 +471,7 @@ static inline int ptr_ring_resize_multip
void ***queues;
int i;
 
-   queues = kmalloc(nrings * sizeof *queues, gfp);
+   queues = kmalloc_array(nrings, sizeof(*queues), gfp);
if (!queues)
goto noqueues;
 
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -162,7 +162,8 @@ static inline int skb_array_resize(struc
 }
 
 static inline int skb_array_resize_multiple(struct skb_array **rings,
-   int nrings, int size, gfp_t gfp)
+   int nrings, unsigned int size,
+   gfp_t gfp)
 {
BUILD_BUG_ON(offsetof(struct skb_array, ring));
return ptr_ring_resize_multiple((struct ptr_ring **)rings,




[PATCH 4.12 18/99] tipc: fix use-after-free

2017-08-28 Thread Greg Kroah-Hartman
4.12-stable review patch.  If anyone has any objections, please let me know.

--

From: Eric Dumazet 


[ Upstream commit 5bfd37b4de5c98e86b12bd13be5aa46c7484a125 ]

syszkaller reported use-after-free in tipc [1]

When msg->rep skb is freed, set the pointer to NULL,
so that caller does not free it again.

[1]

==
BUG: KASAN: use-after-free in skb_push+0xd4/0xe0 net/core/skbuff.c:1466
Read of size 8 at addr 8801c6e71e90 by task syz-executor5/4115

CPU: 1 PID: 4115 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #32
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 print_address_description+0x73/0x250 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x24e/0x340 mm/kasan/report.c:409
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
 skb_push+0xd4/0xe0 net/core/skbuff.c:1466
 tipc_nl_compat_recv+0x833/0x18f0 net/tipc/netlink_compat.c:1209
 genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
 genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
 netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
 netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
 netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 sock_write_iter+0x31a/0x5d0 net/socket.c:898
 call_write_iter include/linux/fs.h:1743 [inline]
 new_sync_write fs/read_write.c:457 [inline]
 __vfs_write+0x684/0x970 fs/read_write.c:470
 vfs_write+0x189/0x510 fs/read_write.c:518
 SYSC_write fs/read_write.c:565 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:557
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4512e9
RSP: 002b:7f3bc8184c08 EFLAGS: 0216 ORIG_RAX: 0001
RAX: ffda RBX: 00718000 RCX: 004512e9
RDX: 0020 RSI: 20fdb000 RDI: 0006
RBP: 0086 R08:  R09: 
R10:  R11: 0216 R12: 004b5e76
R13: 7f3bc8184b48 R14: 004b5e86 R15: 

Allocated by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
 kmem_cache_alloc_node+0x13d/0x750 mm/slab.c:3651
 __alloc_skb+0xf1/0x740 net/core/skbuff.c:219
 alloc_skb include/linux/skbuff.h:903 [inline]
 tipc_tlv_alloc+0x26/0xb0 net/tipc/netlink_compat.c:148
 tipc_nl_compat_dumpit+0xf2/0x3c0 net/tipc/netlink_compat.c:248
 tipc_nl_compat_handle net/tipc/netlink_compat.c:1130 [inline]
 tipc_nl_compat_recv+0x756/0x18f0 net/tipc/netlink_compat.c:1199
 genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
 genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
 netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
 netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
 netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 sock_write_iter+0x31a/0x5d0 net/socket.c:898
 call_write_iter include/linux/fs.h:1743 [inline]
 new_sync_write fs/read_write.c:457 [inline]
 __vfs_write+0x684/0x970 fs/read_write.c:470
 vfs_write+0x189/0x510 fs/read_write.c:518
 SYSC_write fs/read_write.c:565 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:557
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
 __cache_free mm/slab.c:3503 [inline]
 kmem_cache_free+0x77/0x280 mm/slab.c:3763
 kfree_skbmem+0x1a1/0x1d0 net/core/skbuff.c:622
 __kfree_skb net/core/skbuff.c:682 [inline]
 kfree_skb+0x165/0x4c0 net/core/skbuff.c:699
 tipc_nl_compat_dumpit+0x36a/0x3c0 net/tipc/netlink_compat.c:260
 tipc_nl_compat_handle net/tipc/netlink_compat.c:1130 [inline]
 tipc_nl_compat_recv+0x756/0x18f0 net/tipc/netlink_compat.c:1199
 genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
 genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
 netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
 netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
 netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 
