[PATCH V2 net-next 05/11] net: hns3: refactor interrupt coalescing init function

2018-01-11 Thread Peng Li
From: Fuyun Liang 

The hardware provides three configurable coalescing registers: GL0, GL1
and GL2. In the driver, the TX queues use GL1 and the RX queues use GL0.
The current init function configures interrupt coalescing without
distinguishing between the TX direction and the RX direction, which is
confusing.

This patch refactors the function to initialize the TX GL and the RX GL
separately. It also adds the initialization of the related variables.

Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 29 +
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 32c9f88..59d8d9f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -206,21 +206,32 @@ void hns3_set_vector_coalesce_tx_gl(struct hns3_enet_tqp_vector *tqp_vector,
writel(tx_gl_reg, tqp_vector->mask_addr + HNS3_VECTOR_GL1_OFFSET);
 }
 
-static void hns3_vector_gl_rl_init(struct hns3_enet_tqp_vector *tqp_vector)
+static void hns3_vector_gl_rl_init(struct hns3_enet_tqp_vector *tqp_vector,
+  struct hns3_nic_priv *priv)
 {
+   struct hnae3_handle *h = priv->ae_handle;
+
/* initialize the configuration for interrupt coalescing.
 * 1. GL (Interrupt Gap Limiter)
 * 2. RL (Interrupt Rate Limiter)
 */
 
-   /* Default :enable interrupt coalesce */
-   tqp_vector->rx_group.int_gl = HNS3_INT_GL_50K;
+   /* Default: enable interrupt coalescing self-adaptive and GL */
+   tqp_vector->tx_group.gl_adapt_enable = 1;
+   tqp_vector->rx_group.gl_adapt_enable = 1;
+
tqp_vector->tx_group.int_gl = HNS3_INT_GL_50K;
-   hns3_set_vector_coalesc_gl(tqp_vector, HNS3_INT_GL_50K);
-   /* for now we are disabling Interrupt RL - we
-* will re-enable later
-*/
-   hns3_set_vector_coalesce_rl(tqp_vector, 0);
+   tqp_vector->rx_group.int_gl = HNS3_INT_GL_50K;
+
+   hns3_set_vector_coalesce_tx_gl(tqp_vector,
+  tqp_vector->tx_group.int_gl);
+   hns3_set_vector_coalesce_rx_gl(tqp_vector,
+  tqp_vector->rx_group.int_gl);
+
+   /* Default: disable RL */
+   h->kinfo.int_rl_setting = 0;
+   hns3_set_vector_coalesce_rl(tqp_vector, h->kinfo.int_rl_setting);
+
tqp_vector->rx_group.flow_level = HNS3_FLOW_LOW;
tqp_vector->tx_group.flow_level = HNS3_FLOW_LOW;
 }
@@ -2654,7 +2665,7 @@ static int hns3_nic_init_vector_data(struct hns3_nic_priv *priv)
tqp_vector->rx_group.total_packets = 0;
tqp_vector->tx_group.total_bytes = 0;
tqp_vector->tx_group.total_packets = 0;
-   hns3_vector_gl_rl_init(tqp_vector);
+   hns3_vector_gl_rl_init(tqp_vector, priv);
tqp_vector->handle = h;
 
ret = hns3_get_vector_ring_chain(tqp_vector,
-- 
1.9.1
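
The direction-to-register split described in this patch can be sketched in
userspace C. This is only an illustration under stated assumptions: the
register array, the enum indices and all names below are hypothetical
stand-ins for the memory-mapped registers and the HNS3_VECTOR_GL*_OFFSET
byte offsets, and the usec-to-register halving is taken from the
hns3_gl_usec_to_reg() macro in patch 04/11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices; the real driver uses byte offsets from a
 * mapped I/O address instead of an array.
 */
enum gl_reg { GL0_RX = 0, GL1_TX = 1, GL2_UNUSED = 2, GL_REG_COUNT = 3 };

struct fake_vector {
	uint32_t regs[GL_REG_COUNT]; /* stand-in for the hardware registers */
	uint16_t tx_int_gl;          /* cached TX gap limit, in usec */
	uint16_t rx_int_gl;          /* cached RX gap limit, in usec */
};

/* Same conversion as hns3_gl_usec_to_reg(): one register unit is 2 usec. */
static uint32_t gl_usec_to_reg(uint32_t usec)
{
	return usec >> 1;
}

/* RX queues are programmed through GL0. */
static void set_rx_gl(struct fake_vector *v, uint32_t usec)
{
	v->regs[GL0_RX] = gl_usec_to_reg(usec);
}

/* TX queues are programmed through GL1. */
static void set_tx_gl(struct fake_vector *v, uint32_t usec)
{
	v->regs[GL1_TX] = gl_usec_to_reg(usec);
}

/* Initialize each direction from its own cached value. */
static void vector_gl_init(struct fake_vector *v, uint16_t tx_usec,
			   uint16_t rx_usec)
{
	assert(v != NULL);

	v->tx_int_gl = tx_usec;
	v->rx_int_gl = rx_usec;
	set_tx_gl(v, v->tx_int_gl);
	set_rx_gl(v, v->rx_int_gl);
}
```

Keeping one setter per direction means a later change to the TX value
cannot silently overwrite the RX register, which is the confusion the
commit message refers to.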



[PATCH V2 net-next 06/11] net: hns3: refactor GL update function

2018-01-11 Thread Peng Li
From: Fuyun Liang 

The GL update function takes the larger of tx_int_gl and rx_int_gl and
uses it to set both the new tx_int_gl and the new rx_int_gl. As a result,
the user cannot enable TX GL self-adaptive mode or RX GL self-adaptive
mode individually.

This patch refactors the code to update the TX GL and the RX GL
separately, so that the user can enable TX GL self-adaptive mode or RX GL
self-adaptive mode individually.

Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 35 +++--
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 59d8d9f..2a139ef 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2459,25 +2459,22 @@ static bool hns3_get_new_int_gl(struct hns3_enet_ring_group *ring_group)
 
 static void hns3_update_new_int_gl(struct hns3_enet_tqp_vector *tqp_vector)
 {
-   u16 rx_int_gl, tx_int_gl;
-   bool rx, tx;
-
-   rx = hns3_get_new_int_gl(&tqp_vector->rx_group);
-   tx = hns3_get_new_int_gl(&tqp_vector->tx_group);
-   rx_int_gl = tqp_vector->rx_group.int_gl;
-   tx_int_gl = tqp_vector->tx_group.int_gl;
-   if (rx && tx) {
-   if (rx_int_gl > tx_int_gl) {
-   tqp_vector->tx_group.int_gl = rx_int_gl;
-   tqp_vector->tx_group.flow_level =
-   tqp_vector->rx_group.flow_level;
-   hns3_set_vector_coalesc_gl(tqp_vector, rx_int_gl);
-   } else {
-   tqp_vector->rx_group.int_gl = tx_int_gl;
-   tqp_vector->rx_group.flow_level =
-   tqp_vector->tx_group.flow_level;
-   hns3_set_vector_coalesc_gl(tqp_vector, tx_int_gl);
-   }
+   struct hns3_enet_ring_group *rx_group = &tqp_vector->rx_group;
+   struct hns3_enet_ring_group *tx_group = &tqp_vector->tx_group;
+   bool rx_update, tx_update;
+
+   if (rx_group->gl_adapt_enable) {
+   rx_update = hns3_get_new_int_gl(rx_group);
+   if (rx_update)
+   hns3_set_vector_coalesce_rx_gl(tqp_vector,
+  rx_group->int_gl);
+   }
+
+   if (tx_group->gl_adapt_enable) {
+   tx_update = hns3_get_new_int_gl(&tqp_vector->tx_group);
+   if (tx_update)
+   hns3_set_vector_coalesce_tx_gl(tqp_vector,
+  tx_group->int_gl);
}
 }
 
-- 
1.9.1
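
The per-direction update introduced by this patch can be sketched as
follows. This is a simplified userspace model, not the driver code: the
suggested_gl field and the return bitmask are hypothetical, added only so
the behaviour can be observed without hardware, and get_new_int_gl() is a
stand-in for hns3_get_new_int_gl(), whose real heuristic is not shown in
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ring_group {
	uint16_t int_gl;       /* current gap limit */
	uint16_t suggested_gl; /* hypothetical: value a traffic heuristic proposes */
	bool gl_adapt_enable;  /* self-adaptive mode for this direction */
};

/* Stand-in for hns3_get_new_int_gl(): adopt the suggestion and report
 * whether the value changed.
 */
static bool get_new_int_gl(struct ring_group *g)
{
	if (g->int_gl == g->suggested_gl)
		return false;

	g->int_gl = g->suggested_gl;
	return true;
}

/* Returns a bitmask of the directions that would need a register write:
 * bit 0 = RX, bit 1 = TX. Each direction is handled on its own.
 */
static unsigned int update_new_int_gl(struct ring_group *rx,
				      struct ring_group *tx)
{
	unsigned int written = 0;

	assert(rx != NULL && tx != NULL);

	if (rx->gl_adapt_enable && get_new_int_gl(rx))
		written |= 1u << 0;

	if (tx->gl_adapt_enable && get_new_int_gl(tx))
		written |= 1u << 1;

	return written;
}
```

With adaptive mode enabled only for RX, the TX value stays untouched even
when the heuristic would propose a different one.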



RE: Reply: [f2fs-dev] [PATCH] f2fs: prevent newly created inode from being dirtied incorrectly

2018-01-11 Thread 정대호
Hi Zhikang,

We reproduced the kernel panic by periodically dropping the VFS caches
with the drop caches command.
The point is that a checkpoint has to be triggered right after
f2fs_mark_inode_dirty_sync() is called for a new inode.
We don't have a special test case for that; we just dropped the caches
periodically.
It was not easy to reproduce, but it always occurred within 24 hours.

Thanks,

 
- Original Message -
Sender : zhangzhikang 
Date   : 2018-01-12 16:11 (GMT+9)
Title  : Reply: [f2fs-dev] [PATCH] f2fs: prevent newly created inode from being dirtied incorrectly
 
Hi Daeho:
 
We tried adding msleep(1) after f2fs_mark_inode_dirty_sync() when creating
a new file, and then writing a checkpoint from another thread.
But it didn't cause a kernel panic.
 
So can you tell me which test case you used, and provide the call trace?
 
Thank you!
 
Best regards,
Zhikang Zhang



[PATCH V2 net-next 09/11] net: hns3: add int_gl_idx setup for TX and RX queues

2018-01-11 Thread Peng Li
From: Fuyun Liang 

If int_gl_idx is not set, the default interrupt coalesce index is 0, so
the TX queues and the RX queues both use GL0 as the interrupt coalesce
GL switch. But it should be GL1 for TX queues and GL0 for RX queues.

This patch adds the int_gl_idx setup for TX queues and RX queues.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h |  5 +
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 11 +++
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c |  5 +
 3 files changed, 21 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 0bad0e3..634e932 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -133,11 +133,16 @@ struct hnae3_vector_info {
 #define HNAE3_RING_TYPE_B 0
 #define HNAE3_RING_TYPE_TX 0
 #define HNAE3_RING_TYPE_RX 1
+#define HNAE3_RING_GL_IDX_S 0
+#define HNAE3_RING_GL_IDX_M GENMASK(1, 0)
+#define HNAE3_RING_GL_RX 0
+#define HNAE3_RING_GL_TX 1
 
 struct hnae3_ring_chain_node {
struct hnae3_ring_chain_node *next;
u32 tqp_index;
u32 flag;
+   u32 int_gl_idx;
 };
 
 #define HNAE3_IS_TX_RING(node) \
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 2e9e61c..34879c4 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2523,6 +2523,8 @@ static int hns3_get_vector_ring_chain(struct hns3_enet_tqp_vector *tqp_vector,
cur_chain->tqp_index = tx_ring->tqp->tqp_index;
hnae_set_bit(cur_chain->flag, HNAE3_RING_TYPE_B,
 HNAE3_RING_TYPE_TX);
+   hnae_set_field(cur_chain->int_gl_idx, HNAE3_RING_GL_IDX_M,
+  HNAE3_RING_GL_IDX_S, HNAE3_RING_GL_TX);
 
cur_chain->next = NULL;
 
@@ -2538,6 +2540,10 @@ static int hns3_get_vector_ring_chain(struct hns3_enet_tqp_vector *tqp_vector,
chain->tqp_index = tx_ring->tqp->tqp_index;
hnae_set_bit(chain->flag, HNAE3_RING_TYPE_B,
 HNAE3_RING_TYPE_TX);
+   hnae_set_field(chain->int_gl_idx,
+  HNAE3_RING_GL_IDX_M,
+  HNAE3_RING_GL_IDX_S,
+  HNAE3_RING_GL_TX);
 
cur_chain = chain;
}
@@ -2549,6 +2555,8 @@ static int hns3_get_vector_ring_chain(struct hns3_enet_tqp_vector *tqp_vector,
cur_chain->tqp_index = rx_ring->tqp->tqp_index;
hnae_set_bit(cur_chain->flag, HNAE3_RING_TYPE_B,
 HNAE3_RING_TYPE_RX);
+   hnae_set_field(cur_chain->int_gl_idx, HNAE3_RING_GL_IDX_M,
+  HNAE3_RING_GL_IDX_S, HNAE3_RING_GL_RX);
 
rx_ring = rx_ring->next;
}
@@ -2562,6 +2570,9 @@ static int hns3_get_vector_ring_chain(struct hns3_enet_tqp_vector *tqp_vector,
chain->tqp_index = rx_ring->tqp->tqp_index;
hnae_set_bit(chain->flag, HNAE3_RING_TYPE_B,
 HNAE3_RING_TYPE_RX);
+   hnae_set_field(chain->int_gl_idx, HNAE3_RING_GL_IDX_M,
+  HNAE3_RING_GL_IDX_S, HNAE3_RING_GL_RX);
+
cur_chain = chain;
 
rx_ring = rx_ring->next;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index d7352f5..27f0ab6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3409,6 +3409,11 @@ int hclge_bind_ring_with_vector(struct hclge_vport *vport,
   hnae_get_bit(node->flag, HNAE3_RING_TYPE_B));
hnae_set_field(tqp_type_and_id, HCLGE_TQP_ID_M,
   HCLGE_TQP_ID_S, node->tqp_index);
+   hnae_set_field(tqp_type_and_id, HCLGE_INT_GL_IDX_M,
+  HCLGE_INT_GL_IDX_S,
+  hnae_get_field(node->int_gl_idx,
+ HNAE3_RING_GL_IDX_M,
+ HNAE3_RING_GL_IDX_S));
req->tqp_type_and_id[i] = cpu_to_le16(tqp_type_and_id);
if (++i >= HCLGE_VECTOR_ELEMENTS_PER_CMD) {
req->int_cause_num = HCLGE_VECTOR_ELEMENTS_PER_CMD;
-- 
1.9.1
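
The field packing used for int_gl_idx can be illustrated with a small
userspace sketch. The helpers below are written from scratch and only
modelled on the kernel's GENMASK() and the hnae_set_field()/
hnae_get_field() macros; their exact definitions are an assumption here,
and FIELD_MASK, set_field and get_field are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits hi..lo inclusive. Only valid for fields narrower
 * than 32 bits, which is enough for this two-bit index.
 */
#define FIELD_MASK(hi, lo) \
	((uint32_t)(((1u << ((hi) - (lo) + 1)) - 1u) << (lo)))

/* Mirrors the values added to hnae3.h by this patch. */
#define RING_GL_IDX_S 0
#define RING_GL_IDX_M FIELD_MASK(1, 0)
#define RING_GL_RX 0u
#define RING_GL_TX 1u

/* Clear the field, then store val in it; bits outside mask are kept. */
static uint32_t set_field(uint32_t origin, uint32_t mask, unsigned int shift,
			  uint32_t val)
{
	/* The value must fit inside the field. */
	assert(((val << shift) & ~mask) == 0);

	return (origin & ~mask) | ((val << shift) & mask);
}

/* Extract the field and shift it down to bit 0. */
static uint32_t get_field(uint32_t origin, uint32_t mask, unsigned int shift)
{
	return (origin & mask) >> shift;
}
```

A TX chain node would then carry set_field(0, RING_GL_IDX_M,
RING_GL_IDX_S, RING_GL_TX), and the bind command reads the index back
with get_field() before packing it into the command word.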



[PATCH V2 net-next 00/11] add some new features and fix some bugs

2018-01-11 Thread Peng Li
This patchset adds 3 ethtool features: get_channels,
get_coalesce and set_coalesce, and fixes some bugs.

[patch 1/11] adds ethtool_ops.get_channels (ethtool -l) support
for VF.

[patch 2/11] removes the TSO config command from the VF driver,
as only the main PF can configure the TSO MSS length according
to the hardware.

[patch 3/11 - 4/11] add ethtool_ops {get|set}_coalesce
(ethtool -c/-C) support to PF.
[patch 5/11 - 9/11] fix some bugs related to {get|set}_coalesce.

[patch 10/11 - 11/11] fix the features handling in
hns3_nic_set_features(). The local variable "changed" was defined
to indicate which features changed, but was used only for the feature
NETIF_F_HW_VLAN_CTAG_RX. Add checks to improve the reliability.

---
Change log:
V1 -> V2:
1. Rewrite the cover letter as requested by David Miller.
---

Fuyun Liang (7):
  net: hns3: add ethtool_ops.get_coalesce support to PF
  net: hns3: add ethtool_ops.set_coalesce support to PF
  net: hns3: refactor interrupt coalescing init function
  net: hns3: refactor GL update function
  net: hns3: remove unused GL setup function
  net: hns3: change the unit of GL value macro
  net: hns3: add int_gl_idx setup for TX and RX queues

Jian Shen (2):
  net: hns3: add feature check when feature changed
  net: hns3: check for NULL function pointer in hns3_nic_set_features

Peng Li (2):
  net: hns3: add ethtool_ops.get_channels support for VF
  net: hns3: remove TSO config command from VF driver

 drivers/net/ethernet/hisilicon/hns3/hnae3.h|   7 +
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c| 148 ++---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h|  26 ++-
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 179 +
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c|   5 +
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.h   |   8 -
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  |  50 +++---
 7 files changed, 336 insertions(+), 87 deletions(-)

-- 
1.9.1



[PATCH V2 net-next 03/11] net: hns3: add ethtool_ops.get_coalesce support to PF

2018-01-11 Thread Peng Li
From: Fuyun Liang 

This patch adds ethtool_ops.get_coalesce support to PF.

Whilst our hardware supports per queue values, external interfaces
support only a single shared value. As such we use the values for
queue 0.

Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h|  2 ++
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h|  1 +
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 37 ++
 3 files changed, 40 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index adec88d..0bad0e3 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -448,6 +448,8 @@ struct hnae3_knic_private_info {
u16 num_tqps; /* total number of TQPs in this handle */
struct hnae3_queue **tqp;  /* array base of all TQPs in this instance */
const struct hnae3_dcb_ops *dcb_ops;
+
+   u16 int_rl_setting;
 };
 
 struct hnae3_roce_private_info {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index a2a7ea3..24f6109 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -464,6 +464,7 @@ struct hns3_enet_ring_group {
u16 count;
enum hns3_flow_level_range flow_level;
u16 int_gl;
+   u8 gl_adapt_enable;
 };
 
 struct hns3_enet_tqp_vector {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index f44336c..81b4b3b 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -887,6 +887,42 @@ static void hns3_get_channels(struct net_device *netdev,
h->ae_algo->ops->get_channels(h, ch);
 }
 
+static int hns3_get_coalesce_per_queue(struct net_device *netdev, u32 queue,
+  struct ethtool_coalesce *cmd)
+{
+   struct hns3_enet_tqp_vector *tx_vector, *rx_vector;
+   struct hns3_nic_priv *priv = netdev_priv(netdev);
+   struct hnae3_handle *h = priv->ae_handle;
+   u16 queue_num = h->kinfo.num_tqps;
+
+   if (queue >= queue_num) {
+   netdev_err(netdev,
+  "Invalid queue value %d! Queue max id=%d\n",
+  queue, queue_num - 1);
+   return -EINVAL;
+   }
+
+   tx_vector = priv->ring_data[queue].ring->tqp_vector;
+   rx_vector = priv->ring_data[queue_num + queue].ring->tqp_vector;
+
+   cmd->use_adaptive_tx_coalesce = tx_vector->tx_group.gl_adapt_enable;
+   cmd->use_adaptive_rx_coalesce = rx_vector->rx_group.gl_adapt_enable;
+
+   cmd->tx_coalesce_usecs = tx_vector->tx_group.int_gl;
+   cmd->rx_coalesce_usecs = rx_vector->rx_group.int_gl;
+
+   cmd->tx_coalesce_usecs_high = h->kinfo.int_rl_setting;
+   cmd->rx_coalesce_usecs_high = h->kinfo.int_rl_setting;
+
+   return 0;
+}
+
+static int hns3_get_coalesce(struct net_device *netdev,
+struct ethtool_coalesce *cmd)
+{
+   return hns3_get_coalesce_per_queue(netdev, 0, cmd);
+}
+
 static const struct ethtool_ops hns3vf_ethtool_ops = {
.get_drvinfo = hns3_get_drvinfo,
.get_ringparam = hns3_get_ringparam,
@@ -925,6 +961,7 @@ static void hns3_get_channels(struct net_device *netdev,
.nway_reset = hns3_nway_reset,
.get_channels = hns3_get_channels,
.set_channels = hns3_set_channels,
+   .get_coalesce = hns3_get_coalesce,
 };
 
 void hns3_ethtool_set_ops(struct net_device *netdev)
-- 
1.9.1
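
The queue lookup in hns3_get_coalesce_per_queue() relies on TX rings being
stored first and RX rings after them in one array. A minimal userspace
sketch of that indexing follows. It is simplified on purpose: the real
code reads the values from the vector's ring groups, while the demo_ring
and demo_coalesce types here are hypothetical and hold the values
directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct demo_ring {
	uint16_t int_gl;         /* gap limit in usec */
	uint8_t gl_adapt_enable; /* self-adaptive flag */
};

struct demo_coalesce {
	uint32_t use_adaptive_tx;
	uint32_t use_adaptive_rx;
	uint32_t tx_usecs;
	uint32_t rx_usecs;
};

/* rings holds 2 * queue_num entries: TX rings first, then RX rings. */
static int get_coalesce_per_queue(const struct demo_ring *rings,
				  uint32_t queue_num, uint32_t queue,
				  struct demo_coalesce *out)
{
	const struct demo_ring *tx, *rx;

	assert(rings != NULL && out != NULL);

	if (queue >= queue_num)
		return -EINVAL;

	tx = &rings[queue];
	rx = &rings[queue_num + queue];

	out->use_adaptive_tx = tx->gl_adapt_enable;
	out->use_adaptive_rx = rx->gl_adapt_enable;
	out->tx_usecs = tx->int_gl;
	out->rx_usecs = rx->int_gl;

	return 0;
}

/* The device-wide query reports the values of queue 0. */
static int get_coalesce(const struct demo_ring *rings, uint32_t queue_num,
			struct demo_coalesce *out)
{
	return get_coalesce_per_queue(rings, queue_num, 0, out);
}
```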



[PATCH V2 net-next 04/11] net: hns3: add ethtool_ops.set_coalesce support to PF

2018-01-11 Thread Peng Li
From: Fuyun Liang 

This patch adds ethtool_ops.set_coalesce support to PF.

Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c|  34 -
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h|  17 +++
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 141 +
 3 files changed, 188 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 14c7625..32c9f88 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -170,14 +170,40 @@ static void hns3_set_vector_coalesc_gl(struct hns3_enet_tqp_vector *tqp_vector,
writel(gl_value, tqp_vector->mask_addr + HNS3_VECTOR_GL2_OFFSET);
 }
 
-static void hns3_set_vector_coalesc_rl(struct hns3_enet_tqp_vector *tqp_vector,
-  u32 rl_value)
+void hns3_set_vector_coalesce_rl(struct hns3_enet_tqp_vector *tqp_vector,
+u32 rl_value)
 {
+   u32 rl_reg = hns3_rl_usec_to_reg(rl_value);
+
/* this defines the configuration for RL (Interrupt Rate Limiter).
 * Rl defines rate of interrupts i.e. number of interrupts-per-second
 * GL and RL(Rate Limiter) are 2 ways to acheive interrupt coalescing
 */
-   writel(rl_value, tqp_vector->mask_addr + HNS3_VECTOR_RL_OFFSET);
+
+   if (rl_reg > 0 && !tqp_vector->tx_group.gl_adapt_enable &&
+   !tqp_vector->rx_group.gl_adapt_enable)
+   /* According to the hardware, the range of rl_reg is
+* 0-59 and the unit is 4.
+*/
+   rl_reg |=  HNS3_INT_RL_ENABLE_MASK;
+
+   writel(rl_reg, tqp_vector->mask_addr + HNS3_VECTOR_RL_OFFSET);
+}
+
+void hns3_set_vector_coalesce_rx_gl(struct hns3_enet_tqp_vector *tqp_vector,
+   u32 gl_value)
+{
+   u32 rx_gl_reg = hns3_gl_usec_to_reg(gl_value);
+
+   writel(rx_gl_reg, tqp_vector->mask_addr + HNS3_VECTOR_GL0_OFFSET);
+}
+
+void hns3_set_vector_coalesce_tx_gl(struct hns3_enet_tqp_vector *tqp_vector,
+   u32 gl_value)
+{
+   u32 tx_gl_reg = hns3_gl_usec_to_reg(gl_value);
+
+   writel(tx_gl_reg, tqp_vector->mask_addr + HNS3_VECTOR_GL1_OFFSET);
 }
 
 static void hns3_vector_gl_rl_init(struct hns3_enet_tqp_vector *tqp_vector)
@@ -194,7 +220,7 @@ static void hns3_vector_gl_rl_init(struct hns3_enet_tqp_vector *tqp_vector)
/* for now we are disabling Interrupt RL - we
 * will re-enable later
 */
-   hns3_set_vector_coalesc_rl(tqp_vector, 0);
+   hns3_set_vector_coalesce_rl(tqp_vector, 0);
tqp_vector->rx_group.flow_level = HNS3_FLOW_LOW;
tqp_vector->tx_group.flow_level = HNS3_FLOW_LOW;
 }
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index 24f6109..7adbda8 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -451,11 +451,15 @@ enum hns3_link_mode_bits {
HNS3_LM_COUNT = 15
 };
 
+#define HNS3_INT_GL_MAX 0x1FE0
 #define HNS3_INT_GL_50K 0x000A
 #define HNS3_INT_GL_20K 0x0019
 #define HNS3_INT_GL_18K 0x001B
 #define HNS3_INT_GL_8K 0x003E
 
+#define HNS3_INT_RL_MAX 0x00EC
+#define HNS3_INT_RL_ENABLE_MASK 0x40
+
 struct hns3_enet_ring_group {
/* array of pointers to rings */
struct hns3_enet_ring *ring;
@@ -595,6 +599,12 @@ static inline void hns3_write_reg(void __iomem *base, u32 reg, u32 value)
 #define hns3_get_handle(ndev) \
(((struct hns3_nic_priv *)netdev_priv(ndev))->ae_handle)
 
+#define hns3_gl_usec_to_reg(int_gl) (int_gl >> 1)
+#define hns3_gl_round_down(int_gl) round_down(int_gl, 2)
+
+#define hns3_rl_usec_to_reg(int_rl) (int_rl >> 2)
+#define hns3_rl_round_down(int_rl) round_down(int_rl, 4)
+
 void hns3_ethtool_set_ops(struct net_device *netdev);
 int hns3_set_channels(struct net_device *netdev,
  struct ethtool_channels *ch);
@@ -607,6 +617,13 @@ int hns3_clean_rx_ring(
struct hns3_enet_ring *ring, int budget,
void (*rx_fn)(struct hns3_enet_ring *, struct sk_buff *));
 
+void hns3_set_vector_coalesce_rx_gl(struct hns3_enet_tqp_vector *tqp_vector,
+   u32 gl_value);
+void hns3_set_vector_coalesce_tx_gl(struct hns3_enet_tqp_vector *tqp_vector,
+   u32 gl_value);
+void hns3_set_vector_coalesce_rl(struct hns3_enet_tqp_vector *tqp_vector,
+u32 rl_value);
+
 #ifdef CONFIG_HNS3_DCB
 void hns3_dcbnl_setup(struct hnae3_handle *handle);
 #else
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index 81b
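
The unit conversions added to hns3_enet.h by this patch can be checked
with a short userspace sketch. The constants and shifts are taken from the
diff above; the function names are hypothetical replacements for the
macros, and the assertions on the maximum values are an added assumption
about how a caller would validate input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INT_GL_MAX 0x1FE0u         /* HNS3_INT_GL_MAX, in usec */
#define INT_RL_MAX 0x00ECu         /* HNS3_INT_RL_MAX, in usec */
#define INT_RL_ENABLE_MASK 0x40u   /* HNS3_INT_RL_ENABLE_MASK */

/* GL: one register unit is 2 usec. */
static uint32_t gl_usec_to_reg(uint32_t usec)
{
	assert(usec <= INT_GL_MAX);
	return usec >> 1;
}

/* Round a user value down to a multiple of the GL unit. */
static uint32_t gl_round_down(uint32_t usec)
{
	return usec & ~1u;
}

/* Round a user value down to a multiple of the RL unit (4 usec). */
static uint32_t rl_round_down(uint32_t usec)
{
	return usec & ~3u;
}

/* RL register value: the enable bit is set only for a non-zero limit
 * when self-adaptive GL is off in both directions.
 */
static uint32_t rl_reg_value(uint32_t usec, bool tx_adapt, bool rx_adapt)
{
	uint32_t reg;

	assert(usec <= INT_RL_MAX);

	reg = usec >> 2;
	if (reg > 0 && !tx_adapt && !rx_adapt)
		reg |= INT_RL_ENABLE_MASK;

	return reg;
}
```

With these numbers, the maximum RL value 0xEC (236 usec) maps to
register value 59, which matches the 0-59 range mentioned in the code
comment.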

[PATCH] drm/nouveau/core/client: use strlcpy() instead of strncpy()

2018-01-11 Thread Xiongfeng Wang
From: Xiongfeng Wang 

gcc-8 reports

drivers/gpu/drm/nouveau/nvif/client.c: In function 'nvif_client_init':
./include/linux/string.h:245:9: warning: '__builtin_strncpy' specified
bound 32 equals destination size [-Wstringop-truncation]

We need to use strlcpy() to make sure the dest string is nul-terminated.

Signed-off-by: Xiongfeng Wang 
---
 drivers/gpu/drm/nouveau/nvif/client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/client.c b/drivers/gpu/drm/nouveau/nvif/client.c
index 12db549..f294d99 100644
--- a/drivers/gpu/drm/nouveau/nvif/client.c
+++ b/drivers/gpu/drm/nouveau/nvif/client.c
@@ -69,7 +69,7 @@
} nop = {};
int ret;
 
-   strncpy(args.name, name, sizeof(args.name));
+   strlcpy(args.name, name, sizeof(args.name));
ret = nvif_object_init(parent != client ? &parent->object : NULL,
   0, NVIF_CLASS_CLIENT, &args, sizeof(args),
   &client->object);
-- 
1.8.3.1



[PATCH V2 net-next 11/11] net: hns3: check for NULL function pointer in hns3_nic_set_features

2018-01-11 Thread Peng Li
From: Jian Shen 

It is necessary to check whether a hook is defined before calling it,
to improve reliability.

Signed-off-by: Jian Shen 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index a7ae4f3..ac84816 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -1133,14 +1133,16 @@ static int hns3_nic_set_features(struct net_device *netdev,
}
}
 
-   if (changed & NETIF_F_HW_VLAN_CTAG_FILTER) {
+   if ((changed & NETIF_F_HW_VLAN_CTAG_FILTER) &&
+   h->ae_algo->ops->enable_vlan_filter) {
if (features & NETIF_F_HW_VLAN_CTAG_FILTER)
h->ae_algo->ops->enable_vlan_filter(h, true);
else
h->ae_algo->ops->enable_vlan_filter(h, false);
}
 
-   if (changed & NETIF_F_HW_VLAN_CTAG_RX) {
+   if ((changed & NETIF_F_HW_VLAN_CTAG_RX) &&
+   h->ae_algo->ops->enable_hw_strip_rxvtag) {
if (features & NETIF_F_HW_VLAN_CTAG_RX)
ret = h->ae_algo->ops->enable_hw_strip_rxvtag(h, true);
else
-- 
1.9.1



[PATCH net-next v5 1/4] phy: add 2.5G SGMII mode to the phy_mode enum

2018-01-11 Thread Antoine Tenart
This patch adds one more generic PHY mode to the phy_mode enum, to allow
configuring generic PHYs to the 2.5G SGMII mode by using the set_mode
callback.

Signed-off-by: Antoine Tenart 
---
 include/linux/phy/phy.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
index 4f8423a948d5..5a80e9de3686 100644
--- a/include/linux/phy/phy.h
+++ b/include/linux/phy/phy.h
@@ -28,6 +28,7 @@ enum phy_mode {
PHY_MODE_USB_DEVICE,
PHY_MODE_USB_OTG,
PHY_MODE_SGMII,
+   PHY_MODE_2500SGMII,
PHY_MODE_10GKR,
PHY_MODE_UFS_HS_A,
PHY_MODE_UFS_HS_B,
-- 
2.14.3



RE: [PATCH v2 05/16] remoteproc: modify rproc_handle_carveout to support preallocated region

2018-01-11 Thread Loic PALLARDY


> -Original Message-
> From: Bjorn Andersson [mailto:bjorn.anders...@linaro.org]
> Sent: Thursday, December 14, 2017 1:59 AM
> To: Loic PALLARDY 
> Cc: o...@wizery.com; linux-remotep...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Arnaud POULIQUEN ;
> benjamin.gaign...@linaro.org
> Subject: Re: [PATCH v2 05/16] remoteproc: modify rproc_handle_carveout to
> support preallocated region
> 
> On Thu 30 Nov 08:46 PST 2017, Loic Pallardy wrote:
> 
> > In current version rproc_handle_carveout function support only dynamic
> > region allocation.
> > This patch extends rproc_handle_carveout function to support different
> carveout
> > configurations:
> > - fixed DA and fixed PA: check if already part of pre-registered carveouts
> > (platform driver). If no, return error.
> > - fixed DA and any PA: check if already part of pre-allocated carveouts
> > (platform driver). If not found and rproc supports iommu, continue with
> > dynamic allocation (DA will be used for iommu programming), else return
> > error as no way to force DA.
> > - any DA and any PA: use original dynamic allocation
> >
> > Signed-off-by: Loic Pallardy 
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 40
> 
> >  1 file changed, 40 insertions(+)
> >
> > diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> > index 78525d1..515a17a 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -184,6 +184,10 @@ void *rproc_da_to_va(struct rproc *rproc, u64 da,
> int len)
> > struct rproc_mem_entry *carveout;
> > void *ptr = NULL;
> >
> > +   /*
> > +* da_to_va platform driver is deprecated. Driver should register
> > +* carveout thanks to rproc_add_carveout function
> > +*/
> 
> I think this comment is unrelated to the rest of this patch. I also
> think that at the end of the carveout-rework we should have a patch
> removing this ops.

I'll remove this comment and add a da_to_va clean-up patch at the end of the 
series

> 
> > if (rproc->ops->da_to_va) {
> > ptr = rproc->ops->da_to_va(rproc, da, len);
> > if (ptr)
> > @@ -677,6 +681,7 @@ static int rproc_handle_carveout(struct rproc
> *rproc,
> > struct rproc_mem_entry *carveout, *mapping;
> > struct device *dev = &rproc->dev;
> > dma_addr_t dma;
> > +   phys_addr_t pa;
> > void *va;
> > int ret;
> >
> > @@ -698,6 +703,41 @@ static int rproc_handle_carveout(struct rproc
> *rproc,
> > if (!carveout)
> > return -ENOMEM;
> >
> > +   /* Check carveout rsc already part of a registered carveout */
> > +   if (rsc->da != FW_RSC_ADDR_ANY) {
> 
> As mentioned before, I consider it perfectly viable for rsc->da to be
> ANY and the driver providing a fixed carveout.

Yes, I'll change the sequence to look up by name first and then verify
that the exact parameters match, not only the da definition.

> 
> > +   va = rproc_find_carveout_by_da(rproc, rsc->da, rsc->len);
> > +
> > +   if (va) {
> 
> In a system with an iommu it's possible that rsc->len is larger than
> some carveout->len and va is NULL here so we fall through, allocate some
> memory and remap a segment of the carveout. (Or hopefully fails
> attempting).
> 
> > +   /* Registered region found */
> > +   pa = rproc_va_to_pa(va);
> > +   if (rsc->pa != FW_RSC_ADDR_ANY && rsc->pa !=
> (u32)pa) {
> > +   /* Carveout doesn't match request */
> > +   dev_err(dev->parent,
> > +   "Failed to find carveout fitting da and
> pa\n");
> > +   return -ENOMEM;
> > +   }
> > +
> > +   /* Update rsc table with physical address */
> > +   rsc->pa = (u32)pa;
> > +
> > +   /* Update carveouts list */
> > +   carveout->va = va;
> > +   carveout->len = rsc->len;
> > +   carveout->da = rsc->da;
> > +   carveout->priv = (void *)CARVEOUT_RSC;
> > +
> > +   list_add_tail(&carveout->node, &rproc->carveouts);
> 
> rproc_find_carveout_by_da() will return a reference into a carveout, now
> we add another overlapping carveout into the same list.
> 
> 
> I think it would be saner to not allow the resource table to describe
> subsets of carveouts registered by the driver.
> 
> In which case this would better find a carveout by name or exact da,
> then check that the pa, da, len and rsc->flags are adequate.

Agree
/Loic
> 
> > +
> > +   return 0;
> > +   }
> > +
> > +   if (!rproc->domain) {
> 
> Currently this function ignore invalid values of da when !domain, so I
> think it would be good you can submit this sanity check in it's own
> patch so that anyone bisecting this would know why their broken firmware
> suddenly isn't loadable.
> 
> > +  

[PATCH net-next v5 4/4] net: mvpp2: 2500baseX support

2018-01-11 Thread Antoine Tenart
This patch adds the 2500Base-X PHY mode support in the Marvell PPv2
driver. 2500Base-X is quite close to 1000Base-X and SGMII modes and uses
nearly the same code path.

Signed-off-by: Antoine Tenart 
Reviewed-by: Andrew Lunn 
---
 drivers/net/ethernet/marvell/mvpp2.c | 49 
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 257a6b99b4ca..38f9a79481c6 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -4502,6 +4502,7 @@ static int mvpp22_gop_init(struct mvpp2_port *port)
break;
case PHY_INTERFACE_MODE_SGMII:
case PHY_INTERFACE_MODE_1000BASEX:
+   case PHY_INTERFACE_MODE_2500BASEX:
mvpp22_gop_init_sgmii(port);
break;
case PHY_INTERFACE_MODE_10GKR:
@@ -4540,7 +4541,8 @@ static void mvpp22_gop_unmask_irq(struct mvpp2_port *port)
 
if (phy_interface_mode_is_rgmii(port->phy_interface) ||
port->phy_interface == PHY_INTERFACE_MODE_SGMII ||
-   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX) {
+   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX ||
+   port->phy_interface == PHY_INTERFACE_MODE_2500BASEX) {
/* Enable the GMAC link status irq for this port */
val = readl(port->base + MVPP22_GMAC_INT_SUM_MASK);
val |= MVPP22_GMAC_INT_SUM_MASK_LINK_STAT;
@@ -4571,7 +4573,8 @@ static void mvpp22_gop_mask_irq(struct mvpp2_port *port)
 
if (phy_interface_mode_is_rgmii(port->phy_interface) ||
port->phy_interface == PHY_INTERFACE_MODE_SGMII ||
-   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX) {
+   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX ||
+   port->phy_interface == PHY_INTERFACE_MODE_2500BASEX) {
val = readl(port->base + MVPP22_GMAC_INT_SUM_MASK);
val &= ~MVPP22_GMAC_INT_SUM_MASK_LINK_STAT;
writel(val, port->base + MVPP22_GMAC_INT_SUM_MASK);
@@ -4584,7 +4587,8 @@ static void mvpp22_gop_setup_irq(struct mvpp2_port *port)
 
if (phy_interface_mode_is_rgmii(port->phy_interface) ||
port->phy_interface == PHY_INTERFACE_MODE_SGMII ||
-   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX) {
+   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX ||
+   port->phy_interface == PHY_INTERFACE_MODE_2500BASEX) {
val = readl(port->base + MVPP22_GMAC_INT_MASK);
val |= MVPP22_GMAC_INT_MASK_LINK_STAT;
writel(val, port->base + MVPP22_GMAC_INT_MASK);
@@ -4599,6 +4603,16 @@ static void mvpp22_gop_setup_irq(struct mvpp2_port *port)
mvpp22_gop_unmask_irq(port);
 }
 
+/* Sets the PHY mode of the COMPHY (which configures the serdes lanes).
+ *
+ * The PHY mode used by the PPv2 driver comes from the network subsystem, while
+ * the one given to the COMPHY comes from the generic PHY subsystem. Hence they
+ * differ.
+ *
+ * The COMPHY configures the serdes lanes regardless of the actual use of the
+ * lanes by the physical layer. This is why configurations like
+ * "PPv2 (2500BaseX) - COMPHY (2500SGMII)" are valid.
+ */
 static int mvpp22_comphy_init(struct mvpp2_port *port)
 {
enum phy_mode mode;
@@ -4612,6 +4626,9 @@ static int mvpp22_comphy_init(struct mvpp2_port *port)
case PHY_INTERFACE_MODE_1000BASEX:
mode = PHY_MODE_SGMII;
break;
+   case PHY_INTERFACE_MODE_2500BASEX:
+   mode = PHY_MODE_2500SGMII;
+   break;
case PHY_INTERFACE_MODE_10GKR:
mode = PHY_MODE_10GKR;
break;
@@ -4631,7 +4648,8 @@ static void mvpp2_port_mii_gmac_configure_mode(struct mvpp2_port *port)
u32 val;
 
if (port->phy_interface == PHY_INTERFACE_MODE_SGMII ||
-   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX) {
+   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX ||
+   port->phy_interface == PHY_INTERFACE_MODE_2500BASEX) {
val = readl(port->base + MVPP22_GMAC_CTRL_4_REG);
val |= MVPP22_CTRL4_SYNC_BYPASS_DIS | MVPP22_CTRL4_DP_CLK_SEL |
   MVPP22_CTRL4_QSGMII_BYPASS_ACTIVE;
@@ -4647,7 +4665,8 @@ static void mvpp2_port_mii_gmac_configure_mode(struct mvpp2_port *port)
}
 
val = readl(port->base + MVPP2_GMAC_CTRL_0_REG);
-   if (port->phy_interface == PHY_INTERFACE_MODE_1000BASEX)
+   if (port->phy_interface == PHY_INTERFACE_MODE_1000BASEX ||
+   port->phy_interface == PHY_INTERFACE_MODE_2500BASEX)
val |= MVPP2_GMAC_PORT_TYPE_MASK;
else
val &= ~MVPP2_GMAC_PORT_TYPE_MASK;
@@ -4660,7 +4679,13 @@ static void mvpp2_port_mii_gmac_configure_mode(struct mvpp2_port *port)
if (port->phy_interface == PHY_INTERFACE_MODE_SGMII)
 

[PATCH net-next v5 3/4] net: mvpp2: 1000baseX support

2018-01-11 Thread Antoine Tenart
This patch adds the 1000Base-X PHY mode support in the Marvell PPv2
driver. 1000Base-X is quite close to SGMII and uses nearly the same
code path.

Signed-off-by: Antoine Tenart 
---
 drivers/net/ethernet/marvell/mvpp2.c | 45 
 1 file changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c 
b/drivers/net/ethernet/marvell/mvpp2.c
index a19760736b71..257a6b99b4ca 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -4501,6 +4501,7 @@ static int mvpp22_gop_init(struct mvpp2_port *port)
mvpp22_gop_init_rgmii(port);
break;
case PHY_INTERFACE_MODE_SGMII:
+   case PHY_INTERFACE_MODE_1000BASEX:
mvpp22_gop_init_sgmii(port);
break;
case PHY_INTERFACE_MODE_10GKR:
@@ -4538,7 +4539,8 @@ static void mvpp22_gop_unmask_irq(struct mvpp2_port *port)
u32 val;
 
if (phy_interface_mode_is_rgmii(port->phy_interface) ||
-   port->phy_interface == PHY_INTERFACE_MODE_SGMII) {
+   port->phy_interface == PHY_INTERFACE_MODE_SGMII ||
+   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX) {
/* Enable the GMAC link status irq for this port */
val = readl(port->base + MVPP22_GMAC_INT_SUM_MASK);
val |= MVPP22_GMAC_INT_SUM_MASK_LINK_STAT;
@@ -4568,7 +4570,8 @@ static void mvpp22_gop_mask_irq(struct mvpp2_port *port)
}
 
if (phy_interface_mode_is_rgmii(port->phy_interface) ||
-   port->phy_interface == PHY_INTERFACE_MODE_SGMII) {
+   port->phy_interface == PHY_INTERFACE_MODE_SGMII ||
+   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX) {
val = readl(port->base + MVPP22_GMAC_INT_SUM_MASK);
val &= ~MVPP22_GMAC_INT_SUM_MASK_LINK_STAT;
writel(val, port->base + MVPP22_GMAC_INT_SUM_MASK);
@@ -4580,7 +4583,8 @@ static void mvpp22_gop_setup_irq(struct mvpp2_port *port)
u32 val;
 
if (phy_interface_mode_is_rgmii(port->phy_interface) ||
-   port->phy_interface == PHY_INTERFACE_MODE_SGMII) {
+   port->phy_interface == PHY_INTERFACE_MODE_SGMII ||
+   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX) {
val = readl(port->base + MVPP22_GMAC_INT_MASK);
val |= MVPP22_GMAC_INT_MASK_LINK_STAT;
writel(val, port->base + MVPP22_GMAC_INT_MASK);
@@ -4605,6 +4609,7 @@ static int mvpp22_comphy_init(struct mvpp2_port *port)
 
switch (port->phy_interface) {
case PHY_INTERFACE_MODE_SGMII:
+   case PHY_INTERFACE_MODE_1000BASEX:
mode = PHY_MODE_SGMII;
break;
case PHY_INTERFACE_MODE_10GKR:
@@ -4625,7 +4630,8 @@ static void mvpp2_port_mii_gmac_configure_mode(struct 
mvpp2_port *port)
 {
u32 val;
 
-   if (port->phy_interface == PHY_INTERFACE_MODE_SGMII) {
+   if (port->phy_interface == PHY_INTERFACE_MODE_SGMII ||
+   port->phy_interface == PHY_INTERFACE_MODE_1000BASEX) {
val = readl(port->base + MVPP22_GMAC_CTRL_4_REG);
val |= MVPP22_CTRL4_SYNC_BYPASS_DIS | MVPP22_CTRL4_DP_CLK_SEL |
   MVPP22_CTRL4_QSGMII_BYPASS_ACTIVE;
@@ -4640,9 +4646,11 @@ static void mvpp2_port_mii_gmac_configure_mode(struct 
mvpp2_port *port)
writel(val, port->base + MVPP22_GMAC_CTRL_4_REG);
}
 
-   /* The port is connected to a copper PHY */
val = readl(port->base + MVPP2_GMAC_CTRL_0_REG);
-   val &= ~MVPP2_GMAC_PORT_TYPE_MASK;
+   if (port->phy_interface == PHY_INTERFACE_MODE_1000BASEX)
+   val |= MVPP2_GMAC_PORT_TYPE_MASK;
+   else
+   val &= ~MVPP2_GMAC_PORT_TYPE_MASK;
writel(val, port->base + MVPP2_GMAC_CTRL_0_REG);
 
val = readl(port->base + MVPP2_GMAC_AUTONEG_CONFIG);
@@ -4651,6 +4659,19 @@ static void mvpp2_port_mii_gmac_configure_mode(struct 
mvpp2_port *port)
   MVPP2_GMAC_AN_DUPLEX_EN;
if (port->phy_interface == PHY_INTERFACE_MODE_SGMII)
val |= MVPP2_GMAC_IN_BAND_AUTONEG;
+
+   if (port->phy_interface == PHY_INTERFACE_MODE_1000BASEX)
+   /* 1000BaseX port cannot negotiate speed nor can it
+* negotiate duplex: they are always operating with a
+* fixed speed of 1000Mbps in full duplex, so force
+* 1000 speed and full duplex here.
+*/
+   val |= MVPP2_GMAC_CONFIG_GMII_SPEED |
+  MVPP2_GMAC_CONFIG_FULL_DUPLEX;
+   else
+   val |= MVPP2_GMAC_AN_SPEED_EN |
+  MVPP2_GMAC_AN_DUPLEX_EN;
+
writel(val, port->base + MVPP2_GMAC_AUTONEG_CONFIG);
 }
 
@@ -4671,7 +4692,8 @@ static void mvpp2_port_mii_gmac_configure(struct 
mvpp2_port *port)
 
/* Configure the PCS and in-band AN */
val = readl(port->base + MVPP2_GMAC_
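
The speed/duplex handling added above can be sketched in plain C as
follows. This is only an illustration: the bit positions and the
sketch_* names are placeholders invented for the example, not the
driver's register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, assumed for this sketch only. */
#define SKETCH_AN_IN_BAND       (1u << 2)
#define SKETCH_CFG_GMII_SPEED   (1u << 6)
#define SKETCH_AN_SPEED_EN      (1u << 7)
#define SKETCH_CFG_FULL_DUPLEX  (1u << 12)
#define SKETCH_AN_DUPLEX_EN     (1u << 13)

enum sketch_mode { SKETCH_SGMII, SKETCH_1000BASEX };

/* Build the autoneg configuration bits for a given interface mode. */
static uint32_t sketch_autoneg_bits(enum sketch_mode mode)
{
    uint32_t val = 0;

    assert(mode == SKETCH_SGMII || mode == SKETCH_1000BASEX);

    /* Only SGMII uses in-band autonegotiation here. */
    if (mode == SKETCH_SGMII)
        val |= SKETCH_AN_IN_BAND;

    if (mode == SKETCH_1000BASEX)
        /* Fixed 1000Mbps, full duplex: force both instead of negotiating. */
        val |= SKETCH_CFG_GMII_SPEED | SKETCH_CFG_FULL_DUPLEX;
    else
        val |= SKETCH_AN_SPEED_EN | SKETCH_AN_DUPLEX_EN;

    return val;
}
```

With these placeholder bits, a 1000BaseX port never gets the two
autoneg-enable bits set, while an SGMII port gets them plus in-band AN.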

[PATCH net-next v5 2/4] phy: cp110-comphy: 2.5G SGMII mode

2018-01-11 Thread Antoine Tenart
This patch allows the CP110 comphy to configure some lanes in the
2.5G SGMII mode. This mode is quite close to SGMII and uses nearly the
same code path.

Signed-off-by: Antoine Tenart 
---
 drivers/phy/marvell/phy-mvebu-cp110-comphy.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/phy/marvell/phy-mvebu-cp110-comphy.c 
b/drivers/phy/marvell/phy-mvebu-cp110-comphy.c
index a0d522154cdf..4ef429250d7b 100644
--- a/drivers/phy/marvell/phy-mvebu-cp110-comphy.c
+++ b/drivers/phy/marvell/phy-mvebu-cp110-comphy.c
@@ -135,19 +135,25 @@ struct mvebu_comhy_conf {
 static const struct mvebu_comhy_conf mvebu_comphy_cp110_modes[] = {
/* lane 0 */
MVEBU_COMPHY_CONF(0, 1, PHY_MODE_SGMII, 0x1),
+   MVEBU_COMPHY_CONF(0, 1, PHY_MODE_2500SGMII, 0x1),
/* lane 1 */
MVEBU_COMPHY_CONF(1, 2, PHY_MODE_SGMII, 0x1),
+   MVEBU_COMPHY_CONF(1, 2, PHY_MODE_2500SGMII, 0x1),
/* lane 2 */
MVEBU_COMPHY_CONF(2, 0, PHY_MODE_SGMII, 0x1),
+   MVEBU_COMPHY_CONF(2, 0, PHY_MODE_2500SGMII, 0x1),
MVEBU_COMPHY_CONF(2, 0, PHY_MODE_10GKR, 0x1),
/* lane 3 */
MVEBU_COMPHY_CONF(3, 1, PHY_MODE_SGMII, 0x2),
+   MVEBU_COMPHY_CONF(3, 1, PHY_MODE_2500SGMII, 0x2),
/* lane 4 */
MVEBU_COMPHY_CONF(4, 0, PHY_MODE_SGMII, 0x2),
+   MVEBU_COMPHY_CONF(4, 0, PHY_MODE_2500SGMII, 0x2),
MVEBU_COMPHY_CONF(4, 0, PHY_MODE_10GKR, 0x2),
MVEBU_COMPHY_CONF(4, 1, PHY_MODE_SGMII, 0x1),
/* lane 5 */
MVEBU_COMPHY_CONF(5, 2, PHY_MODE_SGMII, 0x1),
+   MVEBU_COMPHY_CONF(5, 2, PHY_MODE_2500SGMII, 0x1),
 };
 
 struct mvebu_comphy_priv {
@@ -206,6 +212,10 @@ static void mvebu_comphy_ethernet_init_reset(struct 
mvebu_comphy_lane *lane,
if (mode == PHY_MODE_10GKR)
val |= MVEBU_COMPHY_SERDES_CFG0_GEN_RX(0xe) |
   MVEBU_COMPHY_SERDES_CFG0_GEN_TX(0xe);
+   else if (mode == PHY_MODE_2500SGMII)
+   val |= MVEBU_COMPHY_SERDES_CFG0_GEN_RX(0x8) |
+  MVEBU_COMPHY_SERDES_CFG0_GEN_TX(0x8) |
+  MVEBU_COMPHY_SERDES_CFG0_HALF_BUS;
else if (mode == PHY_MODE_SGMII)
val |= MVEBU_COMPHY_SERDES_CFG0_GEN_RX(0x6) |
   MVEBU_COMPHY_SERDES_CFG0_GEN_TX(0x6) |
@@ -296,13 +306,13 @@ static int mvebu_comphy_init_plls(struct 
mvebu_comphy_lane *lane,
return 0;
 }
 
-static int mvebu_comphy_set_mode_sgmii(struct phy *phy)
+static int mvebu_comphy_set_mode_sgmii(struct phy *phy, enum phy_mode mode)
 {
struct mvebu_comphy_lane *lane = phy_get_drvdata(phy);
struct mvebu_comphy_priv *priv = lane->priv;
u32 val;
 
-   mvebu_comphy_ethernet_init_reset(lane, PHY_MODE_SGMII);
+   mvebu_comphy_ethernet_init_reset(lane, mode);
 
val = readl(priv->base + MVEBU_COMPHY_RX_CTRL1(lane->id));
val &= ~MVEBU_COMPHY_RX_CTRL1_CLK8T_EN;
@@ -487,7 +497,8 @@ static int mvebu_comphy_power_on(struct phy *phy)
 
switch (lane->mode) {
case PHY_MODE_SGMII:
-   ret = mvebu_comphy_set_mode_sgmii(phy);
+   case PHY_MODE_2500SGMII:
+   ret = mvebu_comphy_set_mode_sgmii(phy, lane->mode);
break;
case PHY_MODE_10GKR:
ret = mvebu_comphy_set_mode_10gkr(phy);
-- 
2.14.3
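
For readers unfamiliar with the table above, here is a rough user-space
sketch of how a (lane, port, mode) triple could be resolved to a mux
value. The rows are copied from a subset of the patched table; the
struct, enum and function names are hypothetical and the driver's own
lookup may differ.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_phy_mode { SKETCH_SGMII, SKETCH_2500SGMII, SKETCH_10GKR };

struct sketch_conf {
    unsigned lane;
    unsigned port;
    enum sketch_phy_mode mode;
    unsigned mux;
};

/* Subset of the lane table shown in the patch (lanes 0, 2 and 4). */
static const struct sketch_conf sketch_modes[] = {
    { 0, 1, SKETCH_SGMII,     0x1 },
    { 0, 1, SKETCH_2500SGMII, 0x1 },
    { 2, 0, SKETCH_SGMII,     0x1 },
    { 2, 0, SKETCH_2500SGMII, 0x1 },
    { 2, 0, SKETCH_10GKR,     0x1 },
    { 4, 0, SKETCH_SGMII,     0x2 },
    { 4, 0, SKETCH_2500SGMII, 0x2 },
    { 4, 0, SKETCH_10GKR,     0x2 },
    { 4, 1, SKETCH_SGMII,     0x1 },
};

/* Return the mux value for a supported combination, or -1 otherwise. */
static int sketch_get_mux(unsigned lane, unsigned port,
                          enum sketch_phy_mode mode)
{
    size_t i, n = sizeof(sketch_modes) / sizeof(sketch_modes[0]);

    assert(lane <= 5); /* the table only describes lanes 0 to 5 */

    for (i = 0; i < n; i++) {
        const struct sketch_conf *c = &sketch_modes[i];

        if (c->lane == lane && c->port == port && c->mode == mode)
            return (int)c->mux;
    }
    return -1;
}
```

Note that lane 4 / port 1 only has an SGMII row in the patch, so a
2.5G SGMII request for that pair is rejected in this sketch.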



[PATCH net-next v5 0/4] net: mvpp2: 1000BaseX and 2500BaseX support

2018-01-11 Thread Antoine Tenart
Hi all,

This series adds 1000BaseX and 2500BaseX support to the Marvell PPv2
driver. In order to use it, the 2.5G SGMII mode is added in the Marvell
common PHY driver (cp110-comphy).

This was tested on a mcbin.

All patches should probably go through net-next as patch 4/4 depends on
patch 1/4 to build and work.

Please note the two mvpp2 patches do not conflict with the ACPI series
v2 Marcin sent a few days ago, and the two series can be processed in
parallel. (Marcin is aware of me sending this series).

Thanks!
Antoine

Since v4:
  - Fixed a compilation warning which was a real error in the code.

Since v3:
  - Stopped setting the MII_SPEED bit in the GMAC AN register, as the
GMII_SPEED bit takes over anyway.
  - Added Andrew's Reviewed-by on patch 4/4.

Since v2:
  - Added a comment before mvpp22_comphy_init() about the different PHY modes
used and why they differ between the PPv2 driver and the COMPHY one.

Since v1:
  - s/PHY_MODE_SGMII_2_5G/PHY_MODE_2500SGMII/
  - Fixed a build error in 'net: mvpp2: 1000baseX support' (which was solved in
the 2500baseX support one, but the bisection was broken).
  - Removed the dt patches, as the fourth network interface on the mcbin also
needs PHYLINK support in the PPv2 driver to be correctly supported.

Antoine Tenart (4):
  phy: add 2.5G SGMII mode to the phy_mode enum
  phy: cp110-comphy: 2.5G SGMII mode
  net: mvpp2: 1000baseX support
  net: mvpp2: 2500baseX support

 drivers/net/ethernet/marvell/mvpp2.c | 74 
 drivers/phy/marvell/phy-mvebu-cp110-comphy.c | 17 +--
 include/linux/phy/phy.h  |  1 +
 3 files changed, 79 insertions(+), 13 deletions(-)

-- 
2.14.3



[PATCH V2 net-next 08/11] net: hns3: change the unit of GL value macro

2018-01-11 Thread Peng Li
From: Fuyun Liang 

Previously, the driver used 2us as the GL unit. The ethtool "-c" and
"-C" commands use a time unit of 1us, so the driver now uses 1us as
the GL unit as well.

This patch changes the unit of GL value macro from
2us to 1us.

Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h 
b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index 7adbda8..213f501 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -452,10 +452,10 @@ enum hns3_link_mode_bits {
 };
 
 #define HNS3_INT_GL_MAX0x1FE0
-#define HNS3_INT_GL_50K0x000A
-#define HNS3_INT_GL_20K0x0019
-#define HNS3_INT_GL_18K0x001B
-#define HNS3_INT_GL_8K 0x003E
+#define HNS3_INT_GL_50K0x0014
+#define HNS3_INT_GL_20K0x0032
+#define HNS3_INT_GL_18K0x0036
+#define HNS3_INT_GL_8K 0x007C
 
 #define HNS3_INT_RL_MAX0x00EC
 #define HNS3_INT_RL_ENABLE_MASK0x40
-- 
1.9.1
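
A small worked check of the unit change, as a sketch: the macro values
are the ones quoted in the diff, while the helper names are made up for
the example. Each new value is exactly twice the old one, so the
interval in microseconds is unchanged (for instance 0x000A * 2us and
0x0014 * 1us are both 20us).

```c
#include <assert.h>

/* Old values (2us per unit) and new values (1us per unit) from the diff. */
#define SKETCH_OLD_GL_50K 0x000A
#define SKETCH_OLD_GL_20K 0x0019
#define SKETCH_OLD_GL_18K 0x001B
#define SKETCH_OLD_GL_8K  0x003E

#define SKETCH_NEW_GL_50K 0x0014
#define SKETCH_NEW_GL_20K 0x0032
#define SKETCH_NEW_GL_18K 0x0036
#define SKETCH_NEW_GL_8K  0x007C

/* Convert a GL register value to microseconds for a given unit size. */
static unsigned sketch_gl_to_usecs(unsigned gl, unsigned unit_us)
{
    assert(unit_us == 1 || unit_us == 2);
    return gl * unit_us;
}

/* Re-express a value counted in 2us units as a value in 1us units. */
static unsigned sketch_gl_2us_to_1us(unsigned gl)
{
    return gl * 2;
}
```

The same doubling holds for all four macros in the patch.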



Re: [PATCH 1/5] x86/ibrs: Introduce native_rdmsrl, and native_wrmsrl

2018-01-11 Thread Greg KH
On Thu, Jan 11, 2018 at 05:32:15PM -0800, Ashok Raj wrote:
> - Remove including microcode.h, and use native macros from asm/msr.h
> - added license header for spec_ctrl.c

Worst changelog ever :(

Why are you touching spec_ctrl.c in this patch?  How does it belong here
in this series?

Come on, you know better than this...

greg k-h


[PATCH V2 net-next 02/11] net: hns3: remove TSO config command from VF driver

2018-01-11 Thread Peng Li
Only the main PF can configure the TSO MSS length according to the hardware.
This patch removes the TSO config command from the VF driver.

Signed-off-by: Peng Li 
---
 .../net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.h |  8 
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c| 20 
 2 files changed, 28 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.h 
b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.h
index ad8adfe..2caca93 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.h
@@ -86,8 +86,6 @@ enum hclgevf_opcode_type {
HCLGEVF_OPC_QUERY_TX_STATUS = 0x0B03,
HCLGEVF_OPC_QUERY_RX_STATUS = 0x0B13,
HCLGEVF_OPC_CFG_COM_TQP_QUEUE   = 0x0B20,
-   /* TSO cmd */
-   HCLGEVF_OPC_TSO_GENERIC_CONFIG  = 0x0C01,
/* RSS cmd */
HCLGEVF_OPC_RSS_GENERIC_CONFIG  = 0x0D01,
HCLGEVF_OPC_RSS_INDIR_TABLE = 0x0D07,
@@ -202,12 +200,6 @@ struct hclgevf_cfg_tx_queue_pointer_cmd {
u8 rsv[14];
 };
 
-#define HCLGEVF_TSO_ENABLE_B   0
-struct hclgevf_cfg_tso_status_cmd {
-   u8 tso_enable;
-   u8 rsv[23];
-};
-
 #define HCLGEVF_TYPE_CRQ   0
 #define HCLGEVF_TYPE_CSQ   1
 #define HCLGEVF_NIC_CSQ_BASEADDR_L_REG 0x27000
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c 
b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 5f9afa6..3d2bc9a 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -201,20 +201,6 @@ static int hclge_get_queue_info(struct hclgevf_dev *hdev)
return 0;
 }
 
-static int hclgevf_enable_tso(struct hclgevf_dev *hdev, int enable)
-{
-   struct hclgevf_cfg_tso_status_cmd *req;
-   struct hclgevf_desc desc;
-
-   req = (struct hclgevf_cfg_tso_status_cmd *)desc.data;
-
-   hclgevf_cmd_setup_basic_desc(&desc, HCLGEVF_OPC_TSO_GENERIC_CONFIG,
-false);
-   hnae_set_bit(req->tso_enable, HCLGEVF_TSO_ENABLE_B, enable);
-
-   return hclgevf_cmd_send(&hdev->hw, &desc, 1);
-}
-
 static int hclgevf_alloc_tqps(struct hclgevf_dev *hdev)
 {
struct hclgevf_tqp *tqp;
@@ -1375,12 +1361,6 @@ static int hclgevf_init_ae_dev(struct hnae3_ae_dev 
*ae_dev)
goto err_config;
}
 
-   ret = hclgevf_enable_tso(hdev, true);
-   if (ret) {
-   dev_err(&pdev->dev, "failed(%d) to enable tso\n", ret);
-   goto err_config;
-   }
-
/* Initialize VF's MTA */
hdev->accept_mta_mc = true;
ret = hclgevf_cfg_func_mta_filter(&hdev->nic, hdev->accept_mta_mc);
-- 
1.9.1



Re: [PATCH v4] perf tools: Add ARM Statistical Profiling Extensions (SPE) support

2018-01-11 Thread gengdongjiu
On 2018/1/11 22:17, Adrian Hunter wrote:
>>   (e.g., via 'perf inject --itrace'), are also not supported
>>
>> - technically both cs-etm and spe can be used simultaneously, however
>>   disabled for simplicity in this release
>>
>> Signed-off-by: Kim Phillips 
> For what is there now, it looks fine from the auxtrace point of view.  There
> are a couple of minor points below but nevertheless:
> 
> Acked-by: Adrian Hunter 

This patch looks good to me.
Reviewed-by: gengdong...@huawei.com

> 
>> ---
>> v4: rebased onto acme's perf/core, whitespace fixes.



[PATCH] IB/cma: use strlcpy() instead of strncpy()

2018-01-11 Thread Xiongfeng Wang
From: Xiongfeng Wang 

gcc-8 reports

drivers/infiniband/core/cma_configfs.c: In function 'make_cma_dev':
./include/linux/string.h:245:9: warning: '__builtin_strncpy' specified
bound 64 equals destination size [-Wstringop-truncation]

We need to use strlcpy() to make sure the string is nul-terminated.

Signed-off-by: Xiongfeng Wang 
---
 drivers/infiniband/core/cma_configfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma_configfs.c 
b/drivers/infiniband/core/cma_configfs.c
index 31dfee0..eee38b4 100644
--- a/drivers/infiniband/core/cma_configfs.c
+++ b/drivers/infiniband/core/cma_configfs.c
@@ -295,7 +295,7 @@ static struct config_group *make_cma_dev(struct 
config_group *group,
goto fail;
}
 
-   strncpy(cma_dev_group->name, name, sizeof(cma_dev_group->name));
+   strlcpy(cma_dev_group->name, name, sizeof(cma_dev_group->name));
 
config_group_init_type_name(&cma_dev_group->ports_group, "ports",
&cma_ports_group_type);
-- 
1.8.3.1
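
To illustrate the difference the commit message relies on, here is a
user-space sketch of strlcpy()-style copying. sketch_strlcpy() is a
stand-in written for this example (strlcpy() is not available in every
libc); it is meant to mirror the documented semantics, not to reproduce
the kernel's source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (of capacity size), always NUL-terminating when
 * size > 0. Returns strlen(src) so the caller can detect truncation.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(src != NULL);
    len = strlen(src);

    if (size) {
        /* Leave room for the terminator when the source is too long. */
        size_t n = (len >= size) ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

By contrast, strncpy(dst, src, sizeof(dst)) writes no terminator when
src is at least sizeof(dst) bytes long, which is the case the gcc-8
warning points at.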



RE: [PATCH v2 04/16] remoteproc: introduce rproc_find_carveout_by_da

2018-01-11 Thread Loic PALLARDY


> -Original Message-
> From: Bjorn Andersson [mailto:bjorn.anders...@linaro.org]
> Sent: Thursday, December 14, 2017 1:46 AM
> To: Loic PALLARDY 
> Cc: o...@wizery.com; linux-remotep...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Arnaud POULIQUEN ;
> benjamin.gaign...@linaro.org
> Subject: Re: [PATCH v2 04/16] remoteproc: introduce
> rproc_find_carveout_by_da
> 
> On Thu 30 Nov 08:46 PST 2017, Loic Pallardy wrote:
> 
> > This patch provides a new function to find a carveout according
> > to a device address (da).
> > If match found, this function returns CPU virtual address corresponding
> > to specified da.
> >
> > Signed-off-by: Loic Pallardy 
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 42
> 
> >  1 file changed, 42 insertions(+)
> >
> > diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> > index 279320a..78525d1 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -211,6 +211,48 @@ void *rproc_da_to_va(struct rproc *rproc, u64 da,
> int len)
> >  }
> >  EXPORT_SYMBOL(rproc_da_to_va);
> >
> > +/**
> > + * rproc_find_carveout_by_da() - lookup the carveout region for a
> remoteproc address
> > + * @rproc: handle of a remote processor
> > + * @da: remoteproc device address to find
> > + * @len: length of the memory region @da is pointing to
> > + *
> > + * Platform driver has the capability to register some pre-allacoted
> carveout
> > + * (physically contiguous memory regions) before rproc firmware loading
> and
> > + * associated resource table analysis. These regions may be dedicated
> memory
> > + * regions internal to the coprocessor or specified DDR region with 
> > specific
> > + * attributes
> > + *
> > + * This function is a helper function with which we can go over the
> > + * allocated carveouts and translate specific device addresse to virtual
> > + * addresse so we can fill firmware resource table.
> > + *
> > + * The function returns a valid virtual address on success or NULL on
> failure.
> > + */
> > +void *rproc_find_carveout_by_da(struct rproc *rproc, u64 da, int len)
> 
> The name suggest that this returns a struct rproc_mem_entry *, but the
> implementation is just a duplicate of rproc_da_to_va.
> 
> I think I prefer that we just use the name based lookup in the
> subsequent patch, alternatively I think this should be made to return
> the carveout and one could then use da_to_va if you need a reference
> within that carveout.
OK, agreed, a more coherent API is better. As discussed, we can first look up
the carveout by name and then check whether the requested da matches.

/Loic
> 
> > +{
> > +   struct rproc_mem_entry *carveout;
> > +   void *va = NULL;
> > +
> > +   list_for_each_entry(carveout, &rproc->carveouts, node) {
> > +   int offset = da - carveout->da;
> > +
> > +   /* try next carveout if da is too small */
> > +   if (offset < 0)
> > +   continue;
> > +
> > +   /* try next carveout if da is too large */
> > +   if (offset + len > carveout->len)
> > +   continue;
> > +
> > +   va = carveout->va + offset;
> > +
> > +   break;
> > +   }
> > +
> > +   return va;
> > +}
> 
> Regards,
> Bjorn
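
The "look up by name, then check the da" approach agreed on above could
look roughly like the following user-space sketch. All sketch_* types
and functions are hypothetical, including the name field, and do not
reflect the final remoteproc API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a carveout entry; the name field is assumed. */
struct sketch_mem_entry {
    const char *name;
    uint64_t da;        /* device address of the region start */
    size_t len;         /* region length in bytes */
    unsigned char *va;  /* CPU virtual address of the region start */
};

/* Step 1: return the carveout itself, found by name, or NULL. */
static struct sketch_mem_entry *
sketch_find_by_name(struct sketch_mem_entry *tbl, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tbl[i].name, name) == 0)
            return &tbl[i];
    return NULL;
}

/* Step 2: check that [da, da + len) lies inside the carveout. */
static void *sketch_da_to_va(const struct sketch_mem_entry *e,
                             uint64_t da, size_t len)
{
    uint64_t offset;

    assert(e != NULL);

    if (da < e->da)
        return NULL;            /* da is below the region */

    offset = da - e->da;
    if (offset > e->len || len > e->len - offset)
        return NULL;            /* range runs past the region end */

    return e->va + offset;
}
```

The range check is done on unsigned values here, which avoids relying
on a signed int offset for 64-bit device addresses.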


RE: [PATCH v2 03/16] remoteproc: introduce rproc_add_carveout function

2018-01-11 Thread Loic PALLARDY


> -Original Message-
> From: Bjorn Andersson [mailto:bjorn.anders...@linaro.org]
> Sent: Thursday, December 14, 2017 1:37 AM
> To: Loic PALLARDY 
> Cc: o...@wizery.com; linux-remotep...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Arnaud POULIQUEN ;
> benjamin.gaign...@linaro.org
> Subject: Re: [PATCH v2 03/16] remoteproc: introduce rproc_add_carveout
> function
> 
> On Thu 30 Nov 08:46 PST 2017, Loic Pallardy wrote:
> > diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> > index f23daf9..279320a 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -737,6 +737,7 @@ static int rproc_handle_carveout(struct rproc
> *rproc,
> > carveout->dma = dma;
> > carveout->da = rsc->da;
> > carveout->release = rproc_release_carveout;
> > +   carveout->priv = (void *)CARVEOUT_RSC_ALLOCATED;
> 
> I don't fancy the (ab)use of priv to keep track of this, I also don't
> see that it's ever used. Please drop it.
It was meant to distinguish carveouts defined in the resource table from
carveouts registered by the driver.
But I agree about the priv field usage.
> 
> [..]
> > +int rproc_add_carveout(struct rproc *rproc, struct rproc_mem_entry
> *mem)
> > +{
> > +   if (!rproc || !mem)
> > +   return -EINVAL;
> 
> I don't see this function doing more than adding the item to the list of
> carveouts, which can't fail. So let's just rely on the user calling it
> with valid references and make it return void.
Ok

> 
> > +
> > +   mem->priv = (void *)CARVEOUT_EXTERNAL;
> > +
> > +   list_add_tail(&mem->node, &rproc->carveouts);
> > +
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL(rproc_add_carveout);
> 
> Regards,
> Bjorn


Re: [PATCH v2] KVM: arm/arm64: vgic-its: Fix vgicv4 init

2018-01-11 Thread Auger Eric
Hi Christoffer

On 11/01/18 19:55, Christoffer Dall wrote:
> On Mon, Jan 08, 2018 at 10:52:54AM +0100, Eric Auger wrote:
>> Commit 3d1ad640f8c94 ("KVM: arm/arm64: Fix GICv4 ITS initialization
>> issues") moved the vgic_supports_direct_msis() check in vgic_v4_init().
>> However when vgic_v4_init is called from vgic_its_create(), the has_its
>> field is not yet set. Hence vgic_supports_direct_msis returns false and
>> vgic_v4_init does nothing.
>>
>> Let's move the check back to vgic_v4_init caller.
>>
>> Fixes: 3d1ad640f8c94 ("KVM: arm/arm64: Fix GICv4 ITS initialization issues")
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v1 -> v2:
>> - move the check to the caller
> 
> Why this change? I slightly preferred the first version of this patch,
> but I will admit that the "has_its = true; no_wait(); has_its = false;"
> thing is pretty ugly...

I didn't find the 1st solution elegant either and reverted to how the
code looked before your patch.
> 
>> - identify the right commit this patch fixes
>> ---
>>  virt/kvm/arm/vgic/vgic-init.c | 8 +---
>>  virt/kvm/arm/vgic/vgic-its.c  | 2 +-
>>  virt/kvm/arm/vgic/vgic-v4.c   | 3 ---
>>  3 files changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>> index 6231012..40be908 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -285,9 +285,11 @@ int vgic_init(struct kvm *kvm)
>>  if (ret)
>>  goto out;
>>  
>> -ret = vgic_v4_init(kvm);
>> -if (ret)
>> -goto out;
>> +if (vgic_supports_direct_msis(kvm)) {
>> +ret = vgic_v4_init(kvm);
>> +if (ret)
>> +goto out;
>> +}
>>  
>>  kvm_for_each_vcpu(i, vcpu, kvm)
>>  kvm_vgic_vcpu_enable(vcpu);
>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>> index 8e633bd..aebc88d 100644
>> --- a/virt/kvm/arm/vgic/vgic-its.c
>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>> @@ -1687,7 +1687,7 @@ static int vgic_its_create(struct kvm_device *dev, u32 
>> type)
>>  if (!its)
>>  return -ENOMEM;
>>  
>> -if (vgic_initialized(dev->kvm)) {
>> +if (kvm_vgic_global_state.has_gicv4 && vgic_initialized(dev->kvm)) {
> 
> ... but now we're using vgic_supports_direct_msis() in one part of the
> init path and a half-open coded version of that in another path, which
> is not very pretty.
> 
> So I actually would suggest doing the init stuff more open-coded,
> because init of the gic/its/gicv4 is a mess anyway.
> 
> Something like this:
> 
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 62310122ee78..743ca5cb05ef 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -285,9 +285,11 @@ int vgic_init(struct kvm *kvm)
>   if (ret)
>   goto out;
>  
> - ret = vgic_v4_init(kvm);
> - if (ret)
> - goto out;
> + if (vgic_has_its(kvm)) {
> + ret = vgic_v4_init(kvm);
> + if (ret)
> + goto out;
> + }
>  
>   kvm_for_each_vcpu(i, vcpu, kvm)
>   kvm_vgic_vcpu_enable(vcpu);
> diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
> index 4a37292855bc..bc4265154bac 100644
> --- a/virt/kvm/arm/vgic/vgic-v4.c
> +++ b/virt/kvm/arm/vgic/vgic-v4.c
> @@ -118,7 +118,7 @@ int vgic_v4_init(struct kvm *kvm)
>   struct kvm_vcpu *vcpu;
>   int i, nr_vcpus, ret;
>  
> - if (!vgic_supports_direct_msis(kvm))
> + if (!kvm_vgic_global_state.has_gicv4)
>   return 0; /* Nothing to see here... move along. */
>  
>   if (dist->its_vm.vpes)
> 
> Does that work?
Looks OK to me. Unfortunately I don't have access to this specific
machine anymore at the moment so I can't test it right now.

Thanks

Eric
> 
> Thanks,
> -Christoffer
> 
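
The ordering problem discussed in this thread can be modelled with a
small user-space sketch. The struct and the sketch_* functions are
invented for illustration and only mimic the checks described in the
commit message; they are not the KVM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_vm {
    bool has_gicv4;  /* hardware capability */
    bool has_its;    /* set once an ITS has been created */
    bool v4_ready;   /* result of a successful v4 init */
};

static bool sketch_supports_direct_msis(const struct sketch_vm *vm)
{
    return vm->has_gicv4 && vm->has_its;
}

/* Check inside the init function: a no-op while has_its is still false. */
static void sketch_v4_init_checked_inside(struct sketch_vm *vm)
{
    assert(vm != NULL);
    if (!sketch_supports_direct_msis(vm))
        return;
    vm->v4_ready = true;
}

/* Variant suggested in the thread: only test the hardware capability. */
static void sketch_v4_init_hw_check(struct sketch_vm *vm)
{
    assert(vm != NULL);
    if (!vm->has_gicv4)
        return;
    vm->v4_ready = true;
}

/* ITS creation on an initialized VGIC: init runs before has_its is set. */
static void sketch_its_create(struct sketch_vm *vm,
                              void (*v4_init)(struct sketch_vm *))
{
    v4_init(vm);
    vm->has_its = true;
}
```

With the first variant the ITS ends up created but the v4 state is
never set up; with the second it is, since the check no longer depends
on has_its.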


Re: [PATCH] input: multi-touch fix for ALPS touchpads ("SS4 plus" variant)

2018-01-11 Thread Dmitry Torokhov
On Fri, Jan 12, 2018 at 01:02:55AM +, Masaki Ota wrote:
> Hi, Nir,
> 
> Wow, thank you for fixing the bug.
> Your code is correct!

Great, I am putting you down as "Reviewed-by" then. Thanks!

> 
> Best Regards,
> Masaki Ota
> -Original Message-
> From: Nir Perry [mailto:nirpe...@gmail.com] 
> Sent: Saturday, January 06, 2018 8:55 PM
> To: 太田 真喜 Masaki Ota ; Dmitry Torokhov 
> ; Pali Rohár 
> Cc: linux-kernel@vger.kernel.org; linux-in...@vger.kernel.org
> Subject: [PATCH] input: multi-touch fix for ALPS touchpads ("SS4 plus" 
> variant)
> 
> Hi all,
> 
> I think a minor "typo" bug was accidentally introduced to ALPS touchpad 
> driver by a previous bug-fix (commit 
> 4a646580f793d19717f7e034c8d473b509c27d49, "Input: ALPS - fix two-finger 
> scroll breakage in right side on ALPS touchpad").
> It breaks how multi-touch events are decoded on some ALPS touchpads, so for 
> example tapping with three-fingers can no longer be used to emulate 
> middle-mouse-button (the kernel doesn't recognize this as the proper event, 
> and doesn't report it correctly to userspace).
> This affects touchpads that use SS4 "plus" protocol variant, like those found 
> on Dell E7270 & E7470 laptops (tested on E7270).
> 
> The cause of the problem
> --
> First, probably due to a typo, the code in alps_decode_ss4_v2() for case 
> SS4_PACKET_ID_MULTI used inconsistent indices to "f->mt[]". You can see 0 & 1 
> are used for the "if" part but 2 & 3 are used for the "else" part, which I 
> believe is a typo.
> Second, in the previous patch, new macros were introduced to decode X 
> coordinates specific to the SS4 "plus" variant, but the macro to define the 
> maximum X value wasn't changed accordingly. The macros to decode X values for 
> "plus" variant are effectively shifted right by 1 bit, but the max wasn't 
> shifted too. This causes the driver to incorrectly handle "no data" cases, 
> which also interfered with how multi-touch was handled. To fix it - I created 
> new SS4 "plus" macros for the max value - SS4_PLUS_MFPACKET_NO_AX & 
> SS4_PLUS_MFPACKET_NO_AX_BL. To make the change a little more readable, I 
> moved also the Y-max lines so they are closer to the X-max lines.
> To get three-finger tap to work both changes are required.
> 
> The included patch was generated against the mainline tree today, but was 
> also tested against the 4.14 kernel branch. I've included in this e-mail the 
> people involved with the old patch from August, plus Pali Rohár who is listed 
> as the ALPS PS/2 touchpad driver reviewer (in the maintainers file).
> 
> Fixes: 4a646580f793d19717f7e034c8d473b509c27d49 ("Input: ALPS - fix 
> two-finger scroll breakage in right side on ALPS touchpad")
> 
> Regards,
> Nir
> 
> Signed-off-by: Nir Perry
>
> diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
> index 579b899..dbe57da 100644
> --- a/drivers/input/mouse/alps.c
> +++ b/drivers/input/mouse/alps.c
> @@ -1250,29 +1250,32 @@ static int alps_decode_ss4_v2(struct alps_fields *f,
> case SS4_PACKET_ID_MULTI:
> if (priv->flags & ALPS_BUTTONPAD) {
> if (IS_SS4PLUS_DEV(priv->dev_id)) {
> -   f->mt[0].x = SS4_PLUS_BTL_MF_X_V2(p, 0);
> -   f->mt[1].x = SS4_PLUS_BTL_MF_X_V2(p, 1);
> +   f->mt[2].x = SS4_PLUS_BTL_MF_X_V2(p, 0);
> +   f->mt[3].x = SS4_PLUS_BTL_MF_X_V2(p, 1);
> +   no_data_x = SS4_PLUS_MFPACKET_NO_AX_BL;
> } else {
> f->mt[2].x = SS4_BTL_MF_X_V2(p, 0);
> f->mt[3].x = SS4_BTL_MF_X_V2(p, 1);
> +   no_data_x = SS4_MFPACKET_NO_AX_BL;
> }
> +   no_data_y = SS4_MFPACKET_NO_AY_BL;
> 
> f->mt[2].y = SS4_BTL_MF_Y_V2(p, 0);
> f->mt[3].y = SS4_BTL_MF_Y_V2(p, 1);
> -   no_data_x = SS4_MFPACKET_NO_AX_BL;
> -   no_data_y = SS4_MFPACKET_NO_AY_BL;
> } else {
> if (IS_SS4PLUS_DEV(priv->dev_id)) {
> -   f->mt[0].x = SS4_PLUS_STD_MF_X_V2(p, 0);
> -   f->mt[1].x = SS4_PLUS_STD_MF_X_V2(p, 1);
> +   f->mt[2].x = SS4_PLUS_STD_MF_X_V2(p, 0);
> +   f->mt[3].x = SS4_PLUS_STD_MF_X_V2(p, 1);
> +   no_data_x = SS4_PLUS_MFPACKET_NO_AX;
> } else {
> -   f->mt[0].x = SS4_STD_MF_X_V2(p, 0);
> -   f->mt[1].x = SS4_STD_MF_X_V2(p, 1);
> +   f->mt[2].x = SS4_STD_MF_X_V2(p, 0);
> +   f->mt[3].x = SS4_STD_MF_X_V2(p, 1);
> +   no_data_x = SS4_MFPACKET_NO_AX
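
The second point in the report (the "no data" maximum has to be shifted
along with the decoded X value) can be shown with a short sketch. The
sentinel value 8160 and all sketch_* names are assumptions made for
this example; only the one-bit shift comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed "no finger" sentinel for the standard X decoding. */
#define SKETCH_NO_AX       8160
/* "Plus" X values are shifted right by one bit, so the sentinel is too. */
#define SKETCH_PLUS_NO_AX  (SKETCH_NO_AX >> 1)

/* Post-fix check: pick the sentinel that matches the decoding in use. */
static bool sketch_has_data(int x, bool is_plus)
{
    int no_data_x = is_plus ? SKETCH_PLUS_NO_AX : SKETCH_NO_AX;

    assert(x >= 0);
    return x != no_data_x;
}

/* Pre-fix check: the standard sentinel was used for both variants. */
static bool sketch_has_data_prefix(int x)
{
    assert(x >= 0);
    return x != SKETCH_NO_AX;
}
```

Under these assumptions, a "plus" slot reporting 4080 is wrongly counted
as a finger by the pre-fix check and correctly ignored by the post-fix
one.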

RE: [PATCH v2 02/16] remoteproc: add release ops in rproc_mem_entry struct

2018-01-11 Thread Loic PALLARDY


> -Original Message-
> From: Bjorn Andersson [mailto:bjorn.anders...@linaro.org]
> Sent: Thursday, December 14, 2017 1:34 AM
> To: Loic PALLARDY 
> Cc: o...@wizery.com; linux-remotep...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Arnaud POULIQUEN ;
> benjamin.gaign...@linaro.org
> Subject: Re: [PATCH v2 02/16] remoteproc: add release ops in
> rproc_mem_entry struct
> 
> On Thu 30 Nov 08:46 PST 2017, Loic Pallardy wrote:
> 
> > +static int rproc_release_carveout(struct rproc *rproc, struct
> rproc_mem_entry *mem)
> > +{
> > +   struct device *dev = &rproc->dev;
> > +
> > +   /* clean up carveout allocations */
> > +   dma_free_coherent(dev->parent, mem->len, mem->va, mem-
> >dma);
> > +   list_del(&mem->node);
> 
> The core is responsible for putting the node on a list, so let the
> cleanup take it off the list.
ok
> 
> > +   kfree(mem);
> > +   return 0;
> > +}
> > +
> [..]
> > @@ -319,12 +322,11 @@ struct rproc_mem_entry {
> > dma_addr_t dma;
> > int len;
> > u32 da;
> > +   int (*release)(struct rproc *rproc, struct rproc_mem_entry *mem);
> 
> The placement here seems random, please move it last in the struct.
ok
> 
> > void *priv;
> > struct list_head node;
> >  };
> >
> 
> Regards,
> Bjorn


RE: [PATCH v2 01/16] remoteproc: add rproc_va_to_pa function

2018-01-11 Thread Loic PALLARDY

Hi Bjorn,

Thanks for the review of this series.

> -Original Message-
> From: Bjorn Andersson [mailto:bjorn.anders...@linaro.org]
> Sent: Thursday, December 14, 2017 1:31 AM
> To: Loic PALLARDY 
> Cc: o...@wizery.com; linux-remotep...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Arnaud POULIQUEN ;
> benjamin.gaign...@linaro.org
> Subject: Re: [PATCH v2 01/16] remoteproc: add rproc_va_to_pa function
> 
> On Thu 30 Nov 08:46 PST 2017, Loic Pallardy wrote:
> 
> > This new function translates a CPU virtual address into a CPU
> > physical address according to the location of the virtual address.
> >
> > Signed-off-by: Loic Pallardy 
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 13 -
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> > index eab14b4..faa18a7 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -139,6 +139,17 @@ static void rproc_disable_iommu(struct rproc
> *rproc)
> > iommu_domain_free(domain);
> >  }
> >
> > +static phys_addr_t rproc_va_to_pa(void *cpu_addr)
> > +{
> > +   if (is_vmalloc_addr(cpu_addr)) {
> 
> Please add a comment describing when is_vmalloc_addr() would be true.
Yes sure.
Regards,
Loic
> 
> > +   return page_to_phys(vmalloc_to_page(cpu_addr)) +
> > +   offset_in_page(cpu_addr);
> > +   }
> > +
> > +   WARN_ON(!virt_addr_valid(cpu_addr));
> > +   return virt_to_phys(cpu_addr);
> > +}
> > +
> >  /**
> >   * rproc_da_to_va() - lookup the kernel virtual address for a remoteproc
> address
> >   * @rproc: handle of a remote processor
> > @@ -700,7 +711,7 @@ static int rproc_handle_carveout(struct rproc
> *rproc,
> >  * In this case, the device address and the physical address
> >  * are the same.
> >  */
> > -   rsc->pa = dma;
> > +   rsc->pa = (u32)rproc_va_to_pa(va);
> 
> This is more correct than using "dma", so this is good.
> 
> Regards,
> Bjorn


[PATCH] iio: accel: use strlcpy() instead of strncpy()

2018-01-11 Thread Xiongfeng Wang
From: Xiongfeng Wang 

gcc-8 reports

drivers/iio/accel/st_accel_i2c.c: In function 'st_accel_i2c_probe':
./include/linux/string.h:245:9: warning: '__builtin_strncpy' specified
bound 20 equals destination size [-Wstringop-truncation]

The compiler requires that the length of the dest string is greater than
the length we want to copy to make sure the dest string is
nul-terminated. We can just use strlcpy() to avoid this warning.

Signed-off-by: Xiongfeng Wang 
---
 drivers/iio/accel/st_accel_i2c.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/iio/accel/st_accel_i2c.c b/drivers/iio/accel/st_accel_i2c.c
index 363429b..6bdec8c 100644
--- a/drivers/iio/accel/st_accel_i2c.c
+++ b/drivers/iio/accel/st_accel_i2c.c
@@ -159,9 +159,8 @@ static int st_accel_i2c_probe(struct i2c_client *client,
if ((ret < 0) || (ret >= ST_ACCEL_MAX))
return -ENODEV;
 
-   strncpy(client->name, st_accel_id_table[ret].name,
+   strlcpy(client->name, st_accel_id_table[ret].name,
sizeof(client->name));
-   client->name[sizeof(client->name) - 1] = '\0';
} else if (!id)
return -ENODEV;
 
-- 
1.8.3.1
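
The difference the patch above relies on can be sketched in userspace: a copy that always reserves one byte for the terminator, in the style of strlcpy(). This is an illustration under stated assumptions, not the kernel implementation; bounded_copy() and is_terminated() are hypothetical names, and unlike the kernel helper this sketch simply asserts that the buffer size is non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(): copy at most size - 1 bytes and
 * always NUL-terminate. Returns strlen(src), so the caller can detect
 * truncation by comparing the result against size.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);
	size_t n;

	assert(size > 0);	/* sketch only: require room for the NUL */

	n = len < size - 1 ? len : size - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return len;
}

/* Returns 1 if buf[0..size) contains a NUL terminator, 0 otherwise. */
static int is_terminated(const char *buf, size_t size)
{
	return memchr(buf, '\0', size) != NULL;
}
```

Copying "abcdef" into a 4-byte buffer with bounded_copy() leaves "abc" plus the terminator and returns 6, which signals truncation. A strncpy() with the same bound would fill all 4 bytes and leave the buffer unterminated, which is why the original code needed the extra assignment that the patch removes.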



Re: [PATCH] selftests/x86: Add test_vsyscall

2018-01-11 Thread Greg Kroah-Hartman
On Thu, Jan 11, 2018 at 05:16:51PM -0800, Andy Lutomirski wrote:
> This tests that the vsyscall entries do what they're expected to do.
> It also confirms that attempts to read the vsyscall page behave as
> expected.
> 
> If changes are made to the vsyscall code or its memory map handling,
> running this test in all three of vsyscall=none, vsyscall=emulate,
> and vsyscall=native is helpful.
> 
> (Because it's easy, this also compares the vsyscall results to their
>  vDSO equivalents.)
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Andy Lutomirski 
> ---

Tested-by: Greg Kroah-Hartman 
Reviewed-by: Greg Kroah-Hartman 

> Note to KAISER backporters: please test this under all three
> vsyscall modes.  Also, in the emulate and native modes, make sure
> that test_vsyscall_64 agrees with the command line or config
> option as to which mode you're in.  It's quite easy to mess up
> the kernel such that native mode accidentally emulates
> or vice versa.
> 
> Greg, etc: please backport this to all your Meltdown-patched
> kernels.  It'll help make sure the patches didn't regress
> vsyscalls.

I will gladly do so, thanks so much for this test.

greg k-h


Re: [PATCH net-next v4 0/4] net: mvpp2: 1000BaseX and 2500BaseX support

2018-01-11 Thread Antoine Tenart
Hi David,

On Thu, Jan 11, 2018 at 11:32:03AM -0500, David Miller wrote:
> 
> Actually, this introduced build warnings, I'm reverting.  Please fix this
> and repost.

The warning points to a real issue. I'm sorry about that; it seems I
forgot to test this one after the last change. I'll send a new
(tested) version.

Thanks!
Antoine

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Re: [PATCH 10/18] qla2xxx: prevent bounds-check bypass via speculative execution

2018-01-11 Thread Greg KH
On Thu, Jan 11, 2018 at 02:15:12PM -0800, Dan Williams wrote:
> On Sat, Jan 6, 2018 at 1:03 AM, Greg KH  wrote:
> > On Fri, Jan 05, 2018 at 05:10:48PM -0800, Dan Williams wrote:
> >> Static analysis reports that 'handle' may be a user controlled value
> >> that is used as a data dependency to read 'sp' from the
> >> 'req->outstanding_cmds' array.  In order to avoid potential leaks of
> >> kernel memory values, block speculative execution of the instruction
> >> stream that could issue reads based on an invalid value of 'sp'. In this
> >> case 'sp' is directly dereferenced later in the function.
> >
> > I'm pretty sure that 'handle' comes from the hardware, not from
> > userspace, from what I can tell here.  If we want to start auditing
> > __iomem data sources, great!  But that's a bigger task, and one I don't
> > think we are ready to tackle...
> 
> I think it falls in the hygiene bucket of shutting off an array index
> from a source that could be under attacker control. Should we leave
> this one un-patched while we decide if we generally have a problem
> with trusting completion 'tags' from hardware? My vote is patch it for
> now.

Hah, if you are worried about "tags" from hardware, we have a lot more
auditing to do, right?  I don't think anyone has looked into just basic
"bounds checking" for that type of information.  For USB devices we have
_just_ started doing that over the past year, the odds of anyone looking
at PCI devices for this same problem is slim-to-none.

Again, here are my questions/objections right now to this series:
- How can we audit this stuff?
- How did you audit this stuff to find these usages?
- How do you know that this series fixes all of the issues?
- What exact tree/date did you run your audit against?
- How do you know that linux-next does not contain a boatload
  more problems that we need to go back and fix after 4.16-rc1
  is out?
- How can we prevent this type of pattern showing up again?
- How can we audit the data coming from hardware correctly?

I'm all for merging this series, but if anyone thinks that somehow the
whole problem is now "solved" in this area, they are sorely mistaken.
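
For readers following the bounds-check discussion, the general idea of clamping an untrusted index without a branch can be sketched as below. This is not the helper from the series under review; it is a userspace illustration modelled on the generic masking idiom, with hypothetical names index_mask() and clamp_index(). It assumes size is non-zero and fits in a long, and that right-shifting a negative long is an arithmetic shift (true for gcc and clang, but implementation-defined in C). A plain userspace build also gives no guarantee about what the CPU does speculatively; the sketch only shows the arithmetic.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: all-ones when index < size, zero otherwise,
 * computed without a conditional branch.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	/* Sketch precondition: size is non-zero and fits in a long. */
	assert(size > 0 && size <= (unsigned long)LONG_MAX);

	/*
	 * If index >= size, either index or (size - 1 - index) has its top
	 * bit set, so the inverted value is non-negative and shifts to 0.
	 * Otherwise the inverted value is negative and shifts to all ones.
	 */
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Hypothetical helper: an out-of-range index is forced to 0. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
	return index & index_mask(index, size);
}
```

An in-range index passes through unchanged, while any out-of-range value collapses to slot 0, so a later array read stays inside the array. This does not answer the audit questions listed above; it only shows what a per-site fix of this kind computes.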

thanks,

greg k-h


Re: [PATCH 4/5] x86/svm: Direct access to MSR_IA32_SPEC_CTRL

2018-01-11 Thread David Woodhouse
On Thu, 2018-01-11 at 17:32 -0800, Ashok Raj wrote:
> 
> @@ -4910,6 +4935,14 @@ static void svm_vcpu_run(struct kvm_vcpu
> *vcpu)
>  
> clgi();
>  
> +   if (boot_cpu_has(X86_FEATURE_SPEC_CTRL)) {
> +   /*
> +    * FIXME: lockdep_assert_irqs_disabled();
> +    */
> +   WARN_ON_ONCE(!irqs_disabled());
> +   spec_ctrl_set(svm->spec_ctrl);
> +   }
> +
> local_irq_enable();
>  

Same comments here as we've had previously. If you do this without an
'else lfence' then you need a comment showing that you've proved it's
safe.

And I don't think even using static_cpu_has() is good enough. We don't
already "rely" on that for anything but optimisations, AFAICT. Turning
a missed GCC optimisation into a security hole is not a good idea.



Re: [PATCH 5/6] arm64: tegra: Add Tegra194 chip device tree

2018-01-11 Thread Mikko Perttunen

On 11.01.2018 23:56, Rob Herring wrote:

On Mon, Jan 08, 2018 at 06:54:37AM +0200, Mikko Perttunen wrote:

Add the chip-level device tree, including binding headers, for the
NVIDIA Tegra194 "Xavier" system-on-chip. Only a small subset of devices
are initially available, enough to boot to UART console.

Signed-off-by: Mikko Perttunen 
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi   | 334 +
 include/dt-bindings/clock/tegra194-clock.h |  59 +
 include/dt-bindings/gpio/tegra194-gpio.h   |  59 +
 include/dt-bindings/reset/tegra194-reset.h |  40 
 4 files changed, 492 insertions(+)
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra194.dtsi
 create mode 100644 include/dt-bindings/clock/tegra194-clock.h
 create mode 100644 include/dt-bindings/gpio/tegra194-gpio.h
 create mode 100644 include/dt-bindings/reset/tegra194-reset.h

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
new file mode 100644
index ..51eff420816d
--- /dev/null
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -0,0 +1,334 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/ {
+   compatible = "nvidia,tegra194";


Documented?


Ah, wasn't aware these needed to be documented as well. Will add in v2.




+   interrupt-parent = <&gic>;
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   uarta: serial@310 {


These should all be under a bus node. Tegra failed to do this at the
start and we're still copy-n-pasting this mistake.

Then you probably don't need 2 address and size cells for all the
peripherals.


So I should create one big simple-bus node and put everything with an 
address apart from /memory (and maybe /sysram) inside it?





+   compatible = "nvidia,tegra194-uart", "nvidia,tegra20-uart";
+   reg = <0x0 0x0310 0x0 0x40>;
+   reg-shift = <2>;
+   interrupts = ;
+   clocks = <&bpmp TEGRA194_CLK_UARTA>;
+   clock-names = "serial";
+   resets = <&bpmp TEGRA194_RESET_UARTA>;
+   reset-names = "serial";
+   status = "disabled";
+   };
+
+   uartb: serial@311 {
+   compatible = "nvidia,tegra194-uart", "nvidia,tegra20-uart";
+   reg = <0x0 0x0311 0x0 0x40>;
+   reg-shift = <2>;
+   interrupts = ;
+   clocks = <&bpmp TEGRA194_CLK_UARTB>;
+   clock-names = "serial";
+   resets = <&bpmp TEGRA194_RESET_UARTB>;
+   reset-names = "serial";
+   status = "disabled";
+   };
+
+   uartd: serial@313 {
+   compatible = "nvidia,tegra194-uart", "nvidia,tegra20-uart";
+   reg = <0x0 0x0313 0x0 0x40>;
+   reg-shift = <2>;
+   interrupts = ;
+   clocks = <&bpmp TEGRA194_CLK_UARTD>;
+   clock-names = "serial";
+   resets = <&bpmp TEGRA194_RESET_UARTD>;
+   reset-names = "serial";
+   status = "disabled";
+   };
+
+   uarte: serial@314 {
+   compatible = "nvidia,tegra194-uart", "nvidia,tegra20-uart";
+   reg = <0x0 0x0314 0x0 0x40>;
+   reg-shift = <2>;
+   interrupts = ;
+   clocks = <&bpmp TEGRA194_CLK_UARTE>;
+   clock-names = "serial";
+   resets = <&bpmp TEGRA194_RESET_UARTE>;
+   reset-names = "serial";
+   status = "disabled";
+   };
+
+   uartf: serial@315 {
+   compatible = "nvidia,tegra194-uart", "nvidia,tegra20-uart";
+   reg = <0x0 0x0315 0x0 0x40>;
+   reg-shift = <2>;
+   interrupts = ;
+   clocks = <&bpmp TEGRA194_CLK_UARTF>;
+   clock-names = "serial";
+   resets = <&bpmp TEGRA194_RESET_UARTF>;
+   reset-names = "serial";
+   status = "disabled";
+   };
+
+   gen1_i2c: i2c@316 {
+   compatible = "nvidia,tegra194-i2c", "nvidia,tegra114-i2c";
+   reg = <0x0 0x0316 0x0 0x1>;
+   interrupts = ;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   clocks = <&bpmp TEGRA194_CLK_I2C1>;
+   clock-names = "div-clk";
+   resets = <&bpmp TEGRA194_RESET_I2C1>;
+   reset-names = "i2c";
+   status = "disabled";
+   };
+
+   uarth: serial@317 {
+   compatible = "nvidia,tegra194-uart", "nvidia,tegra20-uart";
+   reg = <0x0 0x0317 0x0 0x40>;
+   reg-shift = <2>;
+   interrupts = ;
+   clocks = <&bpmp TEGRA194_CLK_UARTH>;
+   clock-names = "serial";
+   resets = <&bpmp TEGRA194_RESET_UARTH>;
+   reset-names = "serial";
+   status = "disab

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-11 Thread Greg Kroah-Hartman
On Fri, Jan 12, 2018 at 12:03:10AM +0100, Thomas Gleixner wrote:
> On Thu, 11 Jan 2018, Thomas Gleixner wrote:
> > On Thu, 11 Jan 2018, Thomas Gleixner wrote:
> > > On Thu, 11 Jan 2018, Linus Torvalds wrote:
> > > 
> > > > On Thu, Jan 11, 2018 at 12:37 PM, Thomas Gleixner  
> > > > wrote:
> > > > >
> > > > > 67a9108ed431 ("x86/efi: Build our own page table structures")
> > > > >
> > > > > got rid of EFI depending on real_mode_header->trampoline_pgd
> > > > 
> > > > So I think it only got rid of by default - the codepath is still
> > > > there, the allocation is still there, it's just that it's not actually
> > > > used unless somebody does that "efi=old_mmap" thing.
> > > 
> > > Yes, the trampoline_pgd is still around, but I can't figure out how it
> > > would be used after boot. Confused, digging more.
> > 
> > So coming back to the same commit. From the changelog:
> > 
> > This is caused by mapping EFI regions with RWX permissions.
> > There isn't much we can do to restrict the permissions for these
> > regions due to the way the firmware toolchains mix code and
> > data, but we can at least isolate these mappings so that they do
> > not appear in the regular kernel page tables.
> > 
> > In commit d2f7cbe7b26a ("x86/efi: Runtime services virtual
> > mapping") we started using 'trampoline_pgd' to map the EFI
> > regions because there was an existing identity mapping there
> > which we use during the SetVirtualAddressMap() call and for
> > broken firmware that accesses those addresses.
> > 
> > So this very commit gets rid of the (ab)use of trampoline_pgd and allocates
> > efi_pgd, which we made use the proper size.
> > 
> > trampoline_pgd is since then only used to get into long mode in
> > realmode/rm/trampoline_64.S and for reboot in machine_real_restart().
> > 
> > The runtime services stuff does not use it in kernel versions >= 4.6
> 
> But there is one very well hidden user for it after boot:
> 
> It's used for booting secondary CPUs from real mode
> 
> So the transition to long mode for secondaries uses the trampoline pgd for
> long mode transition and then jumping to secondary_startup_64 where CR3 is
> set to the real kernel page tables.

Ok, so the summary is that this patch is only needed for the 4.4 and 4.9
kernels, and _NOT_ for Linus's tree and 4.14, right?

thanks,

greg k-h


Re: [PATCH 2/6] crypto: engine - Permit to enqueue all async requests

2018-01-11 Thread Herbert Xu
On Wed, Jan 03, 2018 at 09:11:05PM +0100, Corentin Labbe wrote:
> The crypto engine can currently only enqueue hash and ablkcipher requests.
> This patch permits it to enqueue any type of crypto_async_request.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  crypto/crypto_engine.c  | 230 
> 
>  include/crypto/engine.h |  59 +++--
>  2 files changed, 148 insertions(+), 141 deletions(-)
> 
> diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c
> index 61e7c4e02fd2..036270b61648 100644
> --- a/crypto/crypto_engine.c
> +++ b/crypto/crypto_engine.c
> @@ -15,7 +15,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include "internal.h"
>  
> @@ -34,11 +33,10 @@ static void crypto_pump_requests(struct crypto_engine 
> *engine,
>bool in_kthread)
>  {
>   struct crypto_async_request *async_req, *backlog;
> - struct ahash_request *hreq;
> - struct ablkcipher_request *breq;
>   unsigned long flags;
>   bool was_busy = false;
> - int ret, rtype;
> + int ret;
> + struct crypto_engine_reqctx *enginectx;

This all looks very good.  Just one minor nit, since you're storing
this in the tfm ctx as opposed to the request ctx (which is indeed
an improvement), you should remove the "req" from its name.

Thanks!
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH] cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin

2018-01-11 Thread Shilpasri G Bhat
Some OpenPOWER boxes can have the same pstate values for nominal and
pmin pstates. On these boxes the current code does not initialize the
'powernv_pstate_info.min' variable, because the else-if chain stops at
the nominal match, and this results in erroneous CPU frequency
reporting. This patch fixes the problem by checking each pstate
independently.

Fixes: 09ca4c9b5958 ("cpufreq: powernv: Replacing pstate_id with frequency table index")
Reported-by: Alvin Wang 
Signed-off-by: Shilpasri G Bhat 
---
 drivers/cpufreq/powernv-cpufreq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c 
b/drivers/cpufreq/powernv-cpufreq.c
index b6d7c4c..da7fdb4 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -288,9 +288,9 @@ static int init_powernv_pstates(void)
 
if (id == pstate_max)
powernv_pstate_info.max = i;
-   else if (id == pstate_nominal)
+   if (id == pstate_nominal)
powernv_pstate_info.nominal = i;
-   else if (id == pstate_min)
+   if (id == pstate_min)
powernv_pstate_info.min = i;
 
if (powernv_pstate_info.wof_enabled && id == pstate_turbo) {
-- 
1.8.3.1
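
The effect of the change above can be reproduced in a small userspace sketch. The structure and function names (struct pstate_idx, find_pstates()) are hypothetical, and -1 is used here as an "unset" marker purely for illustration; the driver itself stores the indices in powernv_pstate_info.

```c
#include <assert.h>

/* Hypothetical result structure: table index of each pstate, -1 if unset. */
struct pstate_idx {
	int max;
	int nominal;
	int min;
};

/*
 * Scan a pstate id table. With chained != 0 the comparisons form an
 * else-if chain (the old behaviour); with chained == 0 each comparison
 * is independent (the behaviour after the patch).
 */
static struct pstate_idx find_pstates(const int *ids, int n, int id_max,
				      int id_nominal, int id_min, int chained)
{
	struct pstate_idx out = { -1, -1, -1 };
	int i;

	assert(n >= 0);

	for (i = 0; i < n; i++) {
		int id = ids[i];

		if (chained) {
			if (id == id_max)
				out.max = i;
			else if (id == id_nominal)
				out.nominal = i;
			else if (id == id_min)
				out.min = i;
		} else {
			if (id == id_max)
				out.max = i;
			if (id == id_nominal)
				out.nominal = i;
			if (id == id_min)
				out.min = i;
		}
	}
	return out;
}
```

With a table of ids {0, -1, -2} where nominal and min are both -2, the chained variant records the nominal index and leaves min unset, while the independent variant sets both to index 2.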



Re: [PATCH v2 1/2] drm/bridge/synopsys: dsi: use common mipi_dsi_create_packet()

2018-01-11 Thread Andrzej Hajda
On 11.01.2018 14:51, Philippe CORNU wrote:
> Hi Brian & All *DSI DRM experts*,
>
> 1) Re-reading this patch, I realize that the returned value of 
> dw_mipi_dsi_host_transfer() is not correct: we should return the number 
> of transferred/received bytes...
>
> so I think there are two solutions: fix this in this series or add a TODO 
> for later (both solutions are fine to me :-)
>
>
> 2) Digging more into the drm code, the function 
> mipi_dsi_device_transfer() in drm_mipi_dsi.c is called in the same file 
> by the 3 following functions: mipi_dsi_shutdown_peripheral(), 
> mipi_dsi_turn_on_peripheral() & 
> mipi_dsi_set_maximum_return_packet_size(). All these 3 functions are 
> expecting "Return: 0 on success or a negative error code on failure." 
> that is not in line with the transfer function.
>
> So then, we can change the documentation in this file and have instead 
> "* Return: The number of bytes transmitted on success or a negative 
> error code on failure." as for mipi_dsi_generic_write()...
> Or we can change the source code of these 3 functions to match with the 
> documentation "Return: 0 on success...".
>
> note: Hopefully, "users" of these 3 functions test the sign of the 
> return value (or do not use it).
>
> Does anyone have a preferred solution?

All three functions perform a single operation which can succeed in only
one way; nobody is interested in the number of bytes sent to achieve the
result. So IMO the result should be 0 or an error.

mipi_dsi_device_transfer() is a different beast: it returns the number
of written/read bytes, which can vary (more specifically, only the number
of read bytes can vary :) ).
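
The convention described above, where a single-operation helper collapses a byte count into 0 while passing errors through, could look roughly like this. It is a userspace sketch only: fake_transfer() and send_command() are hypothetical stand-ins and do not reflect the actual DRM function bodies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a transfer routine: returns the number of
 * bytes sent on success, or a negative errno value on failure.
 */
static long fake_transfer(const void *buf, size_t len)
{
	if (!buf || len == 0)
		return -EINVAL;
	return (long)len;
}

/*
 * Hypothetical single-operation helper: the caller only needs to know
 * whether the command went out, so any non-negative byte count becomes 0.
 */
static int send_command(const void *buf, size_t len)
{
	long ret = fake_transfer(buf, len);

	return ret < 0 ? (int)ret : 0;
}

/* Usage sketch: both outcomes as seen by a caller. */
static void demo(void)
{
	const unsigned char cmd[] = { 0x22, 0x00 };

	assert(send_command(cmd, sizeof(cmd)) == 0);
	assert(send_command(NULL, sizeof(cmd)) == -EINVAL);
}
```

Callers that only test the sign of the result keep working either way, but with this shape a check written as "if (ret)" is also correct.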

Regards
Andrzej


>
> Many thanks
> Philippe :-)
>
> On 01/09/2018 09:32 PM, Brian Norris wrote:
>> This takes care of 2 TODOs in this driver, by using the common DSI
>> packet-marshalling code instead of our custom short/long write code.
>> This both saves us some duplicated code and gets us free support for
>> command types that weren't already part of our switch block (e.g.,
>> MIPI_DSI_GENERIC_LONG_WRITE).
>>
>> The code logic stays mostly intact, except that it becomes unnecessary
>> to split the short/long write functions, and we have to copy data a bit
>> more.
>>
>> Along the way, I noticed that loop bounds were a little odd:
>>
>>  while (DIV_ROUND_UP(len, pld_data_bytes))
>>
>> This really was just supposed to be 'len != 0', so I made that more
>> clear.
>>
>> Tested on RK3399 with some pending refactoring patches by Nickey Yang,
>> to make the Rockchip DSI driver wrap this common driver.
>>
>> Signed-off-by: Brian Norris 
>> Reviewed-by: Philippe Cornu 
>> Tested-by: Philippe Cornu 
>> ---
>> v2:
>>   * remove "dcs" naming, since these commands handle generic DSI too, not
>> just DCS (thanks Philippe)
>>   * add Philippe's {Tested,Reviewed}-by
>> ---
>>   drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 78 
>> ++-
>>   1 file changed, 16 insertions(+), 62 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c 
>> b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> index d9cca4fd66ec..ed91e32ee43a 100644
>> --- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> +++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> @@ -136,10 +136,6 @@
>>   GEN_SW_0P_TX_LP)
>>   
>>   #define DSI_GEN_HDR0x6c
>> -/* TODO These 2 defines will be reworked thanks to mipi_dsi_create_packet() 
>> */
>> -#define GEN_HDATA(data) (((data) & 0x) << 8)
>> -#define GEN_HTYPE(type) (((type) & 0xff) << 0)
>> -
>>   #define DSI_GEN_PLD_DATA   0x70
>>   
>>   #define DSI_CMD_PKT_STATUS 0x74
>> @@ -359,44 +355,15 @@ static int dw_mipi_dsi_gen_pkt_hdr_write(struct 
>> dw_mipi_dsi *dsi, u32 hdr_val)
>>  return 0;
>>   }
>>   
>> -static int dw_mipi_dsi_dcs_short_write(struct dw_mipi_dsi *dsi,
>> -   const struct mipi_dsi_msg *msg)
>> -{
>> -const u8 *tx_buf = msg->tx_buf;
>> -u16 data = 0;
>> -u32 val;
>> -
>> -if (msg->tx_len > 0)
>> -data |= tx_buf[0];
>> -if (msg->tx_len > 1)
>> -data |= tx_buf[1] << 8;
>> -
>> -if (msg->tx_len > 2) {
>> -dev_err(dsi->dev, "too long tx buf length %zu for short 
>> write\n",
>> -msg->tx_len);
>> -return -EINVAL;
>> -}
>> -
>> -val = GEN_HDATA(data) | GEN_HTYPE(msg->type);
>> -return dw_mipi_dsi_gen_pkt_hdr_write(dsi, val);
>> -}
>> -
>> -static int dw_mipi_dsi_dcs_long_write(struct dw_mipi_dsi *dsi,
>> -  const struct mipi_dsi_msg *msg)
>> +static int dw_mipi_dsi_write(struct dw_mipi_dsi *dsi,
>> + const struct mipi_dsi_packet *packet)
>>   {
>> -const u8 *tx_buf = msg->tx_buf;
>> -int len = msg->tx_len, pld_data_bytes = sizeof(u32), ret;
>> -u32 hdr_val = GEN_HDATA(msg->tx_len) | GEN_HTYPE(msg->type);
>> 

[PATCH v2] arm64: allwinner: a64: a64-olinuxino: add usb otg

2018-01-11 Thread Jagan Teki
Add usb otg support for a64-olinuxino board,
- USB0-ID connected with PH9
- USB0-VBUSDET connected with PH6
- USB-DRVVBUS controlled by N_VBUSEN pin from PMIC

Signed-off-by: Jagan Teki 
---
Changes for v2:
- rebase on master
- tested otg host mode.

 .../boot/dts/allwinner/sun50i-a64-olinuxino.dts| 26 ++
 1 file changed, 26 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-olinuxino.dts 
b/arch/arm64/boot/dts/allwinner/sun50i-a64-olinuxino.dts
index 8807664..078ee94 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-olinuxino.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-olinuxino.dts
@@ -64,6 +64,10 @@
};
 };
 
+&ehci0 {
+   status = "okay";
+};
+
 &mmc0 {
pinctrl-names = "default";
pinctrl-0 = <&mmc0_pins>;
@@ -93,6 +97,10 @@
};
 };
 
+&ohci0 {
+   status = "okay";
+};
+
 &r_rsb {
status = "okay";
 
@@ -101,6 +109,7 @@
reg = <0x3a3>;
interrupt-parent = <&r_intc>;
interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
+   x-powers,drive-vbus-en; /* set N_VBUSEN as output pin */
};
 };
 
@@ -215,8 +224,25 @@
regulator-name = "vcc-rtc";
 };
 
+&reg_drivevbus {
+   regulator-name = "usb0-vbus";
+   status = "okay";
+};
+
 &uart0 {
pinctrl-names = "default";
pinctrl-0 = <&uart0_pins_a>;
status = "okay";
 };
+
+&usb_otg {
+   dr_mode = "otg";
+   status = "okay";
+};
+
+&usbphy {
+   usb0_id_det-gpios = <&pio 7 9 GPIO_ACTIVE_HIGH>; /* PH9 */
+   usb0_vbus_det-gpio = <&pio 7 6 GPIO_ACTIVE_HIGH>; /* PH6 */
+   usb0_vbus-supply = <&reg_drivevbus>;
+   status = "okay";
+};
-- 
2.7.4



Re: [PATCH 1/2] crypto: Implement a generic crypto statistics

2018-01-11 Thread Stephan Mueller
Am Donnerstag, 11. Januar 2018, 20:56:56 CET schrieb Corentin Labbe:

Hi Corentin,

> This patch implements a generic way to get statistics about all crypto
> usages.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  crypto/Kconfig  | 11 
>  crypto/ablkcipher.c |  9 +++
>  crypto/acompress.c  |  9 +++
>  crypto/aead.c   | 10 
>  crypto/ahash.c  |  8 ++
>  crypto/akcipher.c   | 13 ++
>  crypto/algapi.c |  6 +
>  crypto/blkcipher.c  |  9 +++
>  crypto/crypto_user.c| 28 +
>  crypto/kpp.c|  7 ++
>  crypto/rng.c|  8 ++
>  crypto/scompress.c  |  9 +++
>  crypto/shash.c  |  5 
>  crypto/skcipher.c   |  9 +++
>  include/crypto/acompress.h  | 22 
>  include/crypto/aead.h   | 22 
>  include/crypto/akcipher.h   | 42 +++
>  include/crypto/hash.h   | 21 
>  include/crypto/kpp.h| 28 +
>  include/crypto/rng.h| 17 +
>  include/crypto/skcipher.h   | 22 
>  include/linux/crypto.h  | 56
> + include/uapi/linux/cryptouser.h |
> 34 +
>  23 files changed, 405 insertions(+)
> 
> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index 971d558494c3..3b88fba14b59 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -1780,6 +1780,17 @@ config CRYPTO_USER_API_AEAD
> This option enables the user-spaces interface for AEAD
> cipher algorithms.
> 
> +config CRYPTO_STATS
> + bool "Crypto usage statistics for User-space"
> + help
> +   This option enables the gathering of crypto stats.
> +   This will collect:
> +   - encrypt/decrypt size and numbers of symmeric operations
> +   - compress/decompress size and numbers of compress operations
> +   - size and numbers of hash operations
> +   - encrypt/decrypt/sign/verify numbers for asymmetric operations
> +   - generate/seed numbers for rng operations
> +
>  config CRYPTO_HASH_INFO
>   bool
> 
> diff --git a/crypto/ablkcipher.c b/crypto/ablkcipher.c
> index d880a4897159..f6d20e4ca977 100644
> --- a/crypto/ablkcipher.c
> +++ b/crypto/ablkcipher.c
> @@ -369,6 +369,7 @@ static int crypto_init_ablkcipher_ops(struct crypto_tfm
> *tfm, u32 type, static int crypto_ablkcipher_report(struct sk_buff *skb,
> struct crypto_alg *alg) {
>   struct crypto_report_blkcipher rblkcipher;
> + u64 v;
> 
>   strncpy(rblkcipher.type, "ablkcipher", sizeof(rblkcipher.type));
>   strncpy(rblkcipher.geniv, alg->cra_ablkcipher.geniv ?: "",
> @@ -378,6 +379,14 @@ static int crypto_ablkcipher_report(struct sk_buff
> *skb, struct crypto_alg *alg) rblkcipher.min_keysize =
> alg->cra_ablkcipher.min_keysize;
>   rblkcipher.max_keysize = alg->cra_ablkcipher.max_keysize;
>   rblkcipher.ivsize = alg->cra_ablkcipher.ivsize;
> + v = atomic_read(&alg->encrypt_cnt);
> + rblkcipher.stat_encrypt_cnt = v;
> + v = atomic_read(&alg->encrypt_tlen);
> + rblkcipher.stat_encrypt_tlen = v;
> + v = atomic_read(&alg->decrypt_cnt);
> + rblkcipher.stat_decrypt_cnt = v;
> + v = atomic_read(&alg->decrypt_tlen);
> + rblkcipher.stat_decrypt_tlen = v;
> 
>   if (nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER,
>   sizeof(struct crypto_report_blkcipher), &rblkcipher))
> diff --git a/crypto/acompress.c b/crypto/acompress.c
> index 1544b7c057fb..524c8a3e3f80 100644
> --- a/crypto/acompress.c
> +++ b/crypto/acompress.c
> @@ -32,8 +32,17 @@ static const struct crypto_type crypto_acomp_type;
>  static int crypto_acomp_report(struct sk_buff *skb, struct crypto_alg *alg)
> {
>   struct crypto_report_acomp racomp;
> + u64 v;
> 
>   strncpy(racomp.type, "acomp", sizeof(racomp.type));
> + v = atomic_read(&alg->compress_cnt);
> + racomp.stat_compress_cnt = v;
> + v = atomic_read(&alg->compress_tlen);
> + racomp.stat_compress_tlen = v;
> + v = atomic_read(&alg->decompress_cnt);
> + racomp.stat_decompress_cnt = v;
> + v = atomic_read(&alg->decompress_tlen);
> + racomp.stat_decompress_tlen = v;
> 
>   if (nla_put(skb, CRYPTOCFGA_REPORT_ACOMP,
>   sizeof(struct crypto_report_acomp), &racomp))
> diff --git a/crypto/aead.c b/crypto/aead.c
> index fe00cbd7243d..de13bd345d8b 100644
> --- a/crypto/aead.c
> +++ b/crypto/aead.c
> @@ -109,6 +109,7 @@ static int crypto_aead_report(struct sk_buff *skb,
> struct crypto_alg *alg) {
>   struct crypto_report_aead raead;
>   struct aead_alg *aead = container_of(alg, struct aead_alg, base);
> + u64 v;
> 
>   strncpy(raead.type, "aead", sizeof(raead.type));
>   strncpy(raead.geniv, "", sizeof(raead.geniv));

Re: [PATCH v4 1/9] dt-bindings: media: Add Renesas CEU bindings

2018-01-11 Thread Simon Horman
On Fri, Jan 12, 2018 at 12:50:41AM +0200, Laurent Pinchart wrote:
> Hi Jacopo,
> 
> Thank you for the patch.
> 
> On Tuesday, 9 January 2018 18:25:23 EET Jacopo Mondi wrote:
> > Add bindings documentation for Renesas Capture Engine Unit (CEU).
> > 
> > Signed-off-by: Jacopo Mondi 
> 
> Reviewed-by: Laurent Pinchart 

Hi,

I see that these bindings have now been reviewed. What is their (likely)
path to upstream from here? I'd like to accept the related DTS changes
once there is a clear path for the bindings to land in upstream.


[PATCH] kconfig: Warn if there is more than one help text

2018-01-11 Thread Ulf Magnusson
Avoids mistakes like in the following real-world example, where only the
final help string ("Say Y...") was used. This particular example was
fixed in commit 561b29e4ec8d ("media: fix media Kconfig help syntax
issues").

  config DVB_NETUP_UNIDVB
...
select DVB_CXD2841ER if MEDIA_SUBDRV_AUTOSELECT
---help---
  Support for NetUP PCI express Universal DVB card.
   help
Say Y when you want to support NetUP Dual Universal DVB card
...

This now prints the following warning:

  drivers/media/pci/netup_unidvb:13: warning: 'DVB_NETUP_UNIDVB' defined with more than one help text -- only the last one will be used

Also free() any extra help strings.

Signed-off-by: Ulf Magnusson 
---
 scripts/kconfig/zconf.y | 5 +
 1 file changed, 5 insertions(+)

diff --git a/scripts/kconfig/zconf.y b/scripts/kconfig/zconf.y
index c1e4e82f56b5..06ef304ee325 100644
--- a/scripts/kconfig/zconf.y
+++ b/scripts/kconfig/zconf.y
@@ -435,6 +435,11 @@ help_start: T_HELP T_EOL
 
 help: help_start T_HELPTEXT
 {
+   if (current_entry->help) {
+   free(current_entry->help);
+   zconfprint("warning: '%s' defined with more than one help text 
-- only the last one will be used",
+  current_entry->sym->name ?: "");
+   }
current_entry->help = $2;
 };
 
-- 
2.14.1
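
The rule the patch above adds to the parser (free the earlier help string, warn, then keep the last one) can be sketched outside the grammar file as follows. The names struct entry, set_help() and dup_text() are hypothetical, and the "(unnamed)" fallback string is an assumption made for this sketch rather than what the parser prints.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much reduced model of a menu entry. */
struct entry {
	const char *name;	/* symbol name, may be NULL */
	char *help;		/* heap-allocated help text, or NULL */
};

/*
 * Install a new help text. If one was already present, release it and
 * warn that only the last one is used. Returns 1 if a text was replaced.
 */
static int set_help(struct entry *e, char *text)
{
	int replaced = 0;

	assert(e != NULL);

	if (e->help) {
		free(e->help);
		fprintf(stderr,
			"warning: '%s' defined with more than one help text -- only the last one will be used\n",
			e->name ? e->name : "(unnamed)");
		replaced = 1;
	}
	e->help = text;
	return replaced;
}

/* Small helper so the sketch does not rely on the POSIX strdup(). */
static char *dup_text(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}
```

Setting a help text twice on the same entry returns 0 the first time and 1 the second time, and the entry ends up holding only the second string, with the first one freed rather than leaked.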



[PATCH] Input: trackpoint - force 3 buttons if 0 button is reported

2018-01-11 Thread Aaron Ma
Lenovo introduced trackpoint-compatible sticks that support only a
minimal set of PS/2 commands. Some of these sticks have 3 buttons but
always return 0 when the extended button info is read; treat that as
3 buttons so that the middle button is enabled.

Cc: sta...@vger.kernel.org
Signed-off-by: Aaron Ma 
---
 drivers/input/mouse/trackpoint.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/input/mouse/trackpoint.c b/drivers/input/mouse/trackpoint.c
index 0871010f18d5..00c0d1706567 100644
--- a/drivers/input/mouse/trackpoint.c
+++ b/drivers/input/mouse/trackpoint.c
@@ -383,6 +383,10 @@ int trackpoint_detect(struct psmouse *psmouse, bool 
set_properties)
if (trackpoint_read(ps2dev, TP_EXT_BTN, &button_info)) {
psmouse_warn(psmouse, "failed to get extended button data, 
assuming 3 buttons\n");
button_info = 0x33;
+   } else if (!button_info) {
+   psmouse_warn(psmouse,
+   "got no extended button data, assuming 3 buttons\n");
+   button_info = 0x33;
}
 
psmouse->private = kzalloc(sizeof(struct trackpoint_data), GFP_KERNEL);
-- 
2.14.3



Crypto Fixes for 4.15

2018-01-11 Thread Herbert Xu
Hi Linus: 

This push fixes a NULL pointer dereference in crypto_remove_spawns
that can be triggered through af_alg.


Please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git linus


Eric Biggers (1):
  crypto: algapi - fix NULL dereference in crypto_remove_spawns()

 crypto/algapi.c |   12 
 1 file changed, 12 insertions(+)

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [x86-tip] RSDP changes converted i4790 box SMP -> UP

2018-01-11 Thread Juergen Gross
On 12/01/18 05:25, Mike Galbraith wrote:
> Hi Juergen,
> 
> Yesterday I wanted to test the RETPOLINE stuff in tip and tip-rt, but
> discovered instead that my box had turned into a complete slug, not due
> to incredible RETPOLINE overhead, rather because box had forgotten that
> it had more than one CPU.  I was going to leave it for the weekend, but
> firing up gitk over morning java, I noticed the commits below, and sure
> enough, that's what broke my box.  Given other people's boxen work,
> seems likely that the authors of the AMI BIOS in this box were a bit
> more creative than usual.

So I'm curious how this should be possible.

Some questions:

- which bootloader are you using?
- what does /sys/kernel/boot_params/version contain?
- can you print the returned value of acpi_arch_get_root_pointer()
  in acpi_os_get_root_pointer() with the patches applied and report
  it, please?

Juergen


[PATCH v4 3/3] arm64: allwinner: a64: bananapi-m64: add usb otg

2018-01-11 Thread Jagan Teki
Add usb otg support for bananapi-m64 board,
- USB-ID connected with PH9
- USB-DRVVBUS controlled by N_VBUSEN pin from PMIC

Signed-off-by: Jagan Teki 
---
Changes for v4:
- rebase on master
- tested otg host mode.
Changes for v3:
- Move the position of reg_drivevbus as per binding documentation.
Changes for v2:
- add drvvbus regulator
- add N_VBUSEN pin

 .../boot/dts/allwinner/sun50i-a64-bananapi-m64.dts  | 21 +
 1 file changed, 21 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts 
b/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
index a697567..26e8534 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
@@ -86,6 +86,10 @@
};
 };
 
+&ehci0 {
+   status = "okay";
+};
+
 &ehci1 {
status = "okay";
 };
@@ -156,6 +160,10 @@
status = "okay";
 };
 
+&ohci0 {
+   status = "okay";
+};
+
 &ohci1 {
status = "okay";
 };
@@ -168,6 +176,7 @@
reg = <0x3a3>;
interrupt-parent = <&r_intc>;
interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
+   x-powers,drive-vbus-en; /* set N_VBUSEN as output pin */
};
 };
 
@@ -283,6 +292,11 @@
regulator-name = "vcc-rtc";
 };
 
+&reg_drivevbus {
+   regulator-name = "usb0-vbus";
+   status = "okay";
+};
+
 &uart0 {
pinctrl-names = "default";
pinctrl-0 = <&uart0_pins_a>;
@@ -295,6 +309,13 @@
status = "okay";
 };
 
+&usb_otg {
+   dr_mode = "otg";
+   status = "okay";
+};
+
 &usbphy {
+   usb0_id_det-gpios = <&pio 7 9 GPIO_ACTIVE_HIGH>; /* PH9 */
+   usb0_vbus-supply = <&reg_drivevbus>;
status = "okay";
 };
-- 
2.7.4



[PATCH v4 2/3] arm64: allwinner: axp803: Add drivevbus regulator

2018-01-11 Thread Jagan Teki
Add the reg_drivevbus regulator for boards which use an
external regulator to drive the OTG VBus through the N_VBUSEN
PMIC pin.

Signed-off-by: Jagan Teki 
---
Changes for v4:
- rebase on master
Changes for v3:
- none

 arch/arm64/boot/dts/allwinner/axp803.dtsi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/axp803.dtsi b/arch/arm64/boot/dts/allwinner/axp803.dtsi
index ff8af52..e5eae8b 100644
--- a/arch/arm64/boot/dts/allwinner/axp803.dtsi
+++ b/arch/arm64/boot/dts/allwinner/axp803.dtsi
@@ -146,5 +146,10 @@
regulator-max-microvolt = <300>;
regulator-name = "rtc-ldo";
};
+
+   reg_drivevbus: drivevbus {
+   regulator-name = "drivevbus";
+   status = "disabled";
+   };
};
 };
-- 
2.7.4



[PATCH v4 1/3] regulator: axp20x: add drivevbus support for axp803

2018-01-11 Thread Jagan Teki
Like the axp221, axp223 and axp813, the axp803 also supports an external
regulator to drive the OTG VBus through the N_VBUSEN PMIC pin.

Add support for it.

Signed-off-by: Jagan Teki 
Reviewed-by: Rob Herring 
---
Changes for v4:
- rebase on master
Changes for v3:
- Update drivevbus in table of regulators

 Documentation/devicetree/bindings/mfd/axp20x.txt | 3 ++-
 drivers/regulator/axp20x-regulator.c | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/mfd/axp20x.txt b/Documentation/devicetree/bindings/mfd/axp20x.txt
index 9455503..d1762f3 100644
--- a/Documentation/devicetree/bindings/mfd/axp20x.txt
+++ b/Documentation/devicetree/bindings/mfd/axp20x.txt
@@ -43,7 +43,7 @@ Optional properties:
  regulator to drive the OTG VBus, rather then
  as an input pin which signals whether the
  board is driving OTG VBus or not.
- (axp221 / axp223 / axp813 only)
+ (axp221 / axp223 / axp803/ axp813 only)
 
 - x-powers,master-mode: Boolean (axp806 only). Set this when the PMIC is
wired for master mode. The default is slave mode.
@@ -132,6 +132,7 @@ FLDO2   : LDO   : fldoin-supply : shared supply
 LDO_IO0: LDO   : ips-supply: GPIO 0
 LDO_IO1: LDO   : ips-supply: GPIO 1
 RTC_LDO: LDO   : ips-supply: always on
+DRIVEVBUS  : Enable output : drivevbus-supply  : external regulator
 
 AXP806 regulators, type, and corresponding input supply names:
 
diff --git a/drivers/regulator/axp20x-regulator.c b/drivers/regulator/axp20x-regulator.c
index 181622b..91b8ff8 100644
--- a/drivers/regulator/axp20x-regulator.c
+++ b/drivers/regulator/axp20x-regulator.c
@@ -721,6 +721,8 @@ static int axp20x_regulator_probe(struct platform_device *pdev)
case AXP803_ID:
regulators = axp803_regulators;
nregulators = AXP803_REG_ID_MAX;
+   drivevbus = of_property_read_bool(pdev->dev.parent->of_node,
+ "x-powers,drive-vbus-en");
break;
case AXP806_ID:
regulators = axp806_regulators;
-- 
2.7.4



Re: [RESEND PATCH 0/3] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel

2018-01-11 Thread Baoquan He
On 01/11/18 at 01:05pm, Eric W. Biederman wrote:
> Baoquan He  writes:
> 
> > Hi all,
> >
> > PING!
> >
> > (Add Fenghua and Eric to this thread)
> >
> > On 01/05/18 at 11:42am, Baoquan He wrote:
> >> On a KVM guest, the latest kernel always prints the warning below while the
> >> kdump kernel boots. The reason is that legacy irq mode is disabled before
> >> the jump to the kexec/kdump kernel. So in setup_local_APIC(), the
> >> do { xxx } while (queued && max_loops > 0) loop can't handle a pending irq
> >> in APIC IRR, since the LAPIC is disabled. The loop only terminates once
> >> max_loops has dropped to zero or below. Then
> >> WARN_ON(max_loops <= 0) is triggered.
> 
> Overall this looks like the code in setup_local_APIC is working largely
> as designed.  It does run into a snag so it warns.
> 
> Which leaves the question:  Does QEMU have buggy APIC emulation in this
> case, or is that loop simply incapable of dealing with queued interrupts
> in APIC_IRR?

Thanks a lot for looking into this, Eric!

Yes, as you said, setup_local_APIC() is working well. It assumes the
current apic can handle the queued interrupts in APIC_IRR. However,
the current native_machine_crash_shutdown() calls
lapic_shutdown(), which invokes disable_local_APIC() to disable the APIC
completely with the code below. Then when the kdump kernel comes into
setup_local_APIC(), the queued interrupts in APIC_IRR cannot be handled
at all.

void disable_local_APIC(void)
{
	..

	/*
	 * Disable APIC (implies clearing of registers
	 * for 82489DX!).
	 */
	value = apic_read(APIC_SPIV);
	value &= ~APIC_SPIV_APIC_ENABLED;
	apic_write(APIC_SPIV, value);
}

With legacy irq mode enabled before the jump to the kdump kernel,
setup_local_APIC() can handle it well.

So if we decide to disable legacy mode before the jump to the kdump kernel, we
need to remove the do { xxx } while (queued && max_loops > 0) code block
in setup_local_APIC(), and also change disable_IO_APIC(), since it is
doing things that do not match its name. Then just leave those pending irqs
until the final apic mode is set up.
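
To make the failure mode concrete, here is a minimal user-space sketch of the drain loop. It assumes a simplified model in which a disabled APIC can never clear a pending IRR bit. The names fake_apic and drain_irr are invented for illustration only and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the local APIC state; not a kernel structure. */
struct fake_apic {
	bool enabled;      /* models APIC_SPIV_APIC_ENABLED */
	unsigned int irr;  /* models the pending bits in APIC_IRR */
};

/*
 * Models the shape of the do { } while (queued && max_loops > 0) loop.
 * Returns true when the loop gave up with interrupts still queued,
 * which corresponds to the case that trips WARN_ON(max_loops <= 0).
 */
static bool drain_irr(struct fake_apic *apic, long max_loops)
{
	bool queued;

	assert(apic != NULL);

	do {
		queued = apic->irr != 0;
		/* Only an enabled APIC can deliver and ack a pending irq. */
		if (queued && apic->enabled)
			apic->irr &= apic->irr - 1; /* clear lowest pending bit */
		max_loops--;
	} while (queued && max_loops > 0);

	return apic->irr != 0;
}
```

With the APIC enabled the loop drains and exits early; with it disabled the loop burns through max_loops and the pending bit is still there at the end.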

> 
> >> 
> >> [0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 
> >> setup_local_APIC+0x228/0x330
> >> [0.001000] Modules linked in:
> >> [0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
> >> [0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> >> 1.10.2-1.fc26 04/01/2014
> >> [0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
> >> [0.001000] RSP: :b6e03eb8 EFLAGS: 00010286
> >> [0.001000] RAX: 009edb4c4d84 RBX:  RCX: 
> >> b099d800
> >> [0.001000] RDX: 009e RSI:  RDI: 
> >> 0810
> >> [0.001000] RBP:  R08:  R09: 
> >> 0001
> >> [0.001000] R10: 98ce6a801c00 R11: 0761076d072f0776 R12: 
> >> 0001
> >> [0.001000] R13: 00f0 R14: 4000 R15: 
> >> c6ff
> >> [0.001000] FS:  () GS:98ce6bc0() 
> >> knlGS:
> >> [0.001000] CS:  0010 DS:  ES:  CR0: 80050033
> >> [0.001000] CR2:  CR3: 22209000 CR4: 
> >> 000406b0
> >> [0.001000] Call Trace:
> >> [0.001000]  apic_bsp_setup+0x56/0x74
> >> [0.001000]  x86_late_time_init+0x11/0x16
> >> [0.001000]  start_kernel+0x3c9/0x486
> >> [0.001000]  secondary_startup_64+0xa5/0xb0
> >> [0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 
> >> 03 7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff ff 
> >> <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41 
> >> [0.001000] ---[ end trace b88e71b9a6ebebdd ]---
> >> [0.001000] masked ExtINT on CPU#0
> >> 
> >> With patch 2/3 applied, the above warning disappears. Patch 2/3 also fixes
> >> the issue mentioned in patch 1/3, because the LAPIC has been set to ExtINT
> >> before the jump to the kdump kernel, although we had better set it
> >> explicitly. There seems to be no reason not to enable legacy irq mode before
> >> the jump to the kexec/kdump kernel, and doing so keeps it consistent with
> >> the normal kernel.
> >> 
> >> Patch 3/3 is doing clean up, I am fine if people think it's unnecessary.

Re: [RFC PATCH 2/2] softirq: Per vector thread deferment

2018-01-11 Thread Frederic Weisbecker
On Fri, Jan 12, 2018 at 06:35:54AM +0100, Frederic Weisbecker wrote:
> Some softirq vectors can be more CPU hungry than others. Especially
> networking may sometimes deal with packet storm and need more CPU than
> IRQ tail can offer without inducing scheduler latencies. In this case
> the current code defers to ksoftirqd that behaves nicer. Now this nice
> behaviour can be bad for other IRQ vectors that usually need quick
> processing.
> 
> To solve this we only defer to threading the vectors that outreached the
> time limit on IRQ tail processing and leave the others inline on real
> Soft-IRQs service. This is achieved using workqueues with
> per-CPU/per-vector worklets.
> 
> Note ksoftirqd is not removed as it is still needed for threaded IRQs
> mode.
> 
> Suggested-by: Linus Torvalds 
> Signed-off-by: Frederic Weisbecker 
> Cc: Dmitry Safonov 
> Cc: Eric Dumazet 
> Cc: Linus Torvalds 
> Cc: Peter Zijlstra 
> Cc: Andrew Morton 
> Cc: David Miller 
> Cc: Hannes Frederic Sowa 
> Cc: Ingo Molnar 
> Cc: Levin Alexander 
> Cc: Paolo Abeni 
> Cc: Paul E. McKenney 
> Cc: Radu Rendec 
> Cc: Rik van Riel 
> Cc: Stanislaw Gruszka 
> Cc: Thomas Gleixner 
> Cc: Wanpeng Li 
> ---
>  kernel/softirq.c | 90 
> ++--
>  1 file changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index fa267f7..0c817ec6 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -74,6 +74,13 @@ struct softirq_stat {
>  
>  static DEFINE_PER_CPU(struct softirq_stat, softirq_stat_cpu);
>  
> +struct vector_work {
> + int vec;
> + struct work_struct work;
> +};
> +
> +static DEFINE_PER_CPU(struct vector_work[NR_SOFTIRQS], vector_work_cpu);
> +
>  /*
>   * we cannot loop indefinitely here to avoid userspace starvation,
>   * but we also don't want to introduce a worst case 1/HZ latency
> @@ -251,6 +258,70 @@ static inline bool lockdep_softirq_start(void) { return 
> false; }
>  static inline void lockdep_softirq_end(bool in_hardirq) { }
>  #endif
>  
> +static void vector_work_func(struct work_struct *work)
> +{
> + struct vector_work *vector_work;
> + u32 pending;
> + int vec;
> +
> + vector_work = container_of(work, struct vector_work, work);
> + vec = vector_work->vec;
> +
> + local_irq_disable();
> + pending = local_softirq_pending();
> + account_irq_enter_time(current);
> + __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
> + lockdep_softirq_enter();
> + set_softirq_pending(pending & ~(1 << vec));
> + local_irq_enable();
> +
> + if (pending & (1 << vec)) {

Ah I see the problem. Say in do_softirq() we had VECTOR 1 and 2 pending,
and only VECTOR 1 overran, so VECTOR 1 is enqueued to the workqueue.
Right after that we go back to the restart loop in do_softirq in order to
handle pending VECTOR 2, but we erase the local_softirq_pending() state. So
when the workqueue runs, it no longer sees VECTOR 1 pending and we lose
it.

So I need to remove the above condition and make the vector work
unconditionally execute the vector callback.
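
A minimal user-space sketch of the lost-vector case, assuming a plain bitmask stands in for the per-CPU pending word. pending_mask, work_checked and work_unconditional are hypothetical names used only to illustrate the two behaviours; they are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU softirq pending word. */
static uint32_t pending_mask;

/* Checked variant: only runs the vector if its pending bit is still set. */
static bool work_checked(int vec)
{
	uint32_t pending = pending_mask;

	assert(vec >= 0 && vec < 32);
	pending_mask = pending & ~(UINT32_C(1) << vec);
	return (pending & (UINT32_C(1) << vec)) != 0; /* true: callback ran */
}

/* Unconditional variant: the queued work always runs its vector callback. */
static bool work_unconditional(int vec)
{
	assert(vec >= 0 && vec < 32);
	pending_mask &= ~(UINT32_C(1) << vec);
	return true;
}

/* Models the restart loop wiping the pending word after a vector was queued. */
static void restart_loop_resets_pending(void)
{
	pending_mask = 0; /* corresponds to set_softirq_pending(0) */
}
```

In this model, once the restart loop has cleared the word, the checked variant skips the callback for the vector that was handed to the workqueue, while the unconditional variant still runs it.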

Now I can go to sleep...


> + struct softirq_action *sa = &softirq_vec[vec];
> +
> + kstat_incr_softirqs_this_cpu(vec);
> + trace_softirq_entry(vec);
> + sa->action(sa);
> + trace_softirq_exit(vec);
> + }


Re: [PATCH v4 10/16] phy: qcom-qmp: Move register offsets to header file

2018-01-11 Thread Vivek Gautam
Hi Manu,

On Wed, Jan 3, 2018 at 4:58 PM, Manu Gautam  wrote:
> New revision (v3) of QMP PHY uses different offsets
> for almost all of the registers. Hence, move these
> definitions to header file so that updated offsets
> can be added for QMP v3.
>
> Signed-off-by: Manu Gautam 
> ---
>  drivers/phy/qualcomm/phy-qcom-qmp.c | 119 +--
>  drivers/phy/qualcomm/phy-qcom-qmp.h | 137 
> 
>  2 files changed, 138 insertions(+), 118 deletions(-)
>  create mode 100644 drivers/phy/qualcomm/phy-qcom-qmp.h
>

[snip]

> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.h 
> b/drivers/phy/qualcomm/phy-qcom-qmp.h
> new file mode 100644
> index 000..d930ca7
> --- /dev/null
> +++ b/drivers/phy/qualcomm/phy-qcom-qmp.h
> @@ -0,0 +1,137 @@
> +/*
> + * Copyright (c) 2017, Linux Foundation. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */

nit: how about an "SPDX-License-Identifier" tag now? That takes fewer lines too :)
And while you are doing that, can you please consider moving
phy-qcom-qmp and phy-qcom-qusb2 to the new SPDX license
identifier as well? That will be cleaner.
Thanks!
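
For reference, a sketch of what the top of the new header could look like with an SPDX tag. This assumes GPL-2.0 is the right identifier for "version 2 and only version 2" and uses the comment style the kernel asks for in headers; the final wording is of course up to the author.

```c
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * Copyright (c) 2017, Linux Foundation. All rights reserved.
 */

#ifndef QCOM_PHY_QMP_H_
#define QCOM_PHY_QMP_H_

/* Only for QMP V2 PHY - QSERDES COM registers */
#define QSERDES_COM_BG_TIMER		0x00c

#endif /* QCOM_PHY_QMP_H_ */
```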

> +
> +#ifndef QCOM_PHY_QMP_H_
> +#define QCOM_PHY_QMP_H_
> +
> +/* Only for QMP V2 PHY - QSERDES COM registers */
> +#define QSERDES_COM_BG_TIMER	0x00c
> +#define QSERDES_COM_SSC_EN_CENTER	0x010
> +#define QSERDES_COM_SSC_ADJ_PER1	0x014
> +#define QSERDES_COM_SSC_ADJ_PER2	0x018
> +#define QSERDES_COM_SSC_PER1	0x01c
> +#define QSERDES_COM_SSC_PER2	0x020
> +#define QSERDES_COM_SSC_STEP_SIZE1	0x024
> +#define QSERDES_COM_SSC_STEP_SIZE2	0x028
> +#define QSERDES_COM_BIAS_EN_CLKBUFLR_EN	0x034
> +#define QSERDES_COM_CLK_ENABLE1	0x038
> +#define QSERDES_COM_SYS_CLK_CTRL	0x03c
> +#define QSERDES_COM_SYSCLK_BUF_ENABLE	0x040
> +#define QSERDES_COM_PLL_IVCO	0x048
> +#define QSERDES_COM_LOCK_CMP1_MODE0	0x04c
> +#define QSERDES_COM_LOCK_CMP2_MODE0	0x050
> +#define QSERDES_COM_LOCK_CMP3_MODE0	0x054
> +#define QSERDES_COM_LOCK_CMP1_MODE1	0x058
> +#define QSERDES_COM_LOCK_CMP2_MODE1	0x05c
> +#define QSERDES_COM_LOCK_CMP3_MODE1	0x060
> +#define QSERDES_COM_BG_TRIM	0x070
> +#define QSERDES_COM_CLK_EP_DIV	0x074
> +#define QSERDES_COM_CP_CTRL_MODE0	0x078
> +#define QSERDES_COM_CP_CTRL_MODE1	0x07c
> +#define QSERDES_COM_PLL_RCTRL_MODE0	0x084
> +#define QSERDES_COM_PLL_RCTRL_MODE1	0x088
> +#define QSERDES_COM_PLL_CCTRL_MODE0	0x090
> +#define QSERDES_COM_PLL_CCTRL_MODE1	0x094
> +#define QSERDES_COM_BIAS_EN_CTRL_BY_PSM	0x0a8
> +#define QSERDES_COM_SYSCLK_EN_SEL	0x0ac
> +#define QSERDES_COM_RESETSM_CNTRL	0x0b4
> +#define QSERDES_COM_RESTRIM_CTRL	0x0bc
> +#define QSERDES_COM_RESCODE_DIV_NUM	0x0c4
> +#define QSERDES_COM_LOCK_CMP_EN	0x0c8
> +#define QSERDES_COM_LOCK_CMP_CFG	0x0cc
> +#define QSERDES_COM_DEC_START_MODE0	0x0d0
> +#define QSERDES_COM_DEC_START_MODE1	0x0d4
> +#define QSERDES_COM_DIV_FRAC_START1_MODE0	0x0dc
> +#define QSERDES_COM_DIV_FRAC_START2_MODE0	0x0e0
> +#define QSERDES_COM_DIV_FRAC_START3_MODE0	0x0e4
> +#define QSERDES_COM_DIV_FRAC_START1_MODE1	0x0e8
> +#define QSERDES_COM_DIV_FRAC_START2_MODE1	0x0ec
> +#define QSERDES_COM_DIV_FRAC_START3_MODE1	0x0f0
> +#define QSERDES_COM_INTEGLOOP_GAIN0_MODE0	0x108
> +#define QSERDES_COM_INTEGLOOP_GAIN1_MODE0	0x10c
> +#define QSERDES_COM_INTEGLOOP_GAIN0_MODE1	0x110
> +#define QSERDES_COM_INTEGLOOP_GAIN1_MODE1	0x114
> +#define QSERDES_COM_VCO_TUNE_CTRL	0x124
> +#define QSERDES_COM_VCO_TUNE_MAP	0x128
> +#define QSERDES_COM_VCO_TUNE1_MODE0	0x12c
> +#define QSERDES_COM_VCO_TUNE2_MODE0	0x130
> +#define QSERDES_COM_VCO_TUNE1_MODE1

Re: [PATCH v7 7/8] dt-bindings: can: m_can: Document new can transceiver binding

2018-01-11 Thread Faiz Abbas
Hi Rob,

On Friday 12 January 2018 01:50 AM, Rob Herring wrote:
> On Wed, Jan 10, 2018 at 4:55 AM, Faiz Abbas  wrote:
>> From: Franklin S Cooper Jr 
>>
>> Add information regarding can-transceiver binding. This is especially
>> important for MCAN since the IP allows CAN FD mode to run significantly
>> faster than what most transceivers are capable of.
>>
>> Signed-off-by: Franklin S Cooper Jr 
>> Signed-off-by: Sekhar Nori 
>> Signed-off-by: Faiz Abbas 
>> ---
>>  Documentation/devicetree/bindings/net/can/m_can.txt | 9 +
>>  1 file changed, 9 insertions(+)
> 
> Why did you drop my ack from v6?

Sorry, I missed it. Will make sure its there in future versions.

Thanks,
Faiz


Re: [RFC PATCH 1/2] softirq: Account time and iteration stats per vector

2018-01-11 Thread Eric Dumazet
On Thu, Jan 11, 2018 at 9:35 PM, Frederic Weisbecker
 wrote:
> As we plan to be able to defer some specific softirq vector processing
> to workqueues when those vectors need more time than IRQs can offer,
> let's first count the time spent and the number of occurrences per vector.
>
> For now we still defer to ksoftirqd when the per vector limits are reached.
>
> Suggested-by: Linus Torvalds 
> Signed-off-by: Frederic Weisbecker 
> Cc: Dmitry Safonov 
> Cc: Eric Dumazet 
> Cc: Linus Torvalds 
> Cc: Peter Zijlstra 
> Cc: Andrew Morton 
> Cc: David Miller 
> Cc: Hannes Frederic Sowa 
> Cc: Ingo Molnar 
> Cc: Levin Alexander 
> Cc: Paolo Abeni 
> Cc: Paul E. McKenney 
> Cc: Radu Rendec 
> Cc: Rik van Riel 
> Cc: Stanislaw Gruszka 
> Cc: Thomas Gleixner 
> Cc: Wanpeng Li 
> ---
>  kernel/softirq.c | 37 +
>  1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 2f5e87f..fa267f7 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #define CREATE_TRACE_POINTS
>  #include 
> @@ -62,6 +63,17 @@ const char * const softirq_to_name[NR_SOFTIRQS] = {
> "TASKLET", "SCHED", "HRTIMER", "RCU"
>  };
>
> +struct vector_stat {
> +   u64 time;
> +   int count;
> +};
> +
> +struct softirq_stat {
> +   struct vector_stat stat[NR_SOFTIRQS];
> +};
> +
> +static DEFINE_PER_CPU(struct softirq_stat, softirq_stat_cpu);
> +
>  /*
>   * we cannot loop indefinitely here to avoid userspace starvation,
>   * but we also don't want to introduce a worst case 1/HZ latency
> @@ -203,7 +215,7 @@ EXPORT_SYMBOL(__local_bh_enable_ip);
>   * we want to handle softirqs as soon as possible, but they
>   * should not be able to lock up the box.
>   */
> -#define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
> +#define MAX_SOFTIRQ_TIME  (2 * NSEC_PER_MSEC)
>  #define MAX_SOFTIRQ_RESTART 10
>
>  #ifdef CONFIG_TRACE_IRQFLAGS
> @@ -241,12 +253,11 @@ static inline void lockdep_softirq_end(bool in_hardirq) 
> { }
>
>  asmlinkage __visible void __softirq_entry __do_softirq(void)
>  {
> -   unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
> +   struct softirq_stat *sstat = this_cpu_ptr(&softirq_stat_cpu);
> unsigned long old_flags = current->flags;
> -   int max_restart = MAX_SOFTIRQ_RESTART;
> struct softirq_action *h;
> bool in_hardirq;
> -   __u32 pending;
> +   __u32 pending, overrun = 0;
> int softirq_bit;
>
> /*
> @@ -262,6 +273,7 @@ asmlinkage __visible void __softirq_entry 
> __do_softirq(void)
> __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
> in_hardirq = lockdep_softirq_start();
>
> +   memzero_explicit(sstat, sizeof(*sstat));

If you clear sstat here, it means it does not need to be a per cpu
variable, but an automatic one (defined on the stack)

I presume we need a per cpu var to track cpu usage on last time window.

( typical case of 99,000 IRQ per second, one packet delivered per IRQ,
10 usec spent per packet)



>  restart:
> /* Reset the pending bitmask before enabling irqs */
> set_softirq_pending(0);
> @@ -271,8 +283,10 @@ asmlinkage __visible void __softirq_entry 
> __do_softirq(void)
> h = softirq_vec;
>
> while ((softirq_bit = ffs(pending))) {
> +   struct vector_stat *vstat;
> unsigned int vec_nr;
> int prev_count;
> +   u64 startime;
>
> h += softirq_bit - 1;
>
> @@ -280,10 +294,18 @@ asmlinkage __visible void __softirq_entry 
> __do_softirq(void)
> prev_count = preempt_count();
>
> kstat_incr_softirqs_this_cpu(vec_nr);
> +   vstat = &sstat->stat[vec_nr];
>
> trace_softirq_entry(vec_nr);
> +   startime = local_clock();
> h->action(h);
> +   vstat->time += local_clock() - startime;

You might store local_clock() in a variable, so that we do not call
local_clock() two times per ->action() called.
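
A minimal sketch of that idea, assuming a deterministic stand-in for local_clock(). fake_local_clock, run_actions and noop_action are hypothetical names; the point is only that the timestamp taken after one action can double as the start of the next one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical deterministic clock: each read advances time by 10 units. */
static uint64_t fake_now;
static unsigned int clock_reads;

static uint64_t fake_local_clock(void)
{
	clock_reads++;
	fake_now += 10;
	return fake_now;
}

static void noop_action(void)
{
}

/*
 * Runs nr actions and accumulates their cost into time[]. The timestamp
 * taken after one action is reused as the start of the next one, so the
 * clock is read nr + 1 times instead of 2 * nr times.
 */
static void run_actions(void (*action)(void), uint64_t *time, int nr)
{
	uint64_t start, end;
	int i;

	assert(action && time && nr >= 0);

	start = fake_local_clock();
	for (i = 0; i < nr; i++) {
		action();
		end = fake_local_clock();
		time[i] += end - start;
		start = end; /* reuse instead of reading the clock again */
	}
}
```

The trade-off in this sketch is that the bookkeeping done between two actions gets charged to the following vector, which may or may not matter for the limits discussed here.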


> +   vstat->count++;
> trace_softirq_exit(vec_nr);
> +
> +   if (vstat->time > MAX_SOFTIRQ_TIME || vstat->count > 
> MAX_SOFTIRQ_RESTART)

If we trust local_clock() to be precise enough, we do not need to
track vstat->count anymore.

> +   overrun |= 1 << vec_nr;
> +
> if (unlikely(prev_count != preempt_count())) {
> pr_err("huh, entered softirq %u %s %p with 
> preempt_count %08x, exited with %08x?\n",
>vec_nr, softirq_to_name[vec_nr], h->action,
> @@ -299,11 +321,10 @@ asmlinkage __visible void __softirq_entry 
> __do_softirq(void)
>
> pending = local_softirq_pending();
> if (pending) {
> -   if (time_before(jiffies, end) && !need_resched() &&
> -   --max_restart)
> +   if (overrun || need_resched(

Re: [PATCH v2 05/12] drm/bridge/synopsys: dw-hdmi: Add deinit callback

2018-01-11 Thread Chen-Yu Tsai
On Thu, Jan 11, 2018 at 3:25 AM, Jernej Skrabec  wrote:
> Some SoCs, like Allwinner A83T, have to do additional cleanup when
> HDMI driver unloads. When using DW HDMI through DRM bridge API, there is
> no place to store driver's private data so it can be accessed in unbind
> function. Because of that, add deinit function which is called at the
> very end, so drivers can do a proper cleanup.
>
> Signed-off-by: Jernej Skrabec 

8242ecbd597d ("drm/bridge/synopsys: stop clobbering drvdata"), which is
already in drm-misc-next, is a much saner solution. :)

ChenYu


Re: [PATCH v2 17/19] qla2xxx: prevent bounds-check bypass via speculative execution

2018-01-11 Thread James Bottomley
On Thu, 2018-01-11 at 21:38 -0800, Dan Williams wrote:
> On Thu, Jan 11, 2018 at 5:19 PM, James Bottomley
>  wrote:
> > 
> > On Thu, 2018-01-11 at 16:47 -0800, Dan Williams wrote:
> > > 
> > > Static analysis reports that 'handle' may be a user controlled
> > > value that is used as a data dependency to read 'sp' from the
> > > 'req->outstanding_cmds' array.
> > 
> > Greg already told you it comes from hardware, specifically the
> > hardware response queue.  If you don't believe him, I can confirm
> > it's quite definitely all copied from the iomem where the mailbox
> > response is, so it can't be a user controlled value (well, unless
> > the user has some influence over the firmware of the
> > qla2xxx  controller, which probably means you have other things to
> > worry about than speculative information leaks).
> 
> I do believe him, and I still submitted this. I'm trying to probe at
> the meta question of where do we draw the line with these especially
> when it costs us relatively little to apply a few line patch? We fix
> theoretical lockdep races, why not theoretical data leak paths?

I think I've lost the thread of what you're after.  I thought you were
asking for the domain experts to look and see if there is the potential
for attack; if there's no theoretical way for a user to influence the
value what's the point of killing speculation?  Furthermore, if the
user could affect that 32 bit value, what they'd actually do is extract
information via the que variable which you didn't fix and which could
be used to compromise the kernel without resorting to side channel
attacks.

What's most puzzling to me is the inconsistency of the positions: if it
doesn't cost that much to turn off speculation, just do it on kernel
entry as Jiří suggested; we can make it a dynamic option and the cloud
providers can do it and the rest of us don't need to bother.  If it
does cost a lot to turn it off as Alan said, then you need us to
identify the cases above where there's no need to disrupt the
speculation pipeline and not turn it off there.  Which is it?

James



Re: [PATCH v5 01/44] dt-bindings: clock: Add new bindings for TI Davinci PLL clocks

2018-01-11 Thread Sekhar Nori
On Friday 12 January 2018 03:16 AM, David Lechner wrote:
> 
> Sekhar, have you had a chance to look at the rest of the patches in the
> series?

Not yet. Sorry about the slow (and piecemeal) review.

> 
> I'll wait a bit before I send a v6 to see if any other comments come.

Yes. I will let you know once I am done reviewing the whole series.

Thanks,
Sekhar



Re: [PATCH] phy: work around 'phys' references to usb-phy devices

2018-01-11 Thread Kishon Vijay Abraham I
Hi Arnd,

On Thursday 11 January 2018 11:46 PM, Eric Anholt wrote:
> Arnd Bergmann  writes:
> 
>> On Thu, Jan 11, 2018 at 2:30 PM, Kishon Vijay Abraham I wrote:
>>> On Thursday 11 January 2018 02:27 AM, Arnd Bergmann wrote:
>>>> On Mon, Jan 8, 2018 at 7:32 PM, Kishon Vijay Abraham I wrote:
>>>>> On Monday 08 January 2018 06:31 PM, Arnd Bergmann wrote:
>>>>>> Stefan Wahren reports a problem with a warning fix that was merged
>>>>>> ---
>>>>>> This obviously needs to be tested, I wrote this up as a reply to
>>>>>> Stefan's bug report. I'm fairly sure that I covered all usb-phy
>>>>>> driver strings here. My goal is to have a fix merged into 4.15
>>>>>> rather than reverting all the DT fixes.
>>>>>
>>>>> Shouldn't the fix be in phy consumer drivers to not return error if it's able
>>>>> to find the phy either using usb-phy or generic phy?
>>>>
>>>> Stefan has posted a patch to that effect now, but I fear that might be
>>>> a little fragile, in particular this short before the release with the
>>>> regression in place.
>>>>
>>>> The main problem is that we'd have to change the generic
>>>> usb_add_hcd() function in addition to dwc2 and dwc3 to ignore
>>>> -EPROBE_DEFER from phy_get() whenever usb_get_phy_dev()
>>>> has already succeeded.
>>>>
>>>> If there is any HCD that relies on usb_add_hcd() to get both the
>>>> usb_phy and the phy structures, and it may need to defer probing
>>>> when the latter one isn't ready yet, that fix would break another
>>>> driver.
>>>
>>> hmm.. IMO the better thing right now would be to revert the dt patch which adds
>>> #phy-cells.
>>> We have to see if there are better fixes in order to add #phy-cells warning fix
>>> in stable tree.
>>
>> Let's see which patches that would be, I think this is the full list of
>> nodes that got an extra #phy-cells:
>>
>> c22fe696157d ARM: dts: Fix dm814x missing phy-cells property
>> f0e11ff8ff65 ARM: dts: am33xx: Add missing #phy-cells to ti,am335x-usb-phy
>> c5bbf358b790 arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv
>> 44e5dced2ef6 arm: dts: marvell: Add missing #phy-cells to usb-nop-xceiv
>> 014d6da6cb25 ARM: dts: bcm283x: Fix DTC warnings about missing phy-cells
>> f568f6f554b8 ARM: dts: omap: Add missing #phy-cells to usb-nop-xceiv
>>
>> plus a couple in linux-next:
>>
>> d745d5f277bf ARM: dts: imx51-zii-rdu1: Add missing #phy-cells to 
>> usb-nop-xceiv
>> 915fbe59cbf2 ARM: dts: imx: Add missing #phy-cells to usb-nop-xceiv
>>
>> It's a lot of patches to revert, and I guess it would get us back to hundreds
>> of warnings in an allmodconfig build, so I'd first try to come up with
>> ways to prove that at least some of them can stay.
>>
>> Almost all the warnings are about "usb-nop-xceiv" phys, the only exceptions
>> I could find are the OMAP ones (the first two patches), which use
>> "ti,am335x-usb-phy" and are referenced from a "ti,musb-am33xx". That
>> particular driver is not affected by the bug, so we can leave that in.
>>
>> To deal with all the "usb-nop-xceiv"  references including the one that
>> Stefan reported, we could use a much simpler version of my earlier
>> patch, do you think this is any better?

yeah, this looks simpler.
>>
>> Signed-off-by: Arnd Bergmann 

In case you want to take this patch yourself
Acked-by: Kishon Vijay Abraham I 

(or let me know if I have to create a separate pull request for Greg)

Thanks
Kishon

>>
>> diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
>> index b4964b067aec..f056d8fb3921 100644
>> --- a/drivers/phy/phy-core.c
>> +++ b/drivers/phy/phy-core.c
>> @@ -410,6 +410,10 @@ static struct phy *_of_phy_get(struct device_node
>> *np, int index)
>> if (ret)
>> return ERR_PTR(-ENODEV);
>>
>> +   /* This phy type handled by the usb-phy subsystem for now */
>> +   if (of_device_is_compatible(args.np, "usb-nop-xceiv"))
>> +   return ERR_PTR(-ENODEV);
>> +
>> mutex_lock(&phy_provider_mutex);
>> phy_provider = of_phy_provider_lookup(args.np);
>> if (IS_ERR(phy_provider) || !try_module_get(phy_provider->owner)) {
> 
> This seems like a nice workaround!
> 


Re: [PATCH IMPROVEMENT] block, bfq: limit sectors served with interactive weight raising

2018-01-11 Thread Paolo Valente


> On 28 Dec 2017, at 15:00, Holger Hoffstätte wrote:
> 
> 
> On 12/28/17 12:19, Paolo Valente wrote:
> (snip half a tech report ;)
> 
> So either this or the previous patch ("limit tags for writes and async I/O")
> can lead to a hard, unrecoverable hang with heavy writes. Since I couldn't
> log into the affected system anymore I couldn't get any stack traces, blk-mq
> debug output etc. but there was nothing in dmesg/on the console, so it
> wasn't a BUG/OOPS.
> 
> -h

Hi Holger,
if, as I guess, this problem hasn't gone away for you, I have two
requests:
1) could you share your exact test?
2) if nothing happens on my systems with your test, would you be
willing to retry with the dev version of bfq?  It should be able to
tell us what leads to your hang.  If you are willing to do this test,
I'll prepare a branch with everything already configured for you.

Thanks,
Paolo

[PATCH] clk: aspeed: Handle inverse polarity of USB port 1 clock gate

2018-01-11 Thread Benjamin Herrenschmidt
The USB port 1 clock gate control has inverted polarity
compared with all the other clock gates in the chip. Make the
aspeed_clk_{enable,disable} functions honor the
CLK_GATE_SET_TO_DISABLE flag, and set that flag
for all clocks except USB port 1.

Signed-off-by: Benjamin Herrenschmidt 
--

I chose not to add a column to the table for that one special
case. If future chips start growing more of these, we should
consider adding this to the table instead.

Without this, USB port 1 doesn't work properly with the new
clk driver.
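
As a rough illustration of the polarity handling, here is a small user-space sketch of the value written into the gate's bit of the clock stop register. GATE_SET_TO_DISABLE and gate_bits are stand-in names invented for this example; they only mirror the logic of the patch and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the common clock framework flag. */
#define GATE_SET_TO_DISABLE (1u << 0)

/*
 * Value to write into the gate's bit of the clock stop register.
 * For set-to-disable gates a cleared bit means "running"; for the
 * inverted gate a set bit means "running".
 */
static uint32_t gate_bits(uint32_t flags, uint32_t clk_bit, int enable)
{
	int set_to_disable = (flags & GATE_SET_TO_DISABLE) != 0;

	assert(clk_bit != 0);

	if (enable)
		return set_to_disable ? 0 : clk_bit;
	return set_to_disable ? clk_bit : 0;
}
```

With flags == 0 (the USB port 1 case in the patch) enabling sets the bit, while every other gate clears its bit to enable.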

---
 drivers/clk/clk-aspeed.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/clk-aspeed.c b/drivers/clk/clk-aspeed.c
index 6fb344730cea..f5dc5101174e 100644
--- a/drivers/clk/clk-aspeed.c
+++ b/drivers/clk/clk-aspeed.c
@@ -211,6 +211,7 @@ static int aspeed_clk_enable(struct clk_hw *hw)
unsigned long flags;
u32 clk = BIT(gate->clock_idx);
u32 rst = BIT(gate->reset_idx);
+   u32 enval;
 
spin_lock_irqsave(gate->lock, flags);
 
@@ -223,7 +224,8 @@ static int aspeed_clk_enable(struct clk_hw *hw)
}
 
/* Enable clock */
-   regmap_update_bits(gate->map, ASPEED_CLK_STOP_CTRL, clk, 0);
+   enval = (gate->flags & CLK_GATE_SET_TO_DISABLE) ? 0 : clk;
+   regmap_update_bits(gate->map, ASPEED_CLK_STOP_CTRL, clk, enval);
 
if (gate->reset_idx >= 0) {
/* A delay of 10ms is specified by the ASPEED docs */
@@ -243,10 +245,12 @@ static void aspeed_clk_disable(struct clk_hw *hw)
struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
unsigned long flags;
u32 clk = BIT(gate->clock_idx);
+   u32 enval;
 
spin_lock_irqsave(gate->lock, flags);
 
-   regmap_update_bits(gate->map, ASPEED_CLK_STOP_CTRL, clk, clk);
+   enval = (gate->flags & CLK_GATE_SET_TO_DISABLE) ? clk : 0;
+   regmap_update_bits(gate->map, ASPEED_CLK_STOP_CTRL, clk, enval);
 
spin_unlock_irqrestore(gate->lock, flags);
 }
@@ -478,7 +482,12 @@ static int aspeed_clk_probe(struct platform_device *pdev)
 
for (i = 0; i < ARRAY_SIZE(aspeed_gates); i++) {
const struct aspeed_gate_data *gd = &aspeed_gates[i];
+   u32 gate_flags;
 
+   /* Special case: the USB port 1 clock (bit 14) is always
+* working the opposite way from the other ones.
+*/
+   gate_flags = (gd->clock_idx == 14) ? 0 : CLK_GATE_SET_TO_DISABLE;
hw = aspeed_clk_hw_register_gate(dev,
gd->name,
gd->parent_name,
@@ -486,7 +495,7 @@ static int aspeed_clk_probe(struct platform_device *pdev)
map,
gd->clock_idx,
gd->reset_idx,
-   CLK_GATE_SET_TO_DISABLE,
+   gate_flags,
&aspeed_clk_lock);
if (IS_ERR(hw))
return PTR_ERR(hw);




Re: [PATCH v5 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers

2018-01-11 Thread Vivek Gautam



On 01/12/2018 04:23 AM, Rafael J. Wysocki wrote:

On Tue, Jan 9, 2018 at 11:01 AM, Vivek Gautam
 wrote:

The device link allows the pm framework to tie the supplier and
consumer. So, whenever the consumer is powered-on the supplier
is powered-on first.

There are however cases in which the consumer wants to power-on
the supplier, but not itself.
E.g., A Graphics or multimedia driver wants to power-on the SMMU
to unmap a buffer and finish the TLB operations without powering
on itself. Some of these unmap requests are coming from the
user space when the controller itself is not powered-up, and it
can be huge penalty in terms of power and latency to power-up
the graphics/mm controllers.
There can be an argument that the supplier should handle this case
on its own and there should not be a need for the consumer to
power-on the supplier. But as discussed on the thread [1] about
ARM-SMMU runtime pm, we don't want to introduce runtime pm calls
in atomic path in arm_smmu_unmap.
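
The difference between the two paths can be sketched with a toy user-space model. This assumes a single supplier link and plain usage counters; toy_dev and the toy_* helpers are hypothetical and far simpler than the real runtime PM core.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified device model; not struct device. */
struct toy_dev {
	int usage;                 /* runtime PM usage count */
	struct toy_dev *supplier;  /* single device link, NULL if none */
};

/* Normal path: resuming a consumer resumes its supplier first. */
static void toy_get(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->supplier)
		toy_get(dev->supplier);
	dev->usage++;
}

static void toy_put(struct toy_dev *dev)
{
	assert(dev != NULL && dev->usage > 0);
	dev->usage--;
	if (dev->supplier)
		toy_put(dev->supplier);
}

/* Supplier-only path: power the supplier without touching the consumer. */
static void toy_get_suppliers(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->supplier)
		toy_get(dev->supplier);
}

static void toy_put_suppliers(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->supplier)
		toy_put(dev->supplier);
}
```

In this model a graphics device could hold its SMMU supplier active around an unmap while its own usage count stays at zero.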

[1] https://patchwork.kernel.org/patch/9827825/

Signed-off-by: Vivek Gautam 

Acked-by: Rafael J. Wysocki 

Please feel free to route this along with the rest of the series.


Thanks Rafael.

regards
Vivek



Thanks!


---

  * This is v2 of the patch [1]. Adding it to this patch series.
[1] https://patchwork.kernel.org/patch/10102447/

  drivers/base/power/runtime.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 6e89b51ea3d9..06a2a88fe866 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1579,6 +1579,7 @@ void pm_runtime_get_suppliers(struct device *dev)

 device_links_read_unlock(idx);
  }
+EXPORT_SYMBOL_GPL(pm_runtime_get_suppliers);

  /**
   * pm_runtime_put_suppliers - Drop references to supplier devices.
@@ -1597,6 +1598,7 @@ void pm_runtime_put_suppliers(struct device *dev)

 device_links_read_unlock(idx);
  }
+EXPORT_SYMBOL_GPL(pm_runtime_put_suppliers);

  void pm_runtime_new_link(struct device *dev)
  {
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH] ASoC: hdac_hdmi: Ensuring proper setting of output widget power state

2018-01-11 Thread Vinod Koul
On Thu, Jan 11, 2018 at 05:04:27PM +0530, abhijeet.ku...@intel.com wrote:
> From: Abhijeet Kumar 
> 
> When we change the resolution of DP pannel or hot plug-unplug it while
> playing an audio clip,sometimes we observe a silent playback(no audio).

can you rephrase this please

> During no audio condition, we have noticed that the power state of the
> pin or the connector is D3. Optimzing the way we set the power could
> mitigate the issue.With this changes the verb is sent to set the power

space after .

> state and response is received. Thus ensuring power state is set.

am not sure I fully understood the problem here

> 
> Signed-off-by: Abhijeet Kumar 
> ---
>  sound/soc/codecs/hdac_hdmi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sound/soc/codecs/hdac_hdmi.c b/sound/soc/codecs/hdac_hdmi.c
> index f3b4f4dfae6a..e24caecf0a4f 100644
> --- a/sound/soc/codecs/hdac_hdmi.c
> +++ b/sound/soc/codecs/hdac_hdmi.c
> @@ -718,7 +718,7 @@ static void hdac_hdmi_set_power_state(struct hdac_ext_device *edev,
>  {
>   if (get_wcaps(&edev->hdac, nid) & AC_WCAP_POWER) {
>   if (!snd_hdac_check_power_state(&edev->hdac, nid, pwr_state))
> - snd_hdac_codec_write(&edev->hdac, nid, 0,
> + snd_hdac_codec_read(&edev->hdac, nid, 0,

how does read help instead of write?

-- 
~Vinod


Re: [RFC 1/2] softirq: Defer net rx/tx processing to ksoftirqd context

2018-01-11 Thread Frederic Weisbecker
On Thu, Jan 11, 2018 at 09:13:42PM +, Dmitry Safonov wrote:
> On Thu, 2018-01-11 at 12:53 -0800, Eric Dumazet wrote:
> > On Thu, Jan 11, 2018 at 12:46 PM, Dmitry Safonov 
> > wrote:
> > > On Thu, 2018-01-11 at 12:40 -0800, Linus Torvalds wrote:
> > > > On Thu, Jan 11, 2018 at 12:34 PM, Dmitry Safonov
> > > > wrote:
> > > > > 
> > > > > I could try to write a PoC for that..
> > > > > What should be the trigger to fall into workqueue?
> > > > > How to tell if there're too many softirqs of the kind?
> > > > 
> > > > I suspect it would have to be time-based, probably using the
> > > > scheduler clock.
> > > 
> > > I thought about this, but I was a bit afraid of how pricey it
> > > would be to recalculate it each clock. Well, might just try to
> > > write that and measure the impact.
> > > 
> > > > Most softirqs are really really small. So just counting them
> > > > probably isn't all that meaningful, although the count is good
> > > > as a fallback (as shown by the jiffy issues).
> > > > 
> > > > The good news is that we only have a fairly small handful of
> > > > softirqs, so counting/timing them separately is still mainly a
> > > > pretty small array (which needs to be percpu, of course).
> > 
> > Note that using (scheduler) clock might also help to break
> > net_rx_action() not on a stupid netdev_budget, but on a more
> > precise time limit as well.
> > 
> > netdev_budget of 300 packets is quite big :/
> > 
> > (The time_limit based on jiffies + 2 does not work on hosts with one
> > cpu, since jiffies won't make progress while net_rx_action() is
> > running)
> 
> Thanks for the details, Eric.
> I'll try to come up with poc if no one beats me at it.

I just gave it a try. Sorry I couldn't resist :-s


Re: [PATCH v2 17/19] qla2xxx: prevent bounds-check bypass via speculative execution

2018-01-11 Thread Dan Williams
On Thu, Jan 11, 2018 at 5:19 PM, James Bottomley
 wrote:
> On Thu, 2018-01-11 at 16:47 -0800, Dan Williams wrote:
>> Static analysis reports that 'handle' may be a user controlled value
>> that is used as a data dependency to read 'sp' from the
>> 'req->outstanding_cmds' array.
>
> Greg already told you it comes from hardware, specifically the hardware
> response queue.  If you don't believe him, I can confirm it's quite
> definitely all copied from the iomem where the mailbox response is, so
> it can't be a user controlled value (well, unless the user has some
> influence over the firmware of the qla2xxx  controller, which probably
> means you have other things to worry about than speculative information
> leaks).

I do believe him, and I still submitted this. I'm trying to probe at
the meta question of where do we draw the line with these especially
when it costs us relatively little to apply a few line patch? We fix
theoretical lockdep races, why not theoretical data leak paths?


[RFC PATCH 0/2] softirq: Per vector threading

2018-01-11 Thread Frederic Weisbecker
So this is a first shot at implementing what Linus suggested.
To summarize: when a softirq vector is stormed and needs more time than
the IRQ tail can offer, the whole softirq processing is offloaded to
ksoftirqd. But this has an impact on the other softirq vectors, which are
then subject to scheduler latencies.

So the softirq time limit is now per vector, and only the vectors that
get stormed are offloaded to a thread (workqueue).

This is in a very early proof-of-concept state. It even fails to boot
once in a while. So I'll do more debugging tomorrow (today in fact) but
you get the big picture.

It probably won't come free given the clock reads around softirq callbacks.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
softirq/poc

HEAD: 0e982634115283710d0801048e5a316def26f31d

Thanks,
Frederic
---

Frederic Weisbecker (2):
  softirq: Account time and iteration stats per vector
  softirq: Per vector thread deferment


 kernel/softirq.c | 123 +++
 1 file changed, 114 insertions(+), 9 deletions(-)


[RFC PATCH 2/2] softirq: Per vector thread deferment

2018-01-11 Thread Frederic Weisbecker
Some softirq vectors can be more CPU hungry than others. Networking
especially may sometimes deal with a packet storm and need more CPU than
the IRQ tail can offer without inducing scheduler latencies. In this case
the current code defers to ksoftirqd, which behaves more nicely. Now this
nice behaviour can be bad for other IRQ vectors that usually need quick
processing.

To solve this we only defer to threading the vectors that exceeded the
time limit on IRQ tail processing and leave the others inline on real
softirq service. This is achieved using workqueues with
per-CPU/per-vector worklets.

Note ksoftirqd is not removed as it is still needed for threaded IRQs
mode.

Suggested-by: Linus Torvalds 
Signed-off-by: Frederic Weisbecker 
Cc: Dmitry Safonov 
Cc: Eric Dumazet 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Andrew Morton 
Cc: David Miller 
Cc: Hannes Frederic Sowa 
Cc: Ingo Molnar 
Cc: Levin Alexander 
Cc: Paolo Abeni 
Cc: Paul E. McKenney 
Cc: Radu Rendec 
Cc: Rik van Riel 
Cc: Stanislaw Gruszka 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
---
 kernel/softirq.c | 90 ++--
 1 file changed, 87 insertions(+), 3 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index fa267f7..0c817ec6 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -74,6 +74,13 @@ struct softirq_stat {
 
 static DEFINE_PER_CPU(struct softirq_stat, softirq_stat_cpu);
 
+struct vector_work {
+   int vec;
+   struct work_struct work;
+};
+
+static DEFINE_PER_CPU(struct vector_work[NR_SOFTIRQS], vector_work_cpu);
+
 /*
  * we cannot loop indefinitely here to avoid userspace starvation,
  * but we also don't want to introduce a worst case 1/HZ latency
@@ -251,6 +258,70 @@ static inline bool lockdep_softirq_start(void) { return false; }
 static inline void lockdep_softirq_end(bool in_hardirq) { }
 #endif
 
+static void vector_work_func(struct work_struct *work)
+{
+   struct vector_work *vector_work;
+   u32 pending;
+   int vec;
+
+   vector_work = container_of(work, struct vector_work, work);
+   vec = vector_work->vec;
+
+   local_irq_disable();
+   pending = local_softirq_pending();
+   account_irq_enter_time(current);
+   __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
+   lockdep_softirq_enter();
+   set_softirq_pending(pending & ~(1 << vec));
+   local_irq_enable();
+
+   if (pending & (1 << vec)) {
+   struct softirq_action *sa = &softirq_vec[vec];
+
+   kstat_incr_softirqs_this_cpu(vec);
+   trace_softirq_entry(vec);
+   sa->action(sa);
+   trace_softirq_exit(vec);
+   }
+
+   local_irq_disable();
+
+   pending = local_softirq_pending();
+   if (pending & (1 << vec))
+   schedule_work_on(smp_processor_id(), work);
+
+   lockdep_softirq_exit();
+   account_irq_exit_time(current);
+   __local_bh_enable(SOFTIRQ_OFFSET);
+   local_irq_enable();
+}
+
+static int do_softirq_overrun(u32 overrun, u32 pending)
+{
+   struct softirq_action *h = softirq_vec;
+   int softirq_bit;
+
+   if (!overrun)
+   return pending;
+
+   overrun &= pending;
+   pending &= ~overrun;
+
+   while ((softirq_bit = ffs(overrun))) {
+   struct vector_work *work;
+   unsigned int vec_nr;
+
+   h += softirq_bit - 1;
+   vec_nr = h - softirq_vec;
+   work = this_cpu_ptr(&vector_work_cpu[vec_nr]);
+   schedule_work_on(smp_processor_id(), &work->work);
+   h++;
+   overrun >>= softirq_bit;
+   }
+
+   return pending;
+}
+
 asmlinkage __visible void __softirq_entry __do_softirq(void)
 {
struct softirq_stat *sstat = this_cpu_ptr(&softirq_stat_cpu);
@@ -321,10 +392,13 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 
pending = local_softirq_pending();
if (pending) {
-   if (overrun || need_resched())
+   if (need_resched()) {
wakeup_softirqd();
-   else
-   goto restart;
+   } else {
+   pending = do_softirq_overrun(overrun, pending);
+   if (pending)
+   goto restart;
+   }
}
 
lockdep_softirq_end(in_hardirq);
@@ -661,10 +735,20 @@ void __init softirq_init(void)
int cpu;
 
for_each_possible_cpu(cpu) {
+   int i;
+
per_cpu(tasklet_vec, cpu).tail =
&per_cpu(tasklet_vec, cpu).head;
per_cpu(tasklet_hi_vec, cpu).tail =
&per_cpu(tasklet_hi_vec, cpu).head;
+
+   for (i = 0; i < NR_SOFTIRQS; i++) {
+   struct vector_work *work;
+
+   work = &per_cpu(vector_work_cpu[i], cpu);
+   work->vec = i;
+   INIT_W
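
The split that do_softirq_overrun() performs in the patch above (overrun vectors go to a worker, the rest stay pending for inline processing) can be modelled in userspace as follows. This is a sketch under assumptions, not the kernel code: NR_VECS stands in for NR_SOFTIRQS, first_set() is a hypothetical replacement for ffs(), and the queued[] counter array stands in for schedule_work_on().

```c
#include <assert.h>
#include <stdint.h>

#define NR_VECS 10		/* stand-in for NR_SOFTIRQS */

/* Stand-in for schedule_work_on(): counts deferrals per vector. */
static int queued[NR_VECS];

/* Hypothetical ffs() replacement: 1-based index of the lowest set bit, 0 if none. */
static int first_set(uint32_t mask)
{
	int pos = 1;

	if (!mask)
		return 0;
	while (!(mask & 1u)) {
		mask >>= 1;
		pos++;
	}
	return pos;
}

/* Defer every pending vector that overran; return what is left to run inline. */
static uint32_t defer_overrun(uint32_t overrun, uint32_t pending)
{
	unsigned int vec = 0;
	int bit;

	overrun &= pending;	/* only vectors that are actually pending */
	pending &= ~overrun;	/* those no longer run inline */

	while ((bit = first_set(overrun))) {
		vec += bit - 1;	/* absolute vector number */
		assert(vec < NR_VECS);
		queued[vec]++;	/* would be schedule_work_on() in the patch */
		vec++;
		overrun >>= bit;
	}

	return pending;
}
```

The loop keeps a running vector number across shifts, in the same way the patch advances the h pointer through softirq_vec while shifting the overrun mask.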

[RFC PATCH 1/2] softirq: Account time and iteration stats per vector

2018-01-11 Thread Frederic Weisbecker
As we plan to be able to defer some specific softirq vector processing
to workqueues when those vectors need more time than IRQs can offer,
let's first count the time spent and the number of occurrences per vector.

For now we still defer to ksoftirqd when the per vector limits are reached.

Suggested-by: Linus Torvalds 
Signed-off-by: Frederic Weisbecker 
Cc: Dmitry Safonov 
Cc: Eric Dumazet 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Andrew Morton 
Cc: David Miller 
Cc: Hannes Frederic Sowa 
Cc: Ingo Molnar 
Cc: Levin Alexander 
Cc: Paolo Abeni 
Cc: Paul E. McKenney 
Cc: Radu Rendec 
Cc: Rik van Riel 
Cc: Stanislaw Gruszka 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
---
 kernel/softirq.c | 37 +
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 2f5e87f..fa267f7 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -62,6 +63,17 @@ const char * const softirq_to_name[NR_SOFTIRQS] = {
"TASKLET", "SCHED", "HRTIMER", "RCU"
 };
 
+struct vector_stat {
+   u64 time;
+   int count;
+};
+
+struct softirq_stat {
+   struct vector_stat stat[NR_SOFTIRQS];
+};
+
+static DEFINE_PER_CPU(struct softirq_stat, softirq_stat_cpu);
+
 /*
  * we cannot loop indefinitely here to avoid userspace starvation,
  * but we also don't want to introduce a worst case 1/HZ latency
@@ -203,7 +215,7 @@ EXPORT_SYMBOL(__local_bh_enable_ip);
  * we want to handle softirqs as soon as possible, but they
  * should not be able to lock up the box.
  */
-#define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
+#define MAX_SOFTIRQ_TIME  (2 * NSEC_PER_MSEC)
 #define MAX_SOFTIRQ_RESTART 10
 
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -241,12 +253,11 @@ static inline void lockdep_softirq_end(bool in_hardirq) { }
 
 asmlinkage __visible void __softirq_entry __do_softirq(void)
 {
-   unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
+   struct softirq_stat *sstat = this_cpu_ptr(&softirq_stat_cpu);
unsigned long old_flags = current->flags;
-   int max_restart = MAX_SOFTIRQ_RESTART;
struct softirq_action *h;
bool in_hardirq;
-   __u32 pending;
+   __u32 pending, overrun = 0;
int softirq_bit;
 
/*
@@ -262,6 +273,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
in_hardirq = lockdep_softirq_start();
 
+   memzero_explicit(sstat, sizeof(*sstat));
 restart:
/* Reset the pending bitmask before enabling irqs */
set_softirq_pending(0);
@@ -271,8 +283,10 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
h = softirq_vec;
 
while ((softirq_bit = ffs(pending))) {
+   struct vector_stat *vstat;
unsigned int vec_nr;
int prev_count;
+   u64 startime;
 
h += softirq_bit - 1;
 
@@ -280,10 +294,18 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
prev_count = preempt_count();
 
kstat_incr_softirqs_this_cpu(vec_nr);
+   vstat = &sstat->stat[vec_nr];
 
trace_softirq_entry(vec_nr);
+   startime = local_clock();
h->action(h);
+   vstat->time += local_clock() - startime;
+   vstat->count++;
trace_softirq_exit(vec_nr);
+
+   if (vstat->time > MAX_SOFTIRQ_TIME || vstat->count > MAX_SOFTIRQ_RESTART)
+   overrun |= 1 << vec_nr;
+
if (unlikely(prev_count != preempt_count())) {
pr_err("huh, entered softirq %u %s %p with preempt_count %08x, exited with %08x?\n",
   vec_nr, softirq_to_name[vec_nr], h->action,
@@ -299,11 +321,10 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 
pending = local_softirq_pending();
if (pending) {
-   if (time_before(jiffies, end) && !need_resched() &&
-   --max_restart)
+   if (overrun || need_resched())
+   wakeup_softirqd();
+   else
goto restart;
-
-   wakeup_softirqd();
}
 
lockdep_softirq_end(in_hardirq);
-- 
2.7.4



linux-next: Tree for Jan 12

2018-01-11 Thread Stephen Rothwell
Hi all,

Changes since 20180111:

New tree: rdma-fixes

The arm64 tree gained a conflict against Linus' tree.

The pm tree gained a conflict against the i2c tree.

The net-next tree lost its build failure but gained another due to an
interaction with the net tree for which I reverted a commit.

The akpm-current tree still had its build failure for which I applied
a patch.

The akpm tree lost a patch that turned up elsewhere.

Non-merge commits (relative to Linus' tree): 8570
 8741 files changed, 353342 insertions(+), 232006 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 256 trees (counting Linus' and 44 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (5f615b97cdea Merge tag 'sound-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound)
Merging fixes/master (820bf5c419e4 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi)
Merging kbuild-current/fixes (9059a3493efe kconfig: fix relational operators for bool and tristate symbols)
Merging arc-current/for-curr (b2cd1df66037 Linux 4.15-rc7)
Merging arm-current/fixes (36b0cb84ee85 ARM: 8731/1: Fix csum_partial_copy_from_user() stack mismatch)
Merging m68k-current/for-linus (5e387199c17c m68k/defconfig: Update defconfigs for v4.14-rc7)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (6e032b350cd1 powerpc/powernv: Check device-tree for RFI flush settings)
Merging sparc/master (59585b4be9ae sparc64: repair calling incorrect hweight function from stubs)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files)
Merging net/master (ccc12b11c533 ipv6: sr: fix TLVs not being copied using setsockopt)
Merging bpf/master (bbeb6e4323da bpf, array: fix overflow in max_entries and undefined behavior in index_mask)
Merging ipsec/master (76a420119181 xfrm: Fix a race in the xdst pcpu cache.)
Merging netfilter/master (889c604fd0b5 netfilter: x_tables: fix int overflow in xt_alloc_table_info())
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set)
Merging wireless-drivers/master (49fdde89e2b8 Merge ath-current from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git)
Merging mac80211/master (736a80bbfda7 mac80211: mesh: drop frames appearing to be from us)
Merging rdma-fixes/for-rc (a1ffa4670cb9 IB/srpt: Fix ACL lookup during login)
Merging sound-current/for-linus (b3defb791b26 ALSA: seq: Make ioctls race-free)
Merging pci-current/for-linus (03a551734cfc x86/PCI: Move and shrink AMD 64-bit window to avoid conflict)
Merging driver-core.current/driver-core-linus (30a7acd57389 Linux 4.15-rc6)
Merging tty.current/tty-linus (30a7acd57389 Linux 4.15-rc6)
Merging usb.current/usb-linus (1a2e91e795de Documentation: usb: fix typo in UVC gadgetfs config command)
Merging usb-gadget-fixes/fixes (b2cd1df66037 Linux 4.15-rc7)
Merging usb-serial-fixes/usb-linus (d14ac576d10f USB: serial: cp210x: add new device ID ELV ALC 8xxx)
Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: fix ulpi-node lookup)
Merging phy/fixes (2b88212c4cc6 phy: rcar-gen3-usb2: select USB_COMMON)
Merging staging.current/stagi

Re: [PATCH v5 5/6] iommu/arm-smmu: Add support for qcom,smmu-v2 variant

2018-01-11 Thread Vivek Gautam

Hi Rob,


On 01/12/2018 03:53 AM, Rob Herring wrote:

On Tue, Jan 09, 2018 at 03:31:48PM +0530, Vivek Gautam wrote:

qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
clock and power requirements. This smmu core is used with
multiple masters on msm8996, viz. mdss, video, etc.
Add bindings for the same.

Signed-off-by: Vivek Gautam 
---

  * Major change in this patch -
Changed compatible string from 'qcom,msm8996-smmu-v2' to
'qcom,smmu-v2' to reflect the IP version rather than the
platform on which it is used.

The bugs and how things are connected are all the same? I'd suggest you
keep both strings.


Sure,
compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";




The same IP is used across multiple platforms including msm8996,
and sdm845 etc.

But for only 2 or so platforms a fallback is not really worth it. You'll
probably be on SMMUv3 before too long...

Right. There's msm8998 as well, but as you said keeping both strings
will make more sense.
Thanks.

Best regards
Vivek

[snip]

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [RFC 1/2] softirq: Defer net rx/tx processing to ksoftirqd context

2018-01-11 Thread Mike Galbraith
On Thu, 2018-01-11 at 12:22 -0800, Linus Torvalds wrote:
> On Thu, Jan 11, 2018 at 12:16 PM, Eric Dumazet  wrote:
> >
> > Note that when I implemented TCP Small queues, I did experiments between
> > using a work queue or a tasklet, and workqueues added unacceptable P99
> > latencies, when many user threads are competing with kernel threads.
> 
> Yes.
> 
> So I think one solution might be to have a hybrid system, where we do
> the softirq's synchronously normally (which is what you really want
> for good latency).
> 
> But then fall down on a threaded model - but that fallback case should
> be per-softirq, not global. So if one softirq uses a lot of CPU time,
> that shouldn't affect the latency of other softirqs.
> 
> So maybe we could get rid of the per-cpu ksoftirqd entirely, and
> replace it with per-cpu and per-softirq workqueues?

How would that be better than what RT used to do, and I still do for my
RT kernels via boot option, namely split ksoftirqd into per-softirq
threads.

-Mike


Re: [PATCH 1/5] x86/ibrs: Introduce native_rdmsrl, and native_wrmsrl

2018-01-11 Thread Dave Hansen
On 01/11/2018 07:01 PM, Raj, Ashok wrote:
> On Thu, Jan 11, 2018 at 06:20:13PM -0800, Andy Lutomirski wrote:
>> On Thu, Jan 11, 2018 at 5:52 PM, Raj, Ashok  wrote:

>>>> What's wrong with native_read_msr()?
>>>
>>> Yes, i think i should have added to msr.h. The names didn't read as a
>>> pair, one was native_read_msr, wrmsrl could be taken over when paravirt is
>>> defined?
>>
>> Why do you need to override paravirt?
> 
> The idea was since these MSR's are passed through we shouldn't need to 
> handle them any differently. Also it's best to do this as soon as possible
> and avoid longer paths to get this barrier to hardware.

We were also worried about the indirect calls that are part of the
paravirt interfaces when retpolines are not in place.



Re: [PATCH] mm: ratelimit end_swap_bio_write() error

2018-01-11 Thread Sergey Senozhatsky
On (01/08/18 19:22), Sergey Senozhatsky wrote:
[..]
> > Your changelog is rather modest on the information.
> 
> fair point!
> 
> > Could you be more specific on how the problem actually happens and how
> > likely it is?
> 
> ok. so what we have is
> 
>   slow_path / swap-out page
>__zram_bvec_write(page)
> compressed_page = zcomp_compress(page)
>  zs_malloc(compressed_page)
>   // no available zspage found, need to allocate new
>alloc_zspage()
>{
>   for (i = 0; i < class->pages_per_zspage; i++)
>   page = alloc_page(gfp);
>   if (!page)
>   return NULL
>}
> 
>return -ENOMEM
>   ...
>   printk("Write-error on swap-device...");
> 
> 
> zspage-s can consist of up to ->pages_per_zspage normal pages.
> if alloc_page() fails then we can't allocate the entire zspage,
> so we can't store the swapped out page, so it remains in ram
> and we don't make any progress. so we try to swap another page
> and may be do the whole zs_malloc()->alloc_zspage() again, may
> be not. depending on how bad the OOM situation is there can be
> few or many "Write-error on swap-device" errors.
> 
> > And again, I do not think the throttling is an appropriate counter
> > measure. We do want to print those messages when a critical situation
> > happens. If we have a fallback then simply do not print at all.
> 
> sure, but with the ratelimited printk we still print those messages.
> we just don't print it for every single page we failed to write
> to the device. the existing error messages can (*sometimes*) be noisy
> and not very informative - "Write-error on swap-device (%u:%u:%llu)\n";
> it's not like 1000 of those tell more than 1 or 10.

Michal, does that make sense? with the updated/reworked commit
message will the patch be good enough?

-ss
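
The effect of rate limiting described above (still reporting the error, just not once per failed page) can be sketched in userspace C. This is an illustration under assumptions: struct ratelimit and ratelimit_ok() are hypothetical names, the caller supplies the time explicitly, and the interval and burst values in the usage note are illustrative rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rate limiter: at most 'burst' messages per 'interval'. */
struct ratelimit {
	uint64_t interval;	/* window length, in caller-chosen time units */
	int burst;		/* messages allowed per window */
	uint64_t begin;		/* start of the current window */
	int printed;		/* messages let through in this window */
	int missed;		/* messages suppressed so far */
};

/* Return 1 if a message may be printed at time 'now', 0 if it is suppressed. */
static int ratelimit_ok(struct ratelimit *rl, uint64_t now)
{
	assert(rl->burst > 0);

	/* Open a new window once the current one has expired. */
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return 1;
	}

	rl->missed++;
	return 0;
}
```

With an interval of 5000 time units and a burst of 10, a run of 1000 back-to-back write errors inside one window lets 10 messages through and counts 990 as missed, and the first error of the next window is reported again.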


[x86-tip] RSDP changes converted i4790 box SMP -> UP

2018-01-11 Thread Mike Galbraith
Hi Juergen,

Yesterday I wanted to test the RETPOLINE stuff in tip and tip-rt, but
discovered instead that my box had turned into a complete slug, not due
to incredible RETPOLINE overhead, rather because box had forgotten that
it had more than one CPU.  I was going to leave it for the weekend, but
firing up gitk over morning java, I noticed the commits below, and sure
enough, that's what broke my box.  Given other people's boxen work,
seems likely that the authors of the AMI BIOS in this box were a bit
more creative than usual.

commit 9ede5d5e672586e016eadd5c0aedb6f12e660029 (HEAD -> x86-tip)
Author: Mike Galbraith 
Date:   Fri Jan 12 04:17:52 2018 +0100

Revert "x86/boot: Add the ACPI RSDP address to struct setup_header::acpi_rdsp_addr"

This reverts commit 2f74cbf947f45fa082dda8eac1a1f1299a372f49.

commit 0d0b6a9a0d452eaf635580fce8319d49be8b45ed
Author: Mike Galbraith 
Date:   Fri Jan 12 04:17:30 2018 +0100

Revert "x86/acpi: Take the RSDP address for boot parameters if available"

This reverts commit 0c89cf36424f7c1177de8a5712514d7cc2eb369f.

commit 2e9b091da44e75f16b8bad776f40de83abb637ff (origin/x86-tip)
Merge: 41c34211458c 1ccb8feda747
Author: Ingo Molnar 
Date:   Thu Jan 11 06:54:18 2018 +0100

Merge branch 'perf/core'




Re: linux-next: build failure after merge of the net-next tree

2018-01-11 Thread Alexei Starovoitov
On Thu, Jan 11, 2018 at 10:11:45PM -0500, David Miller wrote:
> From: Alexei Starovoitov 
> Date: Wed, 10 Jan 2018 17:58:54 -0800
> 
> > On Thu, Jan 11, 2018 at 11:53:55AM +1100, Stephen Rothwell wrote:
> >> Hi all,
> >> 
> >> After merging the net-next tree, today's linux-next build (x86_64
> >> allmodconfig) failed like this:
> >> 
> >> kernel/bpf/verifier.o: In function `bpf_check':
> >> verifier.c:(.text+0xd86e): undefined reference to `bpf_patch_call_args'
> >> 
> >> Caused by commit
> >> 
> >>   1ea47e01ad6e ("bpf: add support for bpf_call to interpreter")
> >> 
> >> interacting with commit
> >> 
> >>   290af86629b2 ("bpf: introduce BPF_JIT_ALWAYS_ON config")
> >> 
> >> from the bpf and net trees.
> >> 
> >> I have just reverted commit 290af86629b2 for today.  A better solution
> >> would be nice (like fixing this in a merge between the net-next and net
> >> trees).
> > 
> > that's due to 'endif' from 290af86629b2 needs to be moved above
> > bpf_patch_call_args() definition.
> 
> That doesn't fix it, because then you'd need to expose
> interpreters_args as well and obviously that can't be right.
> 
> Instead, we should never call bpf_patch_call_args() when JIT always on
> is enabled.  So if we fail to JIT the subprogs we should fail
> immediately.

right, as I was trying to say one extra hunk would be needed for net-next.
I was reading this patch:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a2b211262c25..ca80559c4ec3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5267,7 +5267,11 @@ static int fixup_call_args(struct bpf_verifier_env *env)
depth = get_callee_stack_depth(env, insn, i);
if (depth < 0)
return depth;
+#ifdef CONFIG_BPF_JIT_ALWAYS_ON
+   return -ENOTSUPP;
+#else
bpf_patch_call_args(insn, depth);
+#endif
}
return 0;

but below should be fine too.
Will test it asap.

> This is the net --> net-next merge resolution I am about to use to fix
> this:
> 
> ...
>  +static int fixup_call_args(struct bpf_verifier_env *env)
>  +{
>  +struct bpf_prog *prog = env->prog;
>  +struct bpf_insn *insn = prog->insnsi;
> - int i, depth;
> ++int i, depth, err;
>  +
> - if (env->prog->jit_requested)
> - if (jit_subprogs(env) == 0)
> ++err = 0;
> ++if (env->prog->jit_requested) {
> ++err = jit_subprogs(env);
> ++if (err == 0)
>  +return 0;
> - 
> ++}
> ++#ifndef CONFIG_BPF_JIT_ALWAYS_ON
>  +for (i = 0; i < prog->len; i++, insn++) {
>  +if (insn->code != (BPF_JMP | BPF_CALL) ||
>  +insn->src_reg != BPF_PSEUDO_CALL)
>  +continue;
>  +depth = get_callee_stack_depth(env, insn, i);
>  +if (depth < 0)
>  +return depth;
>  +bpf_patch_call_args(insn, depth);
>  +}
> - return 0;
> ++err = 0;
> ++#endif
> ++return err;
>  +}
>  +
>   /* fixup insn->imm field of bpf_call instructions
>* and inline eligible helpers as explicit sequence of BPF instructions
>*


Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Steven Rostedt
On Thu, 11 Jan 2018 21:55:47 -0500
Steven Rostedt  wrote:

> I ran this on a box with 4 CPUs and a serial console (so it has a slow
> console). Again, all I have is each CPU doing exactly ONE printk()!
> then sleeping for a full millisecond! It will cause a lot of output,
> and perhaps slow the system down. But it should not lock up the system.
> But without my patch, it does!

I decided to see how this works without a slow serial console. So I
rebooted the box and enabled hyper-threading (doubling the number of
CPUs to 8), and then ran this module, with serial disabled.

As expected, it did not lock up. That's because there was only a single
console (VGA) and it is fast enough to keep up. Especially, since I
have a 1 millisecond sleep between printks.

But I ran the function_graph tracer to see what was happening. Here's
the unpatched case. It didn't take long to see a single CPU suffering
(and this is with a fast console!)

 kworker/1:2-309   [001]  78.60: funcgraph_entry:              |  printk() {
 kworker/7:1-176   [007]  78.62: funcgraph_entry:              |  printk() {
 kworker/3:1-72    [003]  78.62: funcgraph_entry:              |  printk() {
 kworker/7:1-176   [007]  78.68: funcgraph_exit: 4.528 us   |  }
 kworker/3:1-72    [003]  78.69: funcgraph_exit: 5.875 us   |  }
 kworker/0:0-3     [000]  78.678745: funcgraph_entry:              |  printk() {
 kworker/5:1-78    [005]  78.678749: funcgraph_entry:              |  printk() {
 kworker/4:1-73    [004]  78.678751: funcgraph_entry:              |  printk() {
 kworker/0:0-3     [000]  78.678752: funcgraph_exit: 4.893 us   |  }
 kworker/5:1-78    [005]  78.678754: funcgraph_exit: 4.287 us   |  }
 kworker/4:1-73    [004]  78.678756: funcgraph_exit: 3.964 us   |  }
 kworker/6:1-147   [006]  78.679751: funcgraph_entry:              |  printk() {
 kworker/2:3-1295  [002]  78.679753: funcgraph_entry:              |  printk() {
 kworker/6:1-147   [006]  78.679767: funcgraph_exit:   + 13.735 us  |  }
 kworker/2:3-1295  [002]  78.679768: funcgraph_exit:   + 14.318 us  |  }
 kworker/7:1-176   [007]  78.680751: funcgraph_entry:              |  printk() {
 kworker/3:1-72    [003]  78.680753: funcgraph_entry:              |  printk() {
 kworker/7:1-176   [007]  78.680756: funcgraph_exit: 3.981 us   |  }
 kworker/3:1-72    [003]  78.680757: funcgraph_exit: 3.499 us   |  }
 kworker/5:1-78    [005]  78.681734: funcgraph_entry: 3.388 us   |  printk();
 kworker/4:1-73    [004]  78.681752: funcgraph_entry:              |  printk() {
 kworker/0:0-3     [000]  78.681753: funcgraph_entry:              |  printk() {
 kworker/4:1-73    [004]  78.681756: funcgraph_exit: 3.009 us   |  }
 kworker/0:0-3     [000]  78.681757: funcgraph_exit: 3.708 us   |  }
 kworker/2:3-1295  [002]  78.682742: funcgraph_entry:              |  printk() {
 kworker/6:1-147   [006]  78.682746: funcgraph_entry:              |  printk() {
 kworker/2:3-1295  [002]  78.682749: funcgraph_exit: 4.548 us   |  }
 kworker/6:1-147   [006]  78.682750: funcgraph_exit: 3.001 us   |  }
 kworker/3:1-72    [003]  78.683751: funcgraph_entry:              |  printk() {
 kworker/7:1-176   [007]  78.683753: funcgraph_entry:              |  printk() {
 kworker/3:1-72    [003]  78.683756: funcgraph_exit: 3.869 us   |  }
 kworker/7:1-176   [007]  78.683757: funcgraph_exit: 4.300 us   |  }
 kworker/5:1-78    [005]  78.684736: funcgraph_entry: 2.074 us   |  printk();
 kworker/4:1-73    [004]  78.684755: funcgraph_entry:              |  printk() {
 kworker/0:0-3     [000]  78.684755: funcgraph_entry: 3.065 us   |  printk();
 kworker/4:1-73    [004]  78.684760: funcgraph_exit: 4.091 us   |  }
 kworker/6:1-147   [006]  78.685744: funcgraph_entry:              |  printk() {
 kworker/2:3-1295  [002]  78.685744: funcgraph_entry: 4.616 us   |  printk();
 kworker/6:1-147   [006]  78.685752: funcgraph_exit: 5.943 us   |  }
 kworker/7:1-176   [007]  78.686763: funcgraph_entry:              |  printk() {
 kworker/3:1-72    [003]  78.686767: funcgraph_entry:              |  printk() {
 kworker/7:1-176   [007]  78.686770: funcgraph_exit: 4.570 us   |  }
 kworker/3:1-72    [003]  78.686771: funcgraph_exit: 3.262 us   |  }
 kworker/1:2-309   [001]  78.687626: funcgraph_exit:   # 9854.982 us |  }


CPU 1 was stuck for nearly 10 milliseconds (9854.982 us) doing nothing
but handling printk. And this is without a serial or slow console.

With a patched kernel:

 kw

Re: [PATCH 3/3] tracing: don't set parser->cont if it has reached the end of input buffer

2018-01-11 Thread Du, Changbin

Hi Rostedt,
On Tue, Jan 09, 2018 at 11:19:36PM -0500, Steven Rostedt wrote:
> On Wed, 10 Jan 2018 11:18:23 +0800
> "Du, Changbin"  wrote:
> 
> > write(3, "abcdefg", 7)  
> > > 
> > > From my point of view, the above isn't done writing the function name
> > > yet and we SHOULD continue waiting for more input.
> > >   
> > hmm, thanks for the background. Your above case is a postive use case. So by
> > this design, instead of write(3, "abcdefg", 7), it should be
> > write(3, "abcdefg\0", 8), right?
> 
> BTW, gcc would translate the above string to 'abcdefg\0\0'. When
> defining strings with "", gcc (and all C compilers) append a '\0' to
> the end.
> 
I should clarify the notation here first. :) All the strings here express
the entire content of a string buffer, including the compiler-appended '\0'
(just like the output of 'strace').
If this description is still not clear, please let me know!

> But I replied to the first patch, saying that allowing \0 as whitespace
> may be OK, given the usecase I showed.
> 
> > 
> > If true, it means kernel expect userspace write every string terminated with
> > '\0'. So to fix this issue:
> > open("/sys/kernel/debug/tracing//set_ftrace_pid", O_WRONLY|O_TRUNC) = 3
> > write(3, " \0", 2)  = -1 EINVAL (Invalid argument)
> > 
> > Fix would be:
> > write(3, "\0", 1)?
> > 
> > So far, I am still confused. Some of the tracing debugfs entry accept '\0'
> > while some not. AFIK, 'echo xxx > ' always has a '\0'
> > terminated.
> 
> I don't believe that's true.
> 
>  $ echo -n abc > /tmp/abc
>  $ wc /tmp/abc
>  0 1 3 /tmp/abc
> 
> Echo writes only the characters you put on the line, nothing more.
> 
Sorry, I misunderstood it. The extra character is '\n'.
  $ echo abc > set_ftrace_filter
0.000 probe:ftrace_filter_write_line0:(a7b8db80) ubuf=0xc77408 cnt=0x4)
  $ echo -n abc > set_ftrace_filter
8889.832 probe:ftrace_filter_write_line0:(a7b8db80) ubuf=0xc77408 cnt=0x3)

> Note, when the file descriptor is closed, the code also executes on
> what was written but not terminated. That is:
> 
>   write(fd, "abc", 3);
>   close(fd);
> 
> Will keep the "abc" in the continue buffer, but the closing of the file
> descriptor will flush it, and execute it.
> 
Thanks, so now I understand the corner case below. Userspace tries to set the
filter with an unrecognized symbol name (e.g. "abcdefg").
open("/sys/kernel/debug/tracing/set_ftrace_filter", O_WRONLY|O_TRUNC) = 3
write(3, "abcdefg", 7)

Since "abcdefg" is not in the symbol list, we would expect the write to return
-EINVAL, right? As below:
# echo abcdefg > set_ftrace_filter
bash: echo: write error: Invalid argument

But the above mechanism hides the error. It returns success even though no
filter is applied at all.
# echo -n abcdefg > set_ftrace_filter

I think in this case the kernel may require userspace to append a '\0' or a
space to the string buffer so everything can work.

Also there is another corner case. The write below doesn't work.
open("/sys/kernel/debug/tracing//set_ftrace_pid", O_WRONLY|O_TRUNC) = 3
write(3, " \0", 2)  = -1 EINVAL (Invalid argument)

While these work:
# echo "" > set_ftrace_pid
# echo " " > set_ftrace_pid
# echo -n " " > set_ftrace_pid

This is the reason why I think '\0' should be recognized by the parser.
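
To make the behaviour under discussion concrete, here is a minimal userspace
sketch of a token parser that treats '\0' as one more delimiter. It is not the
kernel's parser; struct token_parser, parser_feed() and TOKEN_MAX are
hypothetical names invented for this illustration, and only the first token of
each write is handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOKEN_MAX 64	/* hypothetical limit for this sketch */

struct token_parser {
	char buf[TOKEN_MAX];
	size_t len;	/* bytes buffered so far */
	bool cont;	/* true while a token still waits for a delimiter */
};

/* Delimiters: whitespace, plus '\0' as proposed in this thread. */
static bool is_delim(char c)
{
	return c == ' ' || c == '\t' || c == '\n' || c == '\0';
}

/*
 * Feed one write() worth of data. Returns true and copies the token to
 * 'out' (at least TOKEN_MAX bytes) once a delimiter terminates it.
 * Returns false while more input is needed; 'cont' then tells whether a
 * partial token is being held back.
 */
static bool parser_feed(struct token_parser *p, const char *data, size_t cnt,
			char *out)
{
	for (size_t i = 0; i < cnt; i++) {
		if (is_delim(data[i])) {
			if (p->len == 0)
				continue;	/* skip leading delimiters */
			memcpy(out, p->buf, p->len);
			out[p->len] = '\0';
			p->len = 0;
			p->cont = false;
			return true;
		}
		if (p->len < TOKEN_MAX - 1)
			p->buf[p->len++] = data[i];
	}
	p->cont = p->len > 0;
	return false;
}
```

With this rule, write(3, "abcdefg", 7) leaves the token pending until a
delimiter (or, in the kernel, the close of the file descriptor) arrives, while
write(3, " \0", 2) is simply seen as empty input rather than an invalid token.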

> -- Steve

-- 
Thanks,
Changbin Du


Re: [PATCH] selftests/x86: Add test_vsyscall

2018-01-11 Thread Kees Cook
On Thu, Jan 11, 2018 at 5:16 PM, Andy Lutomirski  wrote:
> This tests that the vsyscall entries do what they're expected to do.
> It also confirms that attempts to read the vsyscall page behave as
> expected.
>
> If changes are made to the vsyscall code or its memory map handling,
> running this test in all three of vsyscall=none, vsyscall=emulate,
> and vsyscall=native are helpful.
>
> (Because it's easy, this also compares the vsyscall results to their
>  vDSO equivalents.)
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Andy Lutomirski 

Acked-by: Kees Cook 
Tested-by: Kees Cook 

(random note: if you're crazy and built with CONFIG_COMPAT_VDSO, this
doesn't actually fail, it just kind of limps along with warnings and
"RUN" but no "OK".)

-Kees

> ---
>
> Note to KAISER backporters: please test this under all three
> vsyscall modes.  Also, in the emulate and native modes, make sure
> that test_vsyscall_64 agrees with the command line or config
> option as to which mode you're in.  It's quite easy to mess up
> the kernel such that native mode accidentally emulates
> or vice versa.
>
> Greg, etc: please backport this to all your Meltdown-patched
> kernels.  It'll help make sure the patches didn't regress
> vsyscalls.
>
> Changes from RFC version:
>  - Doesn't warn on 32-bit
>  - Detects native vs emulate
>
> tools/testing/selftests/x86/Makefile|   2 +-
>  tools/testing/selftests/x86/test_vsyscall.c | 500 
> 
>  2 files changed, 501 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/x86/test_vsyscall.c
>
> diff --git a/tools/testing/selftests/x86/Makefile 
> b/tools/testing/selftests/x86/Makefile
> index 939a337128db..5d4f10ac2af2 100644
> --- a/tools/testing/selftests/x86/Makefile
> +++ b/tools/testing/selftests/x86/Makefile
> @@ -7,7 +7,7 @@ include ../lib.mk
>
>  TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt 
> ptrace_syscall test_mremap_vdso \
> check_initial_reg_state sigreturn ldt_gdt iopl 
> mpx-mini-test ioperm \
> -   protection_keys test_vdso
> +   protection_keys test_vdso test_vsyscall
>  TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso 
> unwind_vdso \
> test_FCMOV test_FCOMI test_FISTTP \
> vdso_restorer
> diff --git a/tools/testing/selftests/x86/test_vsyscall.c 
> b/tools/testing/selftests/x86/test_vsyscall.c
> new file mode 100644
> index ..7a744fa7b786
> --- /dev/null
> +++ b/tools/testing/selftests/x86/test_vsyscall.c
> @@ -0,0 +1,500 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#define _GNU_SOURCE
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#ifdef __x86_64__
> +# define VSYS(x) (x)
> +#else
> +# define VSYS(x) 0
> +#endif
> +
> +#ifndef SYS_getcpu
> +# ifdef __x86_64__
> +#  define SYS_getcpu 309
> +# else
> +#  define SYS_getcpu 318
> +# endif
> +#endif
> +
> +static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
> +  int flags)
> +{
> +   struct sigaction sa;
> +   memset(&sa, 0, sizeof(sa));
> +   sa.sa_sigaction = handler;
> +   sa.sa_flags = SA_SIGINFO | flags;
> +   sigemptyset(&sa.sa_mask);
> +   if (sigaction(sig, &sa, 0))
> +   err(1, "sigaction");
> +}
> +
> +/* vsyscalls and vDSO */
> +bool should_read_vsyscall = false;
> +
> +typedef long (*gtod_t)(struct timeval *tv, struct timezone *tz);
> +gtod_t vgtod = (gtod_t)VSYS(0xff60);
> +gtod_t vdso_gtod;
> +
> +typedef int (*vgettime_t)(clockid_t, struct timespec *);
> +vgettime_t vdso_gettime;
> +
> +typedef long (*time_func_t)(time_t *t);
> +time_func_t vtime = (time_func_t)VSYS(0xff600400);
> +time_func_t vdso_time;
> +
> +typedef long (*getcpu_t)(unsigned *, unsigned *, void *);
> +getcpu_t vgetcpu = (getcpu_t)VSYS(0xff600800);
> +getcpu_t vdso_getcpu;
> +
> +static void init_vdso(void)
> +{
> +   void *vdso = dlopen("linux-vdso.so.1", RTLD_LAZY | RTLD_LOCAL | 
> RTLD_NOLOAD);
> +   if (!vdso)
> +   vdso = dlopen("linux-gate.so.1", RTLD_LAZY | RTLD_LOCAL | 
> RTLD_NOLOAD);
> +   if (!vdso) {
> +   printf("[WARN]\tfailed to find vDSO\n");
> +   return;
> +   }
> +
> +   vdso_gtod = (gtod_t)dlsym(vdso, "__vdso_gettimeofday");
> +   if (!vdso_gtod)
> +   printf("[WARN]\tfailed to find gettimeofday in vDSO\n");
> +
> +   vdso_gettime = (vgettime_t)dlsym(vdso, "__vdso_clock_gettime");
> +   if (!vdso_gettime)
> +   printf("[WARN]\tfailed to find clock_gettime in vDSO\n");
> +
> +   vdso_time = (time_func_t)dlsym(vdso, "__vdso_time");
> +   if (!vdso_time)
> +   printf("[WARN]\tfailed to find time in vDSO\n");
> 

[ANNOUNCE] Git v2.16.0-rc2

2018-01-11 Thread Junio C Hamano
A release candidate Git v2.16.0-rc2 is now available for testing
at the usual places.  It is comprised of 483 non-merge commits
since v2.15.0, contributed by 80 people, 23 of which are new faces.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the
'v2.16.0-rc2' tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.15.0 are as follows.
Welcome to the Git development community!

  Albert Astals Cid, Antoine Beaupré, Damien Marié, Daniel
  Bensoussan, Florian Klink, Gennady Kupava, Guillaume Castagnino,
  Haaris Mehmood, Hans Jerry Illikainen, Ingo Ruhnke, Jakub
  Bereżański, Jean Carlo Machado, Julien Dusser, J Wyman,
  Kevin, Łukasz Stelmach, Marius Paliga, Olga Telezhnaya,
  Rafael Ascensão, Robert Abel, Robert P. J. Day, Shuyu Wei,
  and Wei Shuyu.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Dinwoodie, Ævar Arnfjörð Bjarmason, Alex Vandiver,
  Anders Kaseorg, Andrey Okoshkin, Ann T Ropea, Beat Bolli,
  Ben Peart, Brandon Williams, brian m. carlson, Carlos Martín
  Nieto, Charles Bailey, Christian Couder, Dave Borowitz, Dennis
  Kaarsemaker, Derrick Stolee, Elijah Newren, Emily Xie, Eric
  Sunshine, Eric Wong, Heiko Voigt, Jacob Keller, Jameson Miller,
  Jean-Noel Avila, Jeff Hostetler, Jeff King, Johannes Schindelin,
  Jonathan Nieder, Jonathan Tan, Junio C Hamano, Kaartic Sivaraam,
  Kevin Daudt, Lars Schneider, Liam Beguin, Luke Diamand, Martin
  Ågren, Michael Haggerty, Nicolas Morey-Chaisemartin, Phil Hord,
  Phillip Wood, Pranit Bauva, Prathamesh Chavan, Ralf Thielow,
  Ramsay Jones, Randall S. Becker, Rasmus Villemoes, René Scharfe,
  Simon Ruderich, Stefan Beller, Steffen Prohaska, Stephan Beyer,
  SZEDER Gábor, Thomas Braun, Thomas Gummerer, Todd Zullinger,
  Torsten Bögershausen, and W. Trevor King.



Git 2.16 Release Notes (draft)
==============================

Backward compatibility notes and other notable changes.

 * Use of an empty string as a pathspec element that is used for
   'everything matches' is now an error.


Updates since v2.15
-------------------

UI, Workflows & Features

 * An empty string as a pathspec element that means "everything"
   i.e. 'git add ""', is now illegal.  We started this by first
   deprecating and warning a pathspec that has such an element in
   2.11 (Nov 2016).

 * A hook script that is set unexecutable is simply ignored.  Git
   notifies when such a file is ignored, unless the message is
   squelched via advice.ignoredHook configuration.

 * "git pull" has been taught to accept "--[no-]signoff" option and
   pass it down to "git merge".

 * The "--push-option=" option to "git push" now defaults to a
   list of strings configured via push.pushOption variable.

 * "gitweb" checks if a directory is searchable with Perl's "-x"
   operator, which can be enhanced by using "filetest 'access'"
   pragma, which now we do.

 * "git stash save" has been deprecated in favour of "git stash push".

 * The set of paths output from "git status --ignored" was tied
   closely with its "--untracked=" option, but now it can be
   controlled more flexibly.  Most notably, a directory that is
   ignored because it is listed to be ignored in the ignore/exclude
   mechanism can be handled differently from a directory that ends up
   to be ignored only because all files in it are ignored.

 * The remote-helper for talking to MediaWiki has been updated to
   truncate an overlong pagename so that ".mw" suffix can still be
   added.

 * The remote-helper for talking to MediaWiki has been updated to
   work with mediawiki namespaces.

 * The "--format=..." option "git for-each-ref" takes learned to show
   the name of the 'remote' repository and the ref at the remote side
   that is affected for 'upstream' and 'push' via "%(push:remotename)"
   and friends.

 * Doc and message updates to teach users "bisect view" is a synonym
   for "bisect visualize".

 * "git bisect run" that did not specify any command to run used to go
   ahead and treated all commits to be tested as 'good'.  This has
   been corrected by making the command error out.

 * The SubmittingPatches document has been converted to produce an
   HTML version via AsciiDoc/Asciidoctor.

 * We learned to talk to watchman to speed up "git status" and other
   operations that need to see which paths have been modified.

 * The "diff" family of commands learned to ignore differences in
   carriage return at the end of line.

 * Places that know about "sendemail.to", like documentation and shell
   completion (in contrib/) have been taught about "sendemail.tocmd",
   too.

 * "git add --renormalize ." is a new and safer way to record

Re: [PATCH] drm/virtio: Add window server support

2018-01-11 Thread Dave Airlie
>
> this work is based on the virtio_wl driver in the ChromeOS kernel by
> Zach Reizner, currently at:
>
> https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-4.4/drivers/virtio/virtio_wl.c
>
> There's two features missing in this patch when compared with virtio_wl:
>
> * Allow the guest access directly host memory, without having to resort
> to TRANSFER_TO_HOST
>
> * Pass FDs from host to guest (Wayland specifies that the compositor
> shares keyboard data with the guest via a shared buffer)
>
> I plan to work on this next, but I would like to get some comments on
> the general approach so I can better choose which patch to follow.

Shouldn't qemu expose some kind of capability to enable this so we know to
look for the extra vqs?

What happens if you run this on plain qemu, does it fall back correctly?

Are there any scenarios where we don't want to expose this API because there
is nothing to back it?

Dave.


Re: [PATCH v2 09/22] mmc: tmio: use mmc_can_gpio_cd() instead of checking TMIO_MMC_USE_GPIO_CD

2018-01-11 Thread Masahiro Yamada
Hi Ulf,


2018-01-02 21:56 GMT+09:00 Wolfram Sang :
> On Sat, Nov 25, 2017 at 01:24:44AM +0900, Masahiro Yamada wrote:
>> To use a GPIO line for card detection, TMIO_MMC_USE_GPIO_CD is set
>> by a legacy board (arch/sh/boards/mach-ecovec24).
>>
>> For DT platforms, the "cd-gpios" property is a legitimate way for that
>> in case the IP-builtin card detection can not be used for some reason.
>> mmc_of_parse() calls mmc_gpiod_request_cd() to set up ctx->cd_gpio if
>> the "cd-gpios" property is specified.
>>
>> To cater to both cases, mmc_can_gpio_cd() is a correct way to check
>> which card detection logic is used.
>>
>> Signed-off-by: Masahiro Yamada 
>
> This patch is correct, yet needed some time for testing because it
> inverts the results for R-Car SoCs. Again, it is correct that it inverts
> it because those SoCs have GPIOs defined in their devicetrees, so they
> shouldn't be using native hotplug. Still, this meant checking that no
> regression gets introduced. Also, for R-Car Gen 2 & 3 native hotplug
> seems to work fine, so I was trying to find out why we use GPIOs here. I
> wasn't successful up to now, but since GPIOs work well, too, and seem to
> react a bit faster even, I am fine with the patch being merged.
>
> Reviewed-by: Wolfram Sang 
>
>> ---
>>
>> Changes in v2: None
>>
>>  drivers/mmc/host/tmio_mmc_core.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mmc/host/tmio_mmc_core.c 
>> b/drivers/mmc/host/tmio_mmc_core.c
>> index efffb04..610f26f 100644
>> --- a/drivers/mmc/host/tmio_mmc_core.c
>> +++ b/drivers/mmc/host/tmio_mmc_core.c
>> @@ -1232,7 +1232,7 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host,
>>   }
>>   mmc->max_seg_size = mmc->max_req_size;
>>
>> - _host->native_hotplug = !(pdata->flags & TMIO_MMC_USE_GPIO_CD ||
>> + _host->native_hotplug = !(mmc_can_gpio_cd(mmc) ||
>> mmc->caps & MMC_CAP_NEEDS_POLL ||
>> !mmc_card_is_removable(mmc));
>>
>> --
>> 2.7.4
>>

Wolfram issued Reviewed-by.

Could you pick up this patch for -next?

-- 
Best Regards
Masahiro Yamada


[PATCH] usb: dwc3: core: power on PHYs before initializing core

2018-01-11 Thread William Wu
dwc3_core_init() gets the PHYs and initializes them with the
usb_phy_init() and phy_init() functions before initializing the
core, and powers on the PHYs after core initialization is done.

However, on some platforms (e.g. Rockchip RK3399 DWC3 with Type-C
USB3 PHY), special handling is needed when powering on the Type-C
PHY before initializing the DWC3 core. This is because the RK3399
Type-C PHY requires the DWC3 controller to be held in reset to keep
the PIPE power state in P2 while the Type-C PHY is being configured;
otherwise, waiting for PIPE ready may time out. In this case, if we
power on the PHYs after DWC3 core initialization is done, the core
will be reset to an uninitialized state when the PHYs are powered on.

Fix this by powering on the PHYs before initializing the core.
Because the GUID register may also be reset in this case, configure
the GUID register after powering on the PHYs.

Signed-off-by: William Wu 
---
 drivers/usb/dwc3/core.c | 46 ++
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index c32d2b9..4f5573f 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -741,12 +741,6 @@ static int dwc3_core_init(struct dwc3 *dwc)
goto err0;
}
 
-   /*
-* Write Linux Version Code to our GUID register so it's easy to figure
-* out which kernel version a bug was found.
-*/
-   dwc3_writel(dwc->regs, DWC3_GUID, LINUX_VERSION_CODE);
-
/* Handle USB2.0-only core configuration */
if (DWC3_GHWPARAMS3_SSPHY_IFC(dwc->hwparams.hwparams3) ==
DWC3_GHWPARAMS3_SSPHY_IFC_DIS) {
@@ -762,34 +756,40 @@ static int dwc3_core_init(struct dwc3 *dwc)
if (ret)
goto err0;
 
+   usb_phy_set_suspend(dwc->usb2_phy, 0);
+   usb_phy_set_suspend(dwc->usb3_phy, 0);
+   ret = phy_power_on(dwc->usb2_generic_phy);
+   if (ret < 0)
+   goto err1;
+
+   ret = phy_power_on(dwc->usb3_generic_phy);
+   if (ret < 0)
+   goto err2;
+
ret = dwc3_phy_setup(dwc);
if (ret)
-   goto err0;
+   goto err3;
+
+   /*
+* Write Linux Version Code to our GUID register so it's easy to figure
+* out which kernel version a bug was found.
+*/
+   dwc3_writel(dwc->regs, DWC3_GUID, LINUX_VERSION_CODE);
 
dwc3_core_setup_global_control(dwc);
dwc3_core_num_eps(dwc);
 
ret = dwc3_setup_scratch_buffers(dwc);
if (ret)
-   goto err1;
+   goto err3;
 
/* Adjust Frame Length */
dwc3_frame_length_adjustment(dwc);
 
-   usb_phy_set_suspend(dwc->usb2_phy, 0);
-   usb_phy_set_suspend(dwc->usb3_phy, 0);
-   ret = phy_power_on(dwc->usb2_generic_phy);
-   if (ret < 0)
-   goto err2;
-
-   ret = phy_power_on(dwc->usb3_generic_phy);
-   if (ret < 0)
-   goto err3;
-
ret = dwc3_event_buffers_setup(dwc);
if (ret) {
dev_err(dwc->dev, "failed to setup event buffers\n");
-   goto err4;
+   goto err3;
}
 
/*
@@ -821,17 +821,15 @@ static int dwc3_core_init(struct dwc3 *dwc)
 
return 0;
 
-err4:
+err3:
phy_power_off(dwc->usb3_generic_phy);
 
-err3:
+err2:
phy_power_off(dwc->usb2_generic_phy);
 
-err2:
+err1:
usb_phy_set_suspend(dwc->usb2_phy, 1);
usb_phy_set_suspend(dwc->usb3_phy, 1);
-
-err1:
usb_phy_shutdown(dwc->usb2_phy);
usb_phy_shutdown(dwc->usb3_phy);
phy_exit(dwc->usb2_generic_phy);
-- 
2.0.0




Re: [PATCH v1 3/8] x86/entry/clearregs: Clear registers for 64bit SYSCALL

2018-01-11 Thread Josh Poimboeuf
On Tue, Jan 09, 2018 at 05:03:23PM -0800, Andi Kleen wrote:
> From: Andi Kleen 
> 
> We clear all the non argument registers for 64bit SYSCALLs
> to minimize any risk of bad speculation using user values.
> 
> So far unused argument registers still leak. To be addressed
> in future patches.
> 
> Signed-off-by: Andi Kleen 
> ---
>  arch/x86/entry/entry_64.S | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index bbdfbdd817d6..632081fd7086 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -236,6 +236,14 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
>   pushq   %r11/* pt_regs->r11 */
>   sub $(6*8), %rsp
>   SAVE_EXTRA_REGS
> + /* Sanitize registers against speculation attacks */

This comment isn't necessary, though it would be good to add comments
above the CLEAR macros themselves explaining why they're needed.

> + /* r10 is cleared later, arguments are handled in san_args* */

What is san_args?

> + CLEAR_R11_TO_R15
> +#ifndef CONFIG_FRAME_POINTER
> + xor %ebp, %ebp
> +#endif

Why is %rbp not cleared with CONFIG_FRAME_POINTER?  Is it because it
will get clobbered by the first called function?

> + xor %ebx, %ebx
> + xor %ecx, %ecx

I think clearing %ecx isn't needed, it gets clobbered below for the fast
path, and gets clobbered by do_syscall_64() for the slow path.

>  
>   UNWIND_HINT_REGS extra=0
>  
> @@ -263,6 +271,7 @@ entry_SYSCALL_64_fastpath:
>  #endif
>   ja  1f  /* return -ENOSYS (already in 
> pt_regs->ax) */
>   movq%r10, %rcx
> + xor %r10, %r10
>  
>  #ifdef CONFIG_RETPOLINE
>   movqsys_call_table(, %rax, 8), %rax

Now that the fast path is getting slower, I wonder if it still makes
sense to have a "fast path"?  It would be good to see measurements
comparing the fast and slow paths.

-- 
Josh


[PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-11 Thread Jianchao Wang
A customer reported a memory corruption issue on a previous version of
the mlx4_en driver, where order-3 pages and multiple page reference
counting were still used.

We eventually found that one of the root causes is that the HW may see
stale rx_descs because the prod db update reaches the HW before the
rx_desc update. In particular, when crossing an order-3 page boundary
and moving to a new page, the HW may write to pages which may have been
freed and allocated again by others.

To fix it, add a wmb() between the rx_desc and prod db updates to ensure
the order. Even though order-0 pages and page recycling have been
introduced, misordering between rx_desc and prod db could still lead to
corruption of different inbound packets.

Signed-off-by: Jianchao Wang 
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c 
b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 85e28ef..eefa82c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -555,7 +555,7 @@ static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv 
*priv,
break;
ring->prod++;
} while (likely(--missing));
-
+   wmb(); /* ensure rx_desc updating reaches HW before prod db updating */
mlx4_en_update_rx_prod_db(ring);
 }
 
-- 
2.7.4
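
The ordering requirement described in the changelog can be sketched in
userspace C11 as follows. This is only an analogy under stated assumptions:
struct rx_ring, ring_fill() and ring_publish() are hypothetical names, and a
C11 release fence stands in for the kernel's wmb(), which additionally orders
the writes as seen by a DMA-capable device.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical ring size, power of two */

struct rx_ring {
	uint64_t desc[RING_SIZE];	/* stand-in for the rx_desc array */
	uint32_t prod;			/* software producer index */
	_Atomic uint32_t prod_db;	/* stand-in for the doorbell record */
};

/* Fill one descriptor slot; the doorbell is not touched here. */
static void ring_fill(struct rx_ring *ring, uint64_t dma_addr)
{
	ring->desc[ring->prod & (RING_SIZE - 1)] = dma_addr;
	ring->prod++;
}

/*
 * Publish the producer index. The release fence keeps every earlier
 * descriptor write ordered before the doorbell store, which is the
 * role wmb() plays in the patch above.
 */
static void ring_publish(struct rx_ring *ring)
{
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ring->prod_db, ring->prod,
			      memory_order_relaxed);
}
```

Without the fence, the consumer could observe the new producer index while
still reading an old descriptor, which is the stale rx_desc scenario the
changelog describes.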



Re: [PATCH v2 04/19] x86: implement ifence()

2018-01-11 Thread Dan Williams
On Thu, Jan 11, 2018 at 6:27 PM, Eric W. Biederman
 wrote:
>
> Dan Williams  writes:
>
> > The new barrier, 'ifence', ensures that no instructions past the
> > boundary are speculatively executed.
>
> This needs a much better description.
>
> If that description was valid we could add ifence in the syscall
> entry path and not have any speculative execution to worry about in the
> kernel.

True, I'll fix that up.

>
> Perhaps:
> 'ifence', ensures that no speculative execution that reaches the 'ifence'
> boundary continues past the 'ifence' boundary.
>
> > Previously the kernel only needed this fence in 'rdtsc_ordered', but it
> > can also be used as a mitigation against Spectre variant1 attacks that
> > speculative access memory past an array bounds check.
> >
> > 'ifence', via 'ifence_array_ptr', is an opt-in fallback to the default
> > mitigation provided by '__array_ptr'. It is also proposed for blocking
> > speculation in the 'get_user' path to bypass 'access_ok' checks. For
> > now, just provide the common definition for later patches to build
> > upon.
>
> This part of the description is probably unnecessary.

Probably, but having some redundant information in the changelog eases
'git blame' archaeology expeditions in the future.

>
> Eric
>
> >
> > Suggested-by: Peter Zijlstra 
> > Suggested-by: Alan Cox 
> > Cc: Tom Lendacky 
> > Cc: Mark Rutland 
> > Cc: Greg KH 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: "H. Peter Anvin" 
> > Cc: x...@kernel.org
> > Signed-off-by: Elena Reshetova 
> > Signed-off-by: Dan Williams 
> > ---
> >  arch/x86/include/asm/barrier.h |4 
> >  arch/x86/include/asm/msr.h |3 +--
> >  2 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> > index 7fb336210e1b..b04f572d6d97 100644
> > --- a/arch/x86/include/asm/barrier.h
> > +++ b/arch/x86/include/asm/barrier.h
> > @@ -24,6 +24,10 @@
> >  #define wmb()asm volatile("sfence" ::: "memory")
> >  #endif
> >
> > +/* prevent speculative execution past this barrier */
> > +#define ifence() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
> > +"lfence", X86_FEATURE_LFENCE_RDTSC)
> > +
> >  #ifdef CONFIG_X86_PPRO_FENCE
> >  #define dma_rmb()rmb()
> >  #else
> > diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
> > index 07962f5f6fba..e426d2a33ff3 100644
> > --- a/arch/x86/include/asm/msr.h
> > +++ b/arch/x86/include/asm/msr.h
> > @@ -214,8 +214,7 @@ static __always_inline unsigned long long 
> > rdtsc_ordered(void)
> >* that some other imaginary CPU is updating continuously with a
> >* time stamp.
> >*/
> > - alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
> > -   "lfence", X86_FEATURE_LFENCE_RDTSC);
> > + ifence();
> >   return rdtsc();
> >  }
> >
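
For readers unfamiliar with the masking alternative that the changelog
contrasts with 'ifence', here is a rough userspace sketch of the idea: clamp
the index with a branch-free mask so that even a mispredicted bounds check
cannot read past the array. index_mask() and table_read() are hypothetical
names for this illustration, not the helpers from the series, and the sketch
assumes size <= LONG_MAX and an arithmetic right shift of negative values (as
gcc and clang provide).

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * Returns ~0UL when index < size and 0 otherwise, without a branch.
 * If index >= size, either index or (size - 1 - index) has its top bit
 * set, so the inverted value is non-negative and shifts down to 0.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
}

/* Bounds-checked read that also clamps the index used under speculation. */
static int table_read(const int *table, unsigned long size,
		      unsigned long index)
{
	if (index >= size)
		return -1;
	/*
	 * A real implementation must keep the compiler from optimizing
	 * this mask away; this sketch does not attempt that.
	 */
	index &= index_mask(index, size);
	return table[index];
}
```

The fence approach instead stops speculation at the check itself, which is
simpler to reason about but costs more on each call.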


Re: [PATCH 05/12] arm64: dts: mt7622: add PMIC MT6380 related nodes

2018-01-11 Thread Sean Wang
Hi, Philippe

Currently, I'm really confused about which style of SPDX license
identifier I should use for each type of file.

Could you point me to where I can find the document describing the SPDX
usage style the community expects for those files in the future?

I found more than one style of SPDX identifier in the current code, for
example as below. If there's no absolute definition for them, then
which way is better?

1)

for *.dts, applied with "// " at head or within " /* */ " not at head

such as 

arch/arm/boot/dts/bcm953012hr.dts:2: *  SPDX-License-Identifier: BSD-3-Clause
arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts:4: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/uniphier-ld6b-ref.dts:7: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/owl-s500-guitar-bb-rev-b.dts:4: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/keystone-k2g-ice.dts:6: * SPDX-License-Identifier: GPL-2.0
arch/arm/boot/dts/uniphier-pro4-sanji.dts:7: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/owl-s500-cubieboard6.dts:6: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/uniphier-pro4-ace.dts:7: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/uniphier-sld8-ref.dts:7: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/uniphier-pro4-ref.dts:7: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/uniphier-pxs2-gentil.dts:7: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/uniphier-pxs2-vodka.dts:7: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
arch/arm/boot/dts/uniphier-ld4-ref.dts:7: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)

arch/arm64/boot/dts/nvidia/tegra132-norrin.dts:1:// SPDX-License-Identifier: GPL-2.0
arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts:1:// SPDX-License-Identifier: GPL-2.0
arch/arm64/boot/dts/nvidia/tegra210-smaug.dts:1:// SPDX-License-Identifier: GPL-2.0
arch/arm64/boot/dts/arm/vexpress-v2f-1xv7-ca53x2.dts:1:// SPDX-License-Identifier: GPL-2.0
arch/arm64/boot/dts/arm/rtsm_ve-aemv8a.dts:1:// SPDX-License-Identifier: GPL-2.0
arch/arm64/boot/dts/arm/foundation-v8.dts:1:// SPDX-License-Identifier: GPL-2.0
arch/arm64/boot/dts/arm/foundation-v8-gicv3.dts:1:// SPDX-License-Identifier: GPL-2.0
arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts:1:// SPDX-License-Identifier: GPL-2.0

2)

for *.c, applied with "// " at head or within " /* */ " not at head

such as

drivers/base/memory.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/base/devtmpfs.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/base/node.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/base/dma-coherent.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/cpuidle/cpuidle-pseries.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/cpuidle/cpuidle-powernv.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/mtd/maps/tsunami_flash.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/mtd/maps/physmap_of_gemini.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/mtd/tests/mtd_test.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/mtd/onenand/onenand_bbt.c:1:// SPDX-License-Identifier: GPL-2.0
drivers/media/common/b2c2/flexcop-i2c.c:1:// SPDX-License-Identifier: GPL-2.0

drivers/soc/xilinx/zynqmp/pm.c:10: * SPDX-License-Identifier:   GPL-2.0+
drivers/soc/amlogic/meson-gx-pwrc-vpu.c:5: * SPDX-License-Identifier: GPL-2.0+
drivers/soc/amlogic/meson-gx-socinfo.c:5: * SPDX-License-Identifier: GPL-2.0+
drivers/soc/amlogic/meson-mx-socinfo.c:4: * SPDX-License-Identifier: GPL-2.0+
drivers/i2c/busses/i2c-sprd.c:4: * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
drivers/spi/spi-meson-spicc.c:7: * SPDX-License-Identifier: GPL-2.0+
drivers/spi/spi-sprd-adi.c:4: * SPDX-License-Identifier: GPL-2.0
drivers/dma/sprd-dma.c:4: * SPDX-License-Identifier: GPL-2.0


3)

for *.h, applied with "// " at head or within " /* */ " at head

such as 

drivers/usb/dwc3/gadget.h:1:// SPDX-License-Identifier: GPL-2.0
drivers/usb/dwc3/io.h:1:// SPDX-License-Identifier: GPL-2.0
drivers/usb/dwc3/debug.h:1:// SPDX-License-Identifier: GPL-2.0
drivers/usb/dwc3/core.h:1:// SPDX-License-Identifier: GPL-2.0
drivers/usb/atm/usbatm.h:1:// SPDX-License-Identifier: GPL-2.0+
drivers/usb/misc/rio500_usb.h:1:// SPDX-License-Identifier: GPL-2.0+
drivers/usb/misc/sisusbvga/sisusb.h:1:// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
drivers/usb/misc/sisusbvga/sisusb_init.h:1:// SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause)
drivers/usb/misc/sisusbvga/sisusb_struct.h:1:// SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause)
drivers/usb/misc/usb_u132.h:1:// SPDX-License-Identifier: GPL-2.0

drivers/tty/serial/dz.h:1:/* SPDX-License-Identifier: GPL-2.0 */
drivers/tty/serial/apbuart.h:1:/* SPDX-License-Identifier: GPL-2.0 */
drivers/tty/serial/sunzilog.h:1:/* SPDX-License-Identifier: GPL-2.0 */
drivers/tty/serial/zs.h:1:/* SPDX-License-Identifier: GPL-2.0 */
drivers/tty/serial/sh-sci.h:1:/* SPDX-License-Identifier: GPL-2.0 */
drivers/tty/serial/cpm_uart/cpm_uart_cpm1.h:1:/*
SPDX-License-I

Re: [PATCH net-next 00/11] add some new features and fix some bugs

2018-01-11 Thread lipeng (Y)



On 2018/1/12 1:07, David Miller wrote:

From: Peng Li 
Date: Thu, 11 Jan 2018 19:45:55 +0800


This patchset adds some new features and fixes some bugs:
[patch 1/11] adds ethtool_ops.get_channels support for VF.
[patch 2/11] removes TSO config command from VF driver.
[patch 3/11] adds ethtool_ops.get_coalesce support to PF.
[patch 4/11] adds ethtool_ops.set_coalesce support to PF.
[patch 5/11 - 11/11] do some code improvements and fix some bugs.

Can you please write a real commit message in your header postings
please?

Don't just copy the subject lines from the patches, and add one
sentence with a brief description.

Really write real paragraphs describing what the patch series
is doing, how it is doing it, and why it is doing it that
way.

A real explanation that tells the reader what exactly to
expect when they review the patches themselves.

Thanks for your advice.
A detailed explanation is better for review; I will write
the "real explanation" in the V2 patch set.

Peng Li

Thank you.

Re: [PATCH v1 1/8] x86/entry/clearregs: Remove partial stack frame in fast system call

2018-01-11 Thread Josh Poimboeuf
On Tue, Jan 09, 2018 at 05:03:21PM -0800, Andi Kleen wrote:
> From: Andi Kleen 
> 
> Remove the partial stack frame in the 64bit syscall fast path.
> In the next patch we want to clear the extra registers, which requires
> to always save all registers. So remove the partial stack frame
> in the syscall fast path and always save everything.
> 
> This actually simplifies the code because the ptregs stubs
> are not needed anymore.
> 
> arch/x86/entry/entry_64.S   | 57 
> -
> arch/x86/entry/syscall_64.c |  2 +-

This diffstat doesn't need to be in the changelog.

-- 
Josh


Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Steven Rostedt
On Fri, 12 Jan 2018 11:56:12 +0900
Sergey Senozhatsky  wrote:

> Hi,
> 
> On (01/11/18 11:29), Steven Rostedt wrote:
> [..]
> > > - if the patch's goal is to bound (not necessarily to watchdog's 
> > > threshold)
> > > the amount of time we spend in console_unlock(), then the patch is kinda
> > > overcomplicated. but no further questions in this case.  
> > 
> > Its goal is to keep printk from running amok on a single CPU like it
> > currently does. This prevents one printk from never ending. And it is
> > far from complex. It doesn't deal with "offloading". The "handover" is
> > only done to those that are doing printks. What do you do if all CPUs
> > are in "critical sections", how would a "handoff to safe" work? Will
> > the printks never get out? If the machine were to triple fault and
> > reboot, we lost all of it.  
> 
> make printk_kthread to be just one of the things that compete for
> handed off console_sem, along with other CPUs.

Are you going to make printk thread a high priority task?

> 
> > > - but if the patch's goal is to bound (to lockup threshold) the amount of
> > > time spent in console_unlock() in order to avoid lockups [uh, a reason],
> > > then the patch is rather oversimplified.  
> > 
> > It's bound to print all the information that has been added to the
> > printk buffer. You want to bound it to some "time"  
> 
> not some... it's aligned with watchdog expectations.
> which is deterministic, isn't it?

When do you start the timer? What you are trying to solve isn't a
single printk that gets stuck. Just look at Tejun's module. To trigger
what he wanted, he had to do 10,000 printks from an interrupt context.

> 
> > My method, there's really no delay between a hand off. There's always
> > an active CPU doing printing. It matches the current method which works
> > well for getting information out. A delayed approach will break  
> 
> no, not necessarily. and my previous patch set had some bits of that
> "combined offloading and hand off" behaviour. I was thinking about
> extending it further, but decided not to. - printk_kthread would spin
> on console_owner until current console_sem hand off.

Is printk_thread always running, taking up CPU cycles?

> 
> > > claiming that for any given A, B, C the following is always true
> > > 
> > >   A * B < C
> > > 
> > > where
> > >   A is the amount of data to print in the worst case
> > >   B the time call_console_drivers() needs to print a single
> > > char to all registered and enabled consoles
> > >   C the watchdog's threshold
> > > 
> > > is not really a step forward.  
> > 
> > It's no different than what we have, except that we currently have A
> > being infinite. My patch makes A no longer infinite, but a constant.  
> 
> my point is - the constant can be unrealistically high. and can
> easily overlap watchdog_threshold, returning printk back to unbound
> land. IOW, if your bound is above the watchdog threshold then you
> don't have any bounds.

That makes no sense.

> 
> by example, with console=ttyS1,57600n8
> - keep increasing the watchdog_threshold until watchdog stops
>   complaining?
> or
> - keep reducing the logbuf size until it can be flushed under
>   watchdog_threshold seconds?

After playing with the module in my last email, I think you're trying to
solve multiple printks, not one that is stuck. I'm solving the problem
of the one that is stuck, which was easily triggered by a simple
(non-stress-test) module.

> 
> 
> and I demonstrated how exactly we end up having a full logbuf of pending
> messages even on systems with faster consoles.

Where did you demonstrate that? There are so many emails I can't keep up.

But still, take a look at my simple module. I locked up the system
immediately with something that shouldn't have locked up the system.
And my patch fixed it. I think that speaks louder than any of our
opinions.

> 
> 
> [..]
> > Great, and there's cases that die that my patch solves. Lets add my
> > patch now since it is orthogonal to an offloading approach and see how
> > it works, because it would solve issues that I have hit. If you can
> > show that this isn't good enough we can add another approach.  
> 
> it bounds printk. yes, good! that's what I want. but it bounds it to a
> wrong value. I want more deterministic and close to reality bound.
> and I also want to get rid of "the last console_sem owner prints it all"
> thing. I demonstrated with the traces how that thing can bite.

I have not seen any realistic traces, but perhaps I missed something. It
all requires lots of printks, in weird scenarios. I demonstrated that
the system can be locked up with few printks (one per cpu per
millisecond), and my patch solves it.

> 
> 
> > Honestly, I don't see why you are against this patch.  
> 
> prove it! show me exactly when and where I said that I NACK or
> block the patch? seriously.

Why are we having this discussion then? Just give your Ack to my patch,
and we can look to see if we need to imp

Re: [PATCH 3/5] x86/ibrs: Add direct access support for MSR_IA32_SPEC_CTRL

2018-01-11 Thread Raj, Ashok
On Thu, Jan 11, 2018 at 05:58:11PM -0800, Dave Hansen wrote:
> On 01/11/2018 05:32 PM, Ashok Raj wrote:
> > +static void save_guest_spec_ctrl(struct vcpu_vmx *vmx)
> > +{
> > +   if (boot_cpu_has(X86_FEATURE_SPEC_CTRL)) {
> > +   vmx->spec_ctrl = spec_ctrl_get();
> > +   spec_ctrl_restriction_on();
> > +   } else
> > +   rmb();
> > +}
> 
> Does this need to be "ifence()"?  Better yet, do we just need to create
> a helper for boot_cpu_has(X86_FEATURE_SPEC_CTRL) that does the barrier?

Yes... Didn't keep track of ifence() evolution :-)..

We could do a helper; I will look into other uses and see if we can find a
common way to handle usages like the one above.

Cheers,
Ashok


Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Sergey Senozhatsky
On (01/11/18 20:30), Steven Rostedt wrote:
[..]
> Today, printk() can print for a time of A * B, where, as you state
> above:
> 
>A is the amount of data to print in the worst case
>B the time call_console_drivers() needs to print a single
> char to all registered and enabled consoles
> 
> In the worse case, the current approach is A is infinite. That is,
> printk() never stops, as long as there is a printk happening on another
> CPU before B can finish. A will keep growing. The call to printk() will
> never return. The more CPUs you have, the more likely this will occur.
> All it takes is a few CPUs doing periodic printks. If there is a slow
> console, where the periodic printk on other CPUs occur quicker than the
> first can finish, the first one will be stuck forever. Doesn't take
> much to have this happen.

console_sem owner can get stuck in console_unlock() not because of printk-s
happening right now on other CPUs, but because those printk-s could have
happened while console_sem owner was preempted. when it comes back it has
a ton of pending messages.

I said it before - "we are stuck in console_unlock() because other CPUs
printk a lot right now" is not always true. we have preemption. and
the "last console_sem owner prints it all" is not good in this case.

> With my patch, A is fixed to the size of the buffer. A single printk()
> can never print more than that. If another CPU comes in and does a
> printk, then it will take over the task of printing, and release the
> first printk.

yes. and "another CPU" that comes to take over has to print all the
pending messages. from whatever context it's currently in. and bringing
A * B below C can be quite tricky, if possible at all (!). most likely
people will just add more touch_nmi_watchdog().

again, I don't disagree on "let's bound printk". yes, we totally
should! but the bound must be realistic if we want to fix the damn
thing (either with printk_kthread, or hand off, or anything else).

-ss


Re: linux-next: build failure after merge of the net-next tree

2018-01-11 Thread David Miller
From: Alexei Starovoitov 
Date: Wed, 10 Jan 2018 17:58:54 -0800

> On Thu, Jan 11, 2018 at 11:53:55AM +1100, Stephen Rothwell wrote:
>> Hi all,
>> 
>> After merging the net-next tree, today's linux-next build (x86_64
>> allmodconfig) failed like this:
>> 
>> kernel/bpf/verifier.o: In function `bpf_check':
>> verifier.c:(.text+0xd86e): undefined reference to `bpf_patch_call_args'
>> 
>> Caused by commit
>> 
>>   1ea47e01ad6e ("bpf: add support for bpf_call to interpreter")
>> 
>> interacting with commit
>> 
>>   290af86629b2 ("bpf: introduce BPF_JIT_ALWAYS_ON config")
>> 
>> from the bpf and net trees.
>> 
>> I have just reverted commit 290af86629b2 for today.  A better solution
>> would be nice (like fixing this in a merge between the net-next and net
>> trees).
> 
> that's due to 'endif' from 290af86629b2 needs to be moved above
> bpf_patch_call_args() definition.

That doesn't fix it, because then you'd need to expose
interpreters_args as well and obviously that can't be right.

Instead, we should never call bpf_patch_call_args() when JIT always on
is enabled.  So if we fail to JIT the subprogs we should fail
immediately.

This is the net --> net-next merge resolution I am about to use to fix
this:

...
 +static int fixup_call_args(struct bpf_verifier_env *env)
 +{
 +  struct bpf_prog *prog = env->prog;
 +  struct bpf_insn *insn = prog->insnsi;
-   int i, depth;
++  int i, depth, err;
 +
-   if (env->prog->jit_requested)
-   if (jit_subprogs(env) == 0)
++  err = 0;
++  if (env->prog->jit_requested) {
++  err = jit_subprogs(env);
++  if (err == 0)
 +  return 0;
- 
++  }
++#ifndef CONFIG_BPF_JIT_ALWAYS_ON
 +  for (i = 0; i < prog->len; i++, insn++) {
 +  if (insn->code != (BPF_JMP | BPF_CALL) ||
 +  insn->src_reg != BPF_PSEUDO_CALL)
 +  continue;
 +  depth = get_callee_stack_depth(env, insn, i);
 +  if (depth < 0)
 +  return depth;
 +  bpf_patch_call_args(insn, depth);
 +  }
-   return 0;
++  err = 0;
++#endif
++  return err;
 +}
 +
  /* fixup insn->imm field of bpf_call instructions
   * and inline eligible helpers as explicit sequence of BPF instructions
   *
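
The combined diff above is hard to read at a glance, so here is a minimal
userspace sketch of the control flow it resolves to. It is not the kernel
code: struct model_env, model_jit_subprogs() and the jit_always_on argument
are hypothetical stand-ins (the kernel uses #ifndef CONFIG_BPF_JIT_ALWAYS_ON
rather than a runtime flag), chosen only so both configurations can be
exercised in one program.

```c
/* Hypothetical stand-in for the verifier state; not the kernel's struct. */
struct model_env {
	int jit_requested;	/* mirrors env->prog->jit_requested */
	int jit_result;		/* what jit_subprogs() would return: 0 or -errno */
	int calls_patched;	/* set when the interpreter fallback ran */
};

/* Stand-in for jit_subprogs(): just reports the preset result. */
static int model_jit_subprogs(const struct model_env *env)
{
	return env->jit_result;
}

/*
 * Model of the merged fixup_call_args(). jit_always_on plays the role of
 * CONFIG_BPF_JIT_ALWAYS_ON as a runtime flag (an assumption of this sketch).
 */
static int model_fixup_call_args(struct model_env *env, int jit_always_on)
{
	int err = 0;

	if (env->jit_requested) {
		err = model_jit_subprogs(env);
		if (err == 0)
			return 0;
	}
	if (!jit_always_on) {
		/* Stands in for the loop that calls bpf_patch_call_args(). */
		env->calls_patched = 1;
		err = 0;
	}
	return err;
}
```

With the flag set, a JIT failure is returned to the caller and the
interpreter fallback is never reached; with it clear, the fallback runs and
the function returns 0, as before the merge.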


Re: [PATCH 1/5] x86/ibrs: Introduce native_rdmsrl, and native_wrmsrl

2018-01-11 Thread Raj, Ashok
On Thu, Jan 11, 2018 at 06:20:13PM -0800, Andy Lutomirski wrote:
> On Thu, Jan 11, 2018 at 5:52 PM, Raj, Ashok  wrote:
> >>
> >> What's wrong with native_read_msr()?
> >
> > Yes, i think i should have added to msr.h. The names didn't read as a
> > pair, one was native_read_msr, wrmsrl could be taken over when paravirt is
> > defined?
> 
> Why do you need to override paravirt?

The idea was that since these MSRs are passed through, we shouldn't need to
handle them any differently. Also it's best to do this as soon as possible
and avoid longer paths to get this barrier to hardware.



Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Sergey Senozhatsky
Hi,

On (01/11/18 11:29), Steven Rostedt wrote:
[..]
> > - if the patch's goal is to bound (not necessarily to watchdog's threshold)
> > the amount of time we spend in console_unlock(), then the patch is kinda
> > overcomplicated. but no further questions in this case.
> 
> Its goal is to keep printk from running amok on a single CPU like it
> currently does. This prevents one printk from never ending. And it is
> far from complex. It doesn't deal with "offloading". The "handover" is
> only done to those that are doing printks. What do you do if all CPUs
> are in "critical sections", how would a "handoff to safe" work? Will
> the printks never get out? If the machine were to triple fault and
> reboot, we lost all of it.

make printk_kthread to be just one of the things that compete for
handed off console_sem, along with other CPUs.

> > - but if the patch's goal is to bound (to lockup threshold) the amount of
> > time spent in console_unlock() in order to avoid lockups [uh, a reason],
> > then the patch is rather oversimplified.
> 
> It's bound to print all the information that has been added to the
> printk buffer. You want to bound it to some "time"

not some... it's aligned with watchdog expectations.
which is deterministic, isn't it?

> My method, there's really no delay between a hand off. There's always
> an active CPU doing printing. It matches the current method which works
> well for getting information out. A delayed approach will break

no, not necessarily. and my previous patch set had some bits of that
"combined offloading and hand off" behaviour. I was thinking about
extending it further, but decided not to. - printk_kthread would spin
on console_owner until current console_sem hand off.

> > claiming that for any given A, B, C the following is always true
> > 
> > A * B < C
> > 
> > where
> > A is the amount of data to print in the worst case
> > B the time call_console_drivers() needs to print a single
> >   char to all registered and enabled consoles
> > C the watchdog's threshold
> > 
> > is not really a step forward.
> 
> It's no different than what we have, except that we currently have A
> being infinite. My patch makes A no longer infinite, but a constant.

my point is - the constant can be unrealistically high. and can
easily overlap watchdog_threshold, returning printk back to unbound
land. IOW, if your bound is above the watchdog threshold then you
don't have any bounds.

by example, with console=ttyS1,57600n8
- keep increasing the watchdog_threshold until watchdog stops
  complaining?
or
- keep reducing the logbuf size until it can be flushed under
  watchdog_threshold seconds?
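
To put rough numbers on the A * B < C argument for the serial console in
the example above, here is a minimal sketch. It assumes an 8N1 line (10 bits
on the wire per character, so 57600 baud is 5760 characters per second). The
function names are made up for illustration, and the buffer size and
threshold are whatever the caller passes in; none of these come from the
thread.

```c
/* Characters per second on an 8N1 line: 1 start + 8 data + 1 stop bit. */
static double serial_chars_per_sec(unsigned int baud)
{
	return baud / 10.0;
}

/* Seconds needed to push pending_bytes through the console (A * B). */
static double flush_seconds(unsigned long pending_bytes, unsigned int baud)
{
	return pending_bytes / serial_chars_per_sec(baud);
}

/* Does A * B < C hold for the given threshold C, in seconds? */
static int bound_holds(unsigned long pending_bytes, unsigned int baud,
		       double threshold_sec)
{
	return flush_seconds(pending_bytes, baud) < threshold_sec;
}
```

For instance, flush_seconds(131072, 57600) comes out just under 23 seconds,
so with an assumed 128 KiB of pending messages and an assumed 10 second
threshold the bound does not hold, while 16 KiB of pending messages would
flush in under 3 seconds.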


and I demonstrated how exactly we end up having a full logbuf of pending
messages even on systems with faster consoles.


[..]
> Great, and there's cases that die that my patch solves. Lets add my
> patch now since it is orthogonal to an offloading approach and see how
> it works, because it would solve issues that I have hit. If you can
> show that this isn't good enough we can add another approach.

it bounds printk. yes, good! that's what I want. but it bounds it to a
wrong value. I want more deterministic and close to reality bound.
and I also want to get rid of "the last console_sem owner prints it all"
thing. I demonstrated with the traces how that thing can bite.


> Honestly, I don't see why you are against this patch.

prove it! show me exactly when and where I said that I NACK or
block the patch? seriously.


> It doesn't stop your work.

and I never said it would. your patch changes nothing on my side, that's
my message. as of now I have out-of-tree patches, well I'll keep using
them. nothing new.


> If this patch isn't enough

BINGO! this is all I'm trying to say.
and the only reply (if there is any at all!) I'm getting is
"GTFO!!! your problems are unrealistic! we gonna release the
patch and wait for someone to come along and say us something
new about printk issues. but not you!".


> (but it does fix some issues)

obviously there are cases which your patch addresses. have I ever
denied that? but, once again, obviously, there are cases which it
doesn't. and those cases tend to bite my setups. I have repeated
it many times, and have explained in great details which parts I'm
talking about.

and I have never run unrealistic test_printk.ko against your patch
or anything alike; why the heck would I do that.


> Really, it sounds like you are afraid of this patch, that it might
> be good enough for most cases which would make adding another approach
> even more difficult.

LOL! wish I knew how to capture screenshots on Linux!

-ss


Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Steven Rostedt
On Thu, 11 Jan 2018 20:30:57 -0500
Steven Rostedt  wrote:

> I have to say that your analysis here really does point out the benefit
> of my patch.
> 
> Today, printk() can print for a time of A * B, where, as you state
> above:
> 
>A is the amount of data to print in the worst case
>B the time call_console_drivers() needs to print a single
> char to all registered and enabled consoles
> 
> In the worse case, the current approach is A is infinite. That is,
> printk() never stops, as long as there is a printk happening on another
> CPU before B can finish. A will keep growing. The call to printk() will
> never return. The more CPUs you have, the more likely this will occur.
> All it takes is a few CPUs doing periodic printks. If there is a slow
> console, where the periodic printk on other CPUs occur quicker than the
> first can finish, the first one will be stuck forever. Doesn't take
> much to have this happen.
> 
> With my patch, A is fixed to the size of the buffer. A single printk()
> can never print more than that. If another CPU comes in and does a
> printk, then it will take over the task of printing, and release the
> first printk.

In fact, below is a module I made (starting with Tejun's crazy stress
test, then removing all the craziness). This simple module locks up the
system without my patch. After applying my patch, the system runs fine.

All I did was start off a work queue on each CPU, and each CPU does one
printk() followed by a millisecond sleep. No 10,000 printks, nothing
in an interrupt handler. Preemption is disabled while the printk
happens, but that's normal.

This is much closer to an OOM happening all over the system, where OOM
stack dumps are occurring on different CPUs.

I ran this on a box with 4 CPUs and a serial console (so it has a slow
console). Again, all I have is each CPU doing exactly ONE printk()!
then sleeping for a full millisecond! It will cause a lot of output,
and perhaps slow the system down. But it should not lock up the system.
But without my patch, it does!

Try it!

Test it on a box, and it will lock up. Then add my patch and see what
the results are. I think this speaks very loudly in favor of applying
my patch.

Again, the below module locks up my system immediately without my
patch. With my patch, no problem. In fact, it's still running, while I
wrote this email, and it hardly shows a slow down in the system.


-- Steve

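/* Header names were lost in the archive; these are the ones the code needs. */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/delay.h>
#include <linux/percpu.h>
#include <linux/preempt.h>
#include <linux/workqueue.h>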

static bool stop_testing;

static void preempt_printk_workfn(struct work_struct *work)
{
	while (!READ_ONCE(stop_testing)) {
		preempt_disable();
		printk("%5d%-75s\n", smp_processor_id(), " XXX PREEMPT");
		preempt_enable();
		msleep(1);
	}
}

static struct work_struct __percpu *works;

static void finish(void)
{
	int cpu;

	WRITE_ONCE(stop_testing, true);
	for_each_online_cpu(cpu)
		flush_work(per_cpu_ptr(works, cpu));
	free_percpu(works);
}

static int __init test_init(void)
{
	int cpu;

	works = alloc_percpu(struct work_struct);
	if (!works)
		return -ENOMEM;

	/*
	 * This is just a test module. This will break if you
	 * do any CPU hot plugging between loading and
	 * unloading the module.
	 */

	for_each_online_cpu(cpu) {
		struct work_struct *work = per_cpu_ptr(works, cpu);

		INIT_WORK(work, &preempt_printk_workfn);
		schedule_work_on(cpu, work);
	}

	return 0;
}

static void __exit test_exit(void)
{
	finish();
}

module_init(test_init);
module_exit(test_exit);
MODULE_LICENSE("GPL");



[PATCH 1/2] genirq/affinity: assign vectors to all possible CPUs

2018-01-11 Thread Ming Lei
From: Christoph Hellwig 

Currently we assign managed interrupt vectors to all present CPUs.  This
works fine for systems where we only online/offline CPUs.  But in case of
systems that support physical CPU hotplug (or the virtualized version of
it) this means the additional CPUs covered for in the ACPI tables or on
the command line are not catered for.  To fix this we'd either need to
introduce new hotplug CPU states just for this case, or we can start
assigning vectors to possible but not present CPUs.

Reported-by: Christian Borntraeger 
Tested-by: Christian Borntraeger 
Tested-by: Stefan Haberland 
Cc: linux-kernel@vger.kernel.org
Cc: Thomas Gleixner 
Signed-off-by: Christoph Hellwig 
---
 kernel/irq/affinity.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index e12d35108225..a37a3b4b6342 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -39,7 +39,7 @@ static void irq_spread_init_one(struct cpumask *irqmsk, 
struct cpumask *nmsk,
}
 }
 
-static cpumask_var_t *alloc_node_to_present_cpumask(void)
+static cpumask_var_t *alloc_node_to_possible_cpumask(void)
 {
cpumask_var_t *masks;
int node;
@@ -62,7 +62,7 @@ static cpumask_var_t *alloc_node_to_present_cpumask(void)
return NULL;
 }
 
-static void free_node_to_present_cpumask(cpumask_var_t *masks)
+static void free_node_to_possible_cpumask(cpumask_var_t *masks)
 {
int node;
 
@@ -71,22 +71,22 @@ static void free_node_to_present_cpumask(cpumask_var_t 
*masks)
kfree(masks);
 }
 
-static void build_node_to_present_cpumask(cpumask_var_t *masks)
+static void build_node_to_possible_cpumask(cpumask_var_t *masks)
 {
int cpu;
 
-   for_each_present_cpu(cpu)
+   for_each_possible_cpu(cpu)
cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
 }
 
-static int get_nodes_in_cpumask(cpumask_var_t *node_to_present_cpumask,
+static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
const struct cpumask *mask, nodemask_t *nodemsk)
 {
int n, nodes = 0;
 
/* Calculate the number of nodes in the supplied affinity mask */
for_each_node(n) {
-   if (cpumask_intersects(mask, node_to_present_cpumask[n])) {
+   if (cpumask_intersects(mask, node_to_possible_cpumask[n])) {
node_set(n, *nodemsk);
nodes++;
}
@@ -109,7 +109,7 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
int last_affv = affv + affd->pre_vectors;
nodemask_t nodemsk = NODE_MASK_NONE;
struct cpumask *masks;
-   cpumask_var_t nmsk, *node_to_present_cpumask;
+   cpumask_var_t nmsk, *node_to_possible_cpumask;
 
/*
 * If there aren't any vectors left after applying the pre/post
@@ -125,8 +125,8 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
if (!masks)
goto out;
 
-   node_to_present_cpumask = alloc_node_to_present_cpumask();
-   if (!node_to_present_cpumask)
+   node_to_possible_cpumask = alloc_node_to_possible_cpumask();
+   if (!node_to_possible_cpumask)
goto out;
 
/* Fill out vectors at the beginning that don't need affinity */
@@ -135,8 +135,8 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
 
/* Stabilize the cpumasks */
get_online_cpus();
-   build_node_to_present_cpumask(node_to_present_cpumask);
-   nodes = get_nodes_in_cpumask(node_to_present_cpumask, cpu_present_mask,
+   build_node_to_possible_cpumask(node_to_possible_cpumask);
+   nodes = get_nodes_in_cpumask(node_to_possible_cpumask, 
cpu_possible_mask,
 &nodemsk);
 
/*
@@ -146,7 +146,7 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
if (affv <= nodes) {
for_each_node_mask(n, nodemsk) {
cpumask_copy(masks + curvec,
-node_to_present_cpumask[n]);
+node_to_possible_cpumask[n]);
if (++curvec == last_affv)
break;
}
@@ -160,7 +160,7 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes;
 
/* Get the cpus on this node which are in the mask */
-   cpumask_and(nmsk, cpu_present_mask, node_to_present_cpumask[n]);
+   cpumask_and(nmsk, cpu_possible_mask, 
node_to_possible_cpumask[n]);
 
/* Calculate the number of cpus per vector */
ncpus = cpumask_weight(nmsk);
@@ -192,7 +192,7 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
/* Fill out vectors at the end 

[PATCH 2/2] blk-mq: simplify queue mapping & schedule with each possisble CPU

2018-01-11 Thread Ming Lei
From: Christoph Hellwig 

The previous patch assigns interrupt vectors to all possible CPUs, so
hctx can now be mapped to possible CPUs. This patch uses that fact to
simplify queue mapping & scheduling so that we no longer need to handle
CPU hotplug when a physical CPU is plugged or unplugged. With this
simplification, physical CPU plug & unplug works well, which is a normal
use case for VMs at least.

Make sure we allocate blk_mq_ctx structures for all possible CPUs, and
set hctx->numa_node for possible CPUs which are mapped to this hctx. Only
online CPUs are chosen for scheduling.

Reported-by: Christian Borntraeger 
Tested-by: Christian Borntraeger 
Tested-by: Stefan Haberland 
Cc: Thomas Gleixner 
Signed-off-by: Christoph Hellwig 
(merged the three into one because any single one may not work, and fix
selecting online CPUs for scheduler)
Signed-off-by: Ming Lei 
---
 block/blk-mq.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8000ba6db07d..ef9beca2d117 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -440,7 +440,7 @@ struct request *blk_mq_alloc_request_hctx(struct 
request_queue *q,
blk_queue_exit(q);
return ERR_PTR(-EXDEV);
}
-   cpu = cpumask_first(alloc_data.hctx->cpumask);
+   cpu = cpumask_first_and(alloc_data.hctx->cpumask, cpu_online_mask);
alloc_data.ctx = __blk_mq_get_ctx(q, cpu);
 
rq = blk_mq_get_request(q, NULL, op, &alloc_data);
@@ -1323,9 +1323,10 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx 
*hctx)
if (--hctx->next_cpu_batch <= 0) {
int next_cpu;
 
-   next_cpu = cpumask_next(hctx->next_cpu, hctx->cpumask);
+   next_cpu = cpumask_next_and(hctx->next_cpu, hctx->cpumask,
+   cpu_online_mask);
if (next_cpu >= nr_cpu_ids)
-   next_cpu = cpumask_first(hctx->cpumask);
+   next_cpu = 
cpumask_first_and(hctx->cpumask,cpu_online_mask);
 
hctx->next_cpu = next_cpu;
hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
@@ -2219,16 +2220,11 @@ static void blk_mq_init_cpu_queues(struct request_queue 
*q,
INIT_LIST_HEAD(&__ctx->rq_list);
__ctx->queue = q;
 
-   /* If the cpu isn't present, the cpu is mapped to first hctx */
-   if (!cpu_present(i))
-   continue;
-
-   hctx = blk_mq_map_queue(q, i);
-
/*
 * Set local node, IFF we have more than one hw queue. If
 * not, we remain on the home node of the device
 */
+   hctx = blk_mq_map_queue(q, i);
if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE)
hctx->numa_node = local_memory_node(cpu_to_node(i));
}
@@ -2285,7 +2281,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 *
 * If the cpu isn't present, the cpu is mapped to first hctx.
 */
-   for_each_present_cpu(i) {
+   for_each_possible_cpu(i) {
hctx_idx = q->mq_map[i];
/* unmapped hw queue can be remapped after CPU topo changed */
if (!set->tags[hctx_idx] &&
@@ -2339,7 +2335,8 @@ static void blk_mq_map_swqueue(struct request_queue *q)
/*
 * Initialize batch roundrobin counts
 */
-   hctx->next_cpu = cpumask_first(hctx->cpumask);
+   hctx->next_cpu = cpumask_first_and(hctx->cpumask,
+   cpu_online_mask);
hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
}
 }
-- 
2.9.5
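
The round-robin change in blk_mq_hctx_next_cpu() above can be illustrated
with a minimal userspace sketch, assuming CPU masks fit in one 64-bit word.
mask_first_and(), mask_next_and() and next_online_cpu() are hypothetical
helpers that only mimic how the patch uses cpumask_first_and() and
cpumask_next_and(); they are not the kernel implementations.

```c
#include <stdint.h>

#define NR_CPU_IDS 64	/* assumption of this sketch: one 64-bit word of CPUs */

/* Next CPU after n that is set in both masks, or NR_CPU_IDS if none. */
static int mask_next_and(int n, uint64_t a, uint64_t b)
{
	uint64_t both = a & b;
	int cpu;

	for (cpu = n + 1; cpu < NR_CPU_IDS; cpu++)
		if ((both >> cpu) & 1)
			return cpu;
	return NR_CPU_IDS;
}

/* First CPU set in both masks, or NR_CPU_IDS if none. */
static int mask_first_and(uint64_t a, uint64_t b)
{
	return mask_next_and(-1, a, b);
}

/*
 * Pick the next CPU to run the hw queue on: only CPUs that are mapped to
 * the hctx AND currently online are eligible; wrap around at the end.
 */
static int next_online_cpu(int cur, uint64_t hctx_mask, uint64_t online_mask)
{
	int next = mask_next_and(cur, hctx_mask, online_mask);

	if (next >= NR_CPU_IDS)
		next = mask_first_and(hctx_mask, online_mask);
	return next;
}
```

With an hctx mapped to CPUs 0-3 (mask 0xF) and only CPUs 1 and 3 online
(mask 0xA), the selection alternates between 1 and 3 and never lands on a
possible-but-offline CPU.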



[PATCH 0/2] blk-mq: support physical CPU hotplug

2018-01-11 Thread Ming Lei
Hi,

These two patches support physical CPU hotplug, so that blk-mq scales
well when a physical CPU is added or removed; this use case is normal
in the VM world.

Also this patchset fixes the following warning reported by Christian
Borntraeger:

https://marc.info/?l=linux-block&m=151092973417143&w=2

Christoph Hellwig (2):
  genirq/affinity: assign vectors to all possible CPUs
  blk-mq: simplify queue mapping & schedule with each possisble CPU

 block/blk-mq.c| 19 ---
 kernel/irq/affinity.c | 30 +++---
 2 files changed, 23 insertions(+), 26 deletions(-)

-- 
2.9.5



Re: KASLR may break some kernel features (was Re: [PATCH v5 1/4] kaslr: add immovable_mem=nn[KMG]@ss[KMG] to specify extracting memory)

2018-01-11 Thread Chao Fan
On Fri, Jan 12, 2018 at 10:31:52AM +0800, Baoquan He wrote:
>On 01/11/18 at 10:04am, Kees Cook wrote:
>> On Thu, Jan 11, 2018 at 1:00 AM, Baoquan He  wrote:
>> > Hi Luiz,
>> >
>> > On 01/04/18 at 11:21am, Luiz Capitulino wrote:
>> >> Having a generic kaslr parameter to control where the kernel is extracted
>> >> is one solution for this problem.
>> >>
>> >> The general problem statement is that KASLR may break some kernel features
>> >> depending on where the kernel is extracted. Two examples are hot-plugged
>> >> memory (this series) and 1GB HugeTLB pages.
>> >>
>> >> The 1GB HugeTLB page issue is not specific to KVM guests. It just happens
>> >> that there's a bunch of people running guests with up to 5GB of memory and
>> >> with that amount of memory you have one or two 1GB pages and is easier for
>> >> KASLR to extract the kernel into a 1GB region and split a 1GB page. So,
>> >> you may not get any 1GB pages at all when this happens. However, I can
>> >> also reproduce this on bare-metal with lots of memory where I can lose a
>> >> 1GB page from time to time.
>> >>
>> >> Having a kaslr_range= parameter solves both issues, but two major
>> >> drawbacks are that it breaks existing setups and I guess users will have
>> >> a very hard time choosing good ranges.
>> >>
>> >> Another idea would be to have a CONFIG_KASLR_RANGES, where each arch
>> >> could have a list of ranges known to contain holes and/or immovable
>> >> memory and only extract the kernel into those ranges.
>> >
>> > If add CONFIG_KASLR_RANGES, then a distro like RHEL will have this range
>> > always, whether people need hugetlb or not.
>> >
>> > So in this case, what range do we need to avoid? Only [1G, 2G]?
>> 
>> Any ranges like that that need to be avoided should be known at build
>> time, so they should simply be added to the mem_avoid list that is
>> already present in the KASLR code...
>
>It seems KASLR doesn't have a solution which allows the user to specify an
>avoided range for the kernel text KASLR stage only. The memmap="!#$" can add
>a range to mem_avoid, but it also keeps the range from being added to e820.
>

How about adding a new option, like "huge_page=nn@ss", which fills the
regions into mem_avoid? This parameter would only be parsed during the
KASLR stage; the subsequent memmap handling would not be executed for it.
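
As a rough illustration of what "fill the regions into mem_avoid" means for
slot selection, here is a minimal userspace sketch of an overlap check.
struct mem_range, ranges_overlap() and slot_is_allowed() are hypothetical
names for this example, not the helpers in the kernel's KASLR code, and the
[2G, 3G) range used in the usage note is only an example value.

```c
#include <stdbool.h>
#include <stdint.h>

/* A half-open physical range [start, start + size). */
struct mem_range {
	uint64_t start;
	uint64_t size;
};

/* True if the two half-open ranges share at least one byte. */
static bool ranges_overlap(const struct mem_range *a, const struct mem_range *b)
{
	if (a->start + a->size <= b->start)
		return false;
	if (a->start >= b->start + b->size)
		return false;
	return true;
}

/* A candidate kernel slot is usable only if it misses every avoided range. */
static bool slot_is_allowed(const struct mem_range *slot,
			    const struct mem_range *avoid, int n_avoid)
{
	int i;

	for (i = 0; i < n_avoid; i++)
		if (ranges_overlap(slot, &avoid[i]))
			return false;
	return true;
}
```

With [2G, 3G) in the avoid list, a 64M slot starting at 2.5G is rejected,
while one that ends exactly at 2G or starts at 3G is still accepted.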

Thanks,
Chao Fan

>Here like this hugetlb case, Luiz wants kernel to avoid the [2G, 3G)
>candidate position for hugetlb allocation, meanwhile wants it to be
>added to mm subsystem later.
>
>Thanks
>Baoquan
>
>
>




Re: [PATCH v2 06/19] asm-generic/barrier: mask speculative execution flows

2018-01-11 Thread Eric W. Biederman
Dan Williams  writes:

> diff --git a/include/linux/nospec.h b/include/linux/nospec.h
> new file mode 100644
> index ..5c66fc30f919
> --- /dev/null
> +++ b/include/linux/nospec.h
> @@ -0,0 +1,71 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright(c) 2018 Intel Corporation. All rights reserved.
> +
> +#ifndef __NOSPEC_H__
> +#define __NOSPEC_H__
> +
> +#include 
> +#include 
> +
> +#ifndef array_ptr_mask
> +#define array_ptr_mask(idx, sz)  \
> +({   \
> + unsigned long mask; \
> + unsigned long _i = (idx);   \
> + unsigned long _s = (sz);\
> + \
> + mask = ~(long)(_i | (_s - 1 - _i)) >> (BITS_PER_LONG - 1);  \
> + mask;   \
> +})
> +#endif

This could really use a comment that explains it generates 0
for out of bound accesses and -1L aka 0xffffffffffffffff for
all other accesses.

The code is clever enough that which values it generates is not obvious.
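
As an illustration of what such a comment would describe, here is a small
userspace sketch of the same expression written as a function. It assumes
sz <= LONG_MAX and an arithmetic right shift of negative values, which is
implementation-defined in ISO C but is what GCC and Clang do. BITS_PER_LONG
is defined locally here, and masked_index() is a hypothetical helper that is
not part of the patch.

```c
#include <limits.h>

#define BITS_PER_LONG ((int)(sizeof(long) * CHAR_BIT))

/*
 * Returns ~0UL when idx < sz and 0 otherwise.
 *
 * If idx < sz, both idx and (sz - 1 - idx) have the top bit clear, so
 * their OR has the top bit clear; inverting sets it, and the arithmetic
 * shift smears it across the whole word (-1L).
 *
 * If idx >= sz, (sz - 1 - idx) wraps around and has the top bit set, so
 * the inverted value has the top bit clear and the shift yields 0.
 */
static unsigned long array_ptr_mask(unsigned long idx, unsigned long sz)
{
	return (unsigned long)(~(long)(idx | (sz - 1 - idx)) >>
			       (BITS_PER_LONG - 1));
}

/* Hypothetical use: force an out-of-bounds index to 0 without a branch. */
static unsigned long masked_index(unsigned long idx, unsigned long sz)
{
	return idx & array_ptr_mask(idx, sz);
}
```

For example, array_ptr_mask(3, 8) is all ones, while array_ptr_mask(8, 8)
and array_ptr_mask(-1UL, 8) are both 0.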

Eric


Re: [PATCH] arm64: dts: angler: add pstore-ramoops support

2018-01-11 Thread Jeremy McNicoll
On Thu, Dec 28, 2017 at 02:38:29AM -0500, zhuoweizh...@yahoo.com wrote:
> From: Zhuowei Zhang 
> 
> Support pstore-ramoops for retrieving kernel oops and panics after reboot.
> 
> The address and configs are taken from the downstream kernel's device tree.
> 
> Signed-off-by: Zhuowei Zhang 
> ---
>  arch/arm64/boot/dts/qcom/msm8994-angler-rev-101.dts | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/msm8994-angler-rev-101.dts b/arch/arm64/boot/dts/qcom/msm8994-angler-rev-101.dts
> index dfa08f5..9ce3a6e 100644
> --- a/arch/arm64/boot/dts/qcom/msm8994-angler-rev-101.dts
> +++ b/arch/arm64/boot/dts/qcom/msm8994-angler-rev-101.dts
> @@ -37,4 +37,19 @@
>   pinctrl-1 = <&blsp1_uart2_sleep>;
>   };
>   };
> +
> + reserved-memory {
> + #address-cells = <2>;
> + #size-cells = <2>;
> + ranges;
> +
> + ramoops@1fe0 {
> + compatible = "ramoops";
> + reg = <0 0x1fe0 0 0x0020>;

Downstream doesn't use 0x0 to denote 0.  I am fine with this; if
someone has a concern or issue with it, we can change it before merging.


> + console-size = <0x10>;
> + record-size = <0x1>;
> + ftrace-size = <0x1>;
> + pmsg-size = <0x8>;

We could pad this with leading 0's but I find this much easier
to read.

> + };
> + };
>  };

Thank you very much for taking the time to send this.  


Acked-by: Jeremy McNicoll 




Re: KASLR may break some kernel features (was Re: [PATCH v5 1/4] kaslr: add immovable_mem=nn[KMG]@ss[KMG] to specify extracting memory)

2018-01-11 Thread Baoquan He
On 01/11/18 at 10:04am, Kees Cook wrote:
> On Thu, Jan 11, 2018 at 1:00 AM, Baoquan He  wrote:
> > Hi Luiz,
> >
> > On 01/04/18 at 11:21am, Luiz Capitulino wrote:
> >> Having a generic kaslr parameter to control where the kernel is extracted
> >> is one solution for this problem.
> >>
> >> The general problem statement is that KASLR may break some kernel features
> >> depending on where the kernel is extracted. Two examples are hot-plugged
> >> memory (this series) and 1GB HugeTLB pages.
> >>
> >> The 1GB HugeTLB page issue is not specific to KVM guests. It just happens
> >> that there's a bunch of people running guests with up to 5GB of memory and
> >> with that amount of memory you have one or two 1GB pages and it is easier for
> >> KASLR to extract the kernel into a 1GB region and split a 1GB page. So,
> >> you may not get any 1GB pages at all when this happens. However, I can also
> >> reproduce this on bare-metal with lots of memory where I can lose a 1GB
> >> page from time to time.
> >>
> >> Having a kaslr_range= parameter solves both issues, but two major drawbacks
> >> are that it breaks existing setups and I guess users will have a very hard
> >> time choosing good ranges.
> >>
> >> Another idea would be to have a CONFIG_KASLR_RANGES, where each arch
> >> could have a list of ranges known to contain holes and/or immovable
> >> memory and only extract the kernel into those ranges.
> >
> > If we add CONFIG_KASLR_RANGES, then a distro like RHEL will always have
> > this range, whether people need hugetlb or not.
> >
> > So in this case, what range do we need to avoid? Only [1G, 2G]?
> 
> Any ranges like that that need to be avoided should be known at build
> time, so they should simply be added to the mem_avoid list that is
> already present in the KASLR code...

It seems KASLR doesn't have a solution which allows the user to specify an
avoided range for the kernel text KASLR stage only. The memmap="!#$" options
can add a range to mem_avoid, but they also keep that range out of e820.

In this hugetlb case, for example, Luiz wants the kernel to avoid the
[2G, 3G) candidate position, keeping it for hugetlb allocation, while still
having it added to the mm subsystem later.
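
A rough sketch of the distinction, in userspace C: an avoid list only
filters candidate load positions, it does not change what memory is later
handed to mm. The struct below is a stand-in for the KASLR region type, and
slot_is_avoided() is a hypothetical helper written for this mail, not the
existing kaslr.c code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the region type used by the KASLR code. */
struct mem_vector {
	unsigned long long start;
	unsigned long long size;
};

/* True if the half-open ranges [start, start + size) of a and b intersect. */
static bool mem_overlaps(const struct mem_vector *a, const struct mem_vector *b)
{
	if (a->start + a->size <= b->start)	/* a ends before b begins */
		return false;
	if (a->start >= b->start + b->size)	/* a begins after b ends */
		return false;
	return true;
}

/* True if placing the kernel image at 'candidate' would touch an avoided region. */
static bool slot_is_avoided(const struct mem_vector *candidate,
			    const struct mem_vector *avoid, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (mem_overlaps(candidate, &avoid[i]))
			return true;
	return false;
}
```

With an avoid entry of start = 2G and size = 1G, a candidate slot at
2G + 16M is rejected, while slots that end exactly at 2G or begin at 3G are
still allowed. The [2G, 3G) range itself stays in the memory map.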

Thanks
Baoquan



[PATCH v2 0/2] Add reboot modes for LEGO MINDSTORMS EV3

2018-01-11 Thread David Lechner
This series adds a new device tree node to declare a special memory
address that is used by the I2C bootloader on LEGO MINDSTORMS EV3
to boot into a special firmware update mode and enables the required
module to use it.

v2 changes:
* rebase on linux-davinci/master

David Lechner (2):
  ARM: dts: da850-lego-ev3: Add node for reboot modes
  ARM: davinci_all_defconfig: enable SYSCON_REBOOT_MODE

 arch/arm/boot/dts/da850-lego-ev3.dts   | 17 +
 arch/arm/configs/davinci_all_defconfig |  1 +
 2 files changed, 18 insertions(+)

-- 
2.7.4



[PATCH v2 2/2] ARM: davinci_all_defconfig: enable SYSCON_REBOOT_MODE

2018-01-11 Thread David Lechner
This enables SYSCON_REBOOT_MODE as a module. This is used by LEGO
MINDSTORMS EV3 to reboot into a special firmware update mode.

Signed-off-by: David Lechner 
---
 arch/arm/configs/davinci_all_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/davinci_all_defconfig b/arch/arm/configs/davinci_all_defconfig
index 026154c..bcb70c2 100644
--- a/arch/arm/configs/davinci_all_defconfig
+++ b/arch/arm/configs/davinci_all_defconfig
@@ -126,6 +126,7 @@ CONFIG_GPIO_PCA953X=y
 CONFIG_GPIO_PCA953X_IRQ=y
 CONFIG_POWER_RESET=y
 CONFIG_POWER_RESET_GPIO=y
+CONFIG_SYSCON_REBOOT_MODE=m
 CONFIG_BATTERY_LEGO_EV3=m
 CONFIG_WATCHDOG=y
 CONFIG_DAVINCI_WATCHDOG=m
-- 
2.7.4


