Re: [PATCH 2/2] printk: Add boottime and real timestamps

2017-07-25 Thread Mark Salyzyn

On 07/25/2017 06:00 AM, Peter Zijlstra wrote:

On Tue, Jul 25, 2017 at 08:17:27AM -0400, Prarit Bhargava wrote:

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 5b1662ec546f..6cd38a25f8ea 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1,8 +1,8 @@
  menu "printk and dmesg options"
  
  config PRINTK_TIME

-   int "Show timing information on printks (0-1)"
-   range 0 1
+   int "Show timing information on printks (0-3)"
+   range 0 3
default "0"
depends on PRINTK
help
@@ -13,7 +13,8 @@ config PRINTK_TIME
  The timestamp is always recorded internally, and exported
  to /dev/kmsg. This flag just specifies if the timestamp should
  be included, not that the timestamp is recorded. 0 disables the
- timestamp and 1 uses the local clock.
+ timestamp and 1 uses the local clock, 2 uses the monotonic clock, and
+ 3 uses real clock.
  
  	  The behavior is also controlled by the kernel command line

  parameter printk.time=1. See 
Documentation/admin-guide/kernel-parameters.rst


choice
prompt "printk default clock"
default PRINTK_TIME_DISABLE
help
 goes here

config PRINTK_TIME_DISABLE
bool "Disabled"
help
 goes here

config PRINTK_TIME_LOCAL
bool "local clock"
help
 goes here

config PRINTK_TIME_MONO
bool "CLOCK_MONOTONIC"
help
 goes here

config PRINTK_TIME_REAL
bool "CLOCK_REALTIME"
help
 goes here

endchoice

config PRINTK_TIME
int
default 0 if PRINTK_TIME_DISABLE
default 1 if PRINTK_TIME_LOCAL
default 2 if PRINTK_TIME_MONO
default 3 if PRINTK_TIME_REAL


Although I must strongly discourage using REALTIME, DST will make
untangling your logs an absolute nightmare. I would simply not provide
it.


I agree with using select; it ensures only valid values are landed. It does
mean that CONFIG_PRINTK_TIME in effect gets deprecated.


REALTIME is always UTC in the kernel.

What about BOOTTIME?

-- Mark

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
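
For readers comparing the clocks discussed in this message, here is a minimal user-space sketch (not kernel code) that samples CLOCK_MONOTONIC, CLOCK_BOOTTIME and CLOCK_REALTIME through clock_gettime(). The helper name read_clock_ns() is made up for illustration. CLOCK_REALTIME counts from the epoch in UTC, so DST does not shift it, although it can still jump when the wall clock is set. CLOCK_BOOTTIME behaves like CLOCK_MONOTONIC but also counts time spent suspended.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* exposes CLOCK_BOOTTIME on some libcs */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a clock in nanoseconds, or return -1 on error. */
int64_t read_clock_ns(clockid_t id)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;
	/* A valid timespec has a normalized nanosecond field. */
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

Calling read_clock_ns(CLOCK_MONOTONIC) twice in a row should never go backwards, which is the property that makes a monotonic clock attractive for log timestamps.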


Re: [PATCH net-next v2 01/10] net: dsa: lan9303: Fixed MDIO interface

2017-07-25 Thread Vivien Didelot
Hi Egil,

Egil Hjelmeland  writes:

> Fixes after testing on actual HW:
>
> - lan9303_mdio_write()/_read() must multiply register number
>   by 4 to get offset
>
> - Indirect access (PMI) to phy register only work in I2C mode. In
>   MDIO mode phy registers must be accessed directly. Introduced
>   struct lan9303_phy_ops to handle the two modes. Renamed functions
>   to clarify.
>
> - lan9303_detect_phy_setup() : Failed MDIO read return 0x.
>   Handle that.

Small patch series when possible are better. Bullet points in commit
messages are likely to describe how a patch or series may be split up
;-)

This patch seems to be the unique patch of the series resolving what is
described in the cover letter as "Make the MDIO interface work".

I'd suggest you split up this one commit into several *atomic* and easy
to review patches and send them separately as one thread named "net: dsa:
lan9303: fix MDIO interface" (also note that imperative is preferred for
subject lines, see: https://chris.beams.io/posts/git-commit/#imperative)

<...>

> -static int lan9303_port_phy_reg_wait_for_completion(struct lan9303 *chip)
> +static int lan9303_indirect_phy_wait_for_completion(struct lan9303 *chip)

For instance you can have a first commit only renaming the functions.
The reason for it is to separate the functional changes from cosmetic
changes, which makes it easier for review.

<...>

> - if (reg != 0)
> + if ((reg != 0) && (reg != 0x))

if (reg && reg != 0x) should be enough.

>   chip->phy_addr_sel_strap = 1;
>   else
>   chip->phy_addr_sel_strap = 0;

<...>

> +struct lan9303;
> +
> +struct lan9303_phy_ops {
> + /* PHY 1 &2 access*/

The spacing is weird in the comment. "/* PHY 1 & 2 access */" maybe?

<...>

> +int lan9303_mdio_phy_write(struct lan9303 *chip, int phy, int regnum, u16 
> val)
> +{
> + struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev);
> + struct mdio_device *mdio = sw_dev->device;
> +
> + mutex_lock(&mdio->bus->mdio_lock);
> + mdio->bus->write(mdio->bus, phy, regnum, val);
> + mutex_unlock(&mdio->bus->mdio_lock);

This is exactly what mdiobus_write(mdio->bus, phy, regnum, val) is
doing. There are very few valid reasons to go play in the mii_bus
structure; using generic APIs is strongly preferred. Plus you get
checks and traces for free!

> +
> + return 0;
> +}
> +
> +int lan9303_mdio_phy_read(struct lan9303 *chip, int phy,  int reg)
> +{
> + struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev);
> + struct mdio_device *mdio = sw_dev->device;
> + int val;
> +
> + mutex_lock(&mdio->bus->mdio_lock);
> + val  =  mdio->bus->read(mdio->bus, phy, reg);
> + mutex_unlock(&mdio->bus->mdio_lock);

Same here, mdiobus_read().
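
To illustrate the point about generic accessors, below is a user-space mock (not the kernel's implementation) of the pattern: a bus object owns its lock, and wrapper functions take that lock around the raw read/write callbacks so callers never touch either directly. Every name prefixed mock_ is invented for this sketch. In the driver itself one would simply call mdiobus_write()/mdiobus_read().

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a bus structure: raw callbacks plus a bus lock. */
struct mock_bus {
	pthread_mutex_t lock;
	uint16_t regs[32][32];		/* [phy address][register number] */
	int (*read)(struct mock_bus *bus, int addr, int regnum);
	int (*write)(struct mock_bus *bus, int addr, int regnum, uint16_t val);
};

static int mock_raw_read(struct mock_bus *bus, int addr, int regnum)
{
	return bus->regs[addr][regnum];
}

static int mock_raw_write(struct mock_bus *bus, int addr, int regnum,
			  uint16_t val)
{
	bus->regs[addr][regnum] = val;
	return 0;
}

void mock_bus_init(struct mock_bus *bus)
{
	memset(bus->regs, 0, sizeof(bus->regs));
	pthread_mutex_init(&bus->lock, NULL);
	bus->read = mock_raw_read;
	bus->write = mock_raw_write;
}

/* Generic accessors: validate arguments, lock, call the raw op, unlock. */
int mock_bus_read(struct mock_bus *bus, int addr, int regnum)
{
	int ret;

	assert(addr >= 0 && addr < 32 && regnum >= 0 && regnum < 32);
	pthread_mutex_lock(&bus->lock);
	ret = bus->read(bus, addr, regnum);
	pthread_mutex_unlock(&bus->lock);
	return ret;
}

int mock_bus_write(struct mock_bus *bus, int addr, int regnum, uint16_t val)
{
	int ret;

	assert(addr >= 0 && addr < 32 && regnum >= 0 && regnum < 32);
	pthread_mutex_lock(&bus->lock);
	ret = bus->write(bus, addr, regnum, val);
	pthread_mutex_unlock(&bus->lock);
	return ret;	/* propagate the bus error instead of dropping it */
}
```

Note that the wrapper returns the result of the raw write, whereas the quoted lan9303_mdio_phy_write() discards it and always returns 0.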


Thanks,

Vivien


[PATCH net-next v2 06/10] net: dsa: lan9303: added sysfs node swe_bcst_throt

2017-07-25 Thread Egil Hjelmeland
Allowing per-port access to Switch Engine Broadcast Throttling Register

Also added lan9303_write_switch_reg_mask()

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303-core.c | 83 ++
 1 file changed, 83 insertions(+)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index be6d78f45a5f..b70acb73aad6 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -154,6 +154,7 @@
 # define LAN9303_SWE_PORT_MIRROR_ENABLE_RX_MIRRORING BIT(1)
 # define LAN9303_SWE_PORT_MIRROR_ENABLE_TX_MIRRORING BIT(0)
 #define LAN9303_SWE_INGRESS_PORT_TYPE 0x1847
+#define LAN9303_SWE_BCST_THROT 0x1848
 #define LAN9303_BM_CFG 0x1c00
 #define LAN9303_BM_EGRSS_PORT_TYPE 0x1c0c
 # define LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT2 (BIT(17) | BIT(16))
@@ -426,6 +427,20 @@ static int lan9303_read_switch_reg(struct lan9303 *chip, 
u16 regnum, u32 *val)
return ret;
 }
 
+static int lan9303_write_switch_reg_mask(
+   struct lan9303 *chip, u16 regnum, u32 val, u32 mask)
+{
+   int ret;
+   u32 reg;
+
+   ret = lan9303_read_switch_reg(chip, regnum, &reg);
+   if (ret)
+   return ret;
+   reg = (reg & ~mask) | val;
+
+   return lan9303_write_switch_reg(chip, regnum, reg);
+}
+
 static int lan9303_detect_phy_setup(struct lan9303 *chip)
 {
int reg;
@@ -614,6 +629,66 @@ static int lan9303_check_device(struct lan9303 *chip)
return 0;
 }
 
+/* -- Sysfs on slave port --*/
+/*13.4.3.23 Switch Engine Broadcast Throttling Register (SWE_BCST_THROT)*/
+static ssize_t
+swe_bcst_throt_show(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct dsa_port *dp = dsa_net_device_to_dsa_port(to_net_dev(dev));
+   struct lan9303 *chip = dp->ds->priv;
+   int port = dp->index;
+   int reg;
+
+   if (lan9303_read_switch_reg(chip, LAN9303_SWE_BCST_THROT, &reg))
+   return 0;
+
+   reg = (reg >> (9 * port)) & 0x1ff; /*extract port N*/
+   if (reg & 0x100)
+   reg &= 0xff; /* remove enable bit */
+   else
+   reg = 0; /* not enabled*/
+
+   return scnprintf(buf, PAGE_SIZE, "%d\n", reg);
+}
+
+static ssize_t
+swe_bcst_throt_store(struct device *dev, struct device_attribute *attr,
+const char *buf, size_t len)
+{
+   struct dsa_port *dp = dsa_net_device_to_dsa_port(to_net_dev(dev));
+   struct lan9303 *chip = dp->ds->priv;
+   int port = dp->index;
+   int ret;
+   unsigned long level;
+
+   ret = kstrtoul(buf, 0, &level);
+   if (ret)
+   return ret;
+   level &= 0xff; /* ensure valid range */
+   if (level)
+   level |= 0x100; /* Set enable bit  */
+
+   ret = lan9303_write_switch_reg_mask(chip, LAN9303_SWE_BCST_THROT,
+   level << (9 * port),
+   0x1ff << (9 * port));
+   if (ret)
+   return ret;
+   return len;
+}
+
+static DEVICE_ATTR_RW(swe_bcst_throt);
+
+static struct attribute *lan9303_attrs[] = {
+   &dev_attr_swe_bcst_throt.attr,
+   NULL
+};
+
+static struct attribute_group lan9303_group = {
+   .name = "lan9303",
+   .attrs = lan9303_attrs,
+};
+
 /*  DSA ---*/
 
 static enum dsa_tag_protocol lan9303_get_tag_protocol(struct dsa_switch *ds)
@@ -787,6 +862,11 @@ static int lan9303_port_enable(struct dsa_switch *ds, int 
port,
switch (port) {
case 1:
case 2:
+   /* lan9303_setup is too early to attach sysfs nodes... */
+   if (sysfs_create_group(
+   &ds->ports[port].netdev->dev.kobj,
+   &lan9303_group))
+   dev_dbg(chip->dev, "cannot create sysfs group\n");
return lan9303_enable_packet_processing(chip, port);
default:
dev_dbg(chip->dev,
@@ -805,6 +885,9 @@ static void lan9303_port_disable(struct dsa_switch *ds, int 
port,
switch (port) {
case 1:
case 2:
+   sysfs_remove_group(&ds->ports[port].netdev->dev.kobj,
+  &lan9303_group);
+
lan9303_disable_packet_processing(chip, port);
lan9303_phy_write(ds, chip->phy_addr_sel_strap + port,
  MII_BMCR, BMCR_PDOWN);
-- 
2.11.0
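
The bit layout that the show/store handlers in this patch rely on can be sketched in plain user-space C: each port owns a 9-bit field (8-bit level plus an enable bit at 0x100) at bit offset 9 * port, and updates go through a read-modify-write with a mask. The function names below are invented for the sketch, and unlike lan9303_write_switch_reg_mask() the helper also masks the new value, which is an assumption about what is intended rather than what the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define THROT_FIELD_BITS	9	/* per-port field width used by the patch */
#define THROT_FIELD_MASK	0x1ffu
#define THROT_ENABLE		0x100u
#define THROT_LEVEL_MASK	0xffu

/* Read-modify-write in the style of lan9303_write_switch_reg_mask(). */
static uint32_t update_bits(uint32_t reg, uint32_t val, uint32_t mask)
{
	return (reg & ~mask) | (val & mask);
}

/* Return the register value with the level for one port replaced. */
uint32_t throt_set(uint32_t reg, int port, unsigned long level)
{
	uint32_t field;

	assert(port >= 0 && port <= 2);
	field = level & THROT_LEVEL_MASK;	/* same truncation as the patch */
	if (field)
		field |= THROT_ENABLE;		/* level 0 means "disabled" */
	return update_bits(reg, field << (THROT_FIELD_BITS * port),
			   THROT_FIELD_MASK << (THROT_FIELD_BITS * port));
}

/* Extract the level for one port; 0 when throttling is not enabled. */
unsigned int throt_get(uint32_t reg, int port)
{
	uint32_t field;

	assert(port >= 0 && port <= 2);
	field = (reg >> (THROT_FIELD_BITS * port)) & THROT_FIELD_MASK;
	return (field & THROT_ENABLE) ? (field & THROT_LEVEL_MASK) : 0;
}
```

One consequence of the "& 0xff" truncation is that an out-of-range input such as 256 silently becomes 0 and disables throttling; returning an error for such values might be friendlier.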



[PATCH net-next v2 02/10] net: dsa: lan9303: Do not disable/enable switch fabric port 0 at startup

2017-07-25 Thread Egil Hjelmeland
For some mysterious reason, enabling switch fabric port 0 TX fails to
work when the TX has previously been disabled. Resolved by not
disabling/enabling switch fabric port 0 at startup. Ports 1 and 2 are
still disabled in early init.

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303-core.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index e622db586c3d..c2b53659f58f 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -557,9 +557,6 @@ static int lan9303_disable_processing(struct lan9303 *chip)
 {
int ret;
 
-   ret = lan9303_disable_packet_processing(chip, LAN9303_PORT_0_OFFSET);
-   if (ret)
-   return ret;
ret = lan9303_disable_packet_processing(chip, LAN9303_PORT_1_OFFSET);
if (ret)
return ret;
@@ -633,10 +630,6 @@ static int lan9303_setup(struct dsa_switch *ds)
if (ret)
dev_err(chip->dev, "failed to separate ports %d\n", ret);
 
-   ret = lan9303_enable_packet_processing(chip, LAN9303_PORT_0_OFFSET);
-   if (ret)
-   dev_err(chip->dev, "failed to re-enable switching %d\n", ret);
-
return 0;
 }
 
-- 
2.11.0



[PATCH net-next v2 05/10] net: dsa: added dsa_net_device_to_dsa_port()

2017-07-25 Thread Egil Hjelmeland
Allowing dsa drivers to attach sysfs nodes.

Signed-off-by: Egil Hjelmeland 
---
 include/net/dsa.h |  1 +
 net/dsa/slave.c   | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 88da272d20d0..a71c0a2401ee 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -450,6 +450,7 @@ void unregister_switch_driver(struct dsa_switch_driver 
*type);
 struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev);
 
 struct net_device *dsa_dev_to_net_device(struct device *dev);
+struct dsa_port *dsa_net_device_to_dsa_port(struct net_device *dev);
 
 /* Keep inline for faster access in hot path */
 static inline bool netdev_uses_dsa(struct net_device *dev)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 9507bd38cf04..40410f1740de 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -209,6 +209,16 @@ static int dsa_slave_ioctl(struct net_device *dev, struct 
ifreq *ifr, int cmd)
return -EOPNOTSUPP;
 }
 
+struct dsa_port *dsa_net_device_to_dsa_port(struct net_device *dev)
+{
+   struct dsa_slave_priv *p = netdev_priv(dev);
+
+   if (!dsa_slave_dev_check(dev))
+   return NULL;
+   return p->dp;
+}
+EXPORT_SYMBOL_GPL(dsa_net_device_to_dsa_port);
+
 static int dsa_slave_port_attr_set(struct net_device *dev,
   const struct switchdev_attr *attr,
   struct switchdev_trans *trans)
-- 
2.11.0



[PATCH net-next v2 08/10] net: dsa: lan9303: Added ALR/fdb/mdb handling

2017-07-25 Thread Egil Hjelmeland
Added functions for accessing / managing the lan9303 ALR (Address Logic
Resolution).

Implemented DSA methods: set_addr, port_fast_age, port_fdb_prepare,
port_fdb_add, port_fdb_del, port_fdb_dump, port_mdb_prepare,
port_mdb_add and port_mdb_del.

Since the lan9303 does not offer reading a specific ALR entry, the driver
caches all static entries in a flat table.

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303-core.c | 369 +
 drivers/net/dsa/lan9303.h  |  11 ++
 2 files changed, 380 insertions(+)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 426a75bd89f4..dc95973d62ed 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "lan9303.h"
 
@@ -121,6 +122,21 @@
 #define LAN9303_MAC_RX_CFG_2 0x0c01
 #define LAN9303_MAC_TX_CFG_2 0x0c40
 #define LAN9303_SWE_ALR_CMD 0x1800
+# define ALR_CMD_MAKE_ENTRYBIT(2)
+# define ALR_CMD_GET_FIRST BIT(1)
+# define ALR_CMD_GET_NEXT  BIT(0)
+#define LAN9303_SWE_ALR_WR_DAT_0 0x1801
+#define LAN9303_SWE_ALR_WR_DAT_1 0x1802
+# define ALR_DAT1_VALIDBIT(26)
+# define ALR_DAT1_END_OF_TABL  BIT(25)
+# define ALR_DAT1_AGE_OVERRID  BIT(25)
+# define ALR_DAT1_STATIC   BIT(24)
+# define ALR_DAT1_PORT_BITOFFS  16
+# define ALR_DAT1_PORT_MASK(7 << ALR_DAT1_PORT_BITOFFS)
+#define LAN9303_SWE_ALR_RD_DAT_0 0x1805
+#define LAN9303_SWE_ALR_RD_DAT_1 0x1806
+#define LAN9303_SWE_ALR_CMD_STS 0x1808
+# define ALR_STS_MAKE_PEND BIT(0)
 #define LAN9303_SWE_VLAN_CMD 0x180b
 # define LAN9303_SWE_VLAN_CMD_RNW BIT(5)
 # define LAN9303_SWE_VLAN_CMD_PVIDNVLAN BIT(4)
@@ -473,6 +489,229 @@ static int lan9303_detect_phy_setup(struct lan9303 *chip)
return 0;
 }
 
+/* - Address Logic Resolution (ALR)--*/
+
+/* Map ALR-port bits to port bitmap, and back*/
+static const int alrport_2_portmap[] = {1, 2, 4, 0, 3, 5, 6, 7 };
+static const int portmap_2_alrport[] = {3, 0, 1, 4, 2, 5, 6, 7 };
+
+/* ALR: Cache static entries: mac address + port bitmap */
+
+/* Return pointer to first free ALR cache entry, return NULL if none */
+static struct lan9303_alr_cache_entry *lan9303_alr_cache_find_free(
+   struct lan9303 *chip)
+{
+   int i;
+   struct lan9303_alr_cache_entry *entr = chip->alr_cache;
+
+   for (i = 0; i < LAN9303_NUM_ALR_RECORDS; i++, entr++)
+   if (entr->port_map == 0)
+   return entr;
+   return NULL;
+}
+
+/* Return pointer to ALR cache entry matching MAC address */
+static struct lan9303_alr_cache_entry *lan9303_alr_cache_find_mac(
+   struct lan9303 *chip,
+   const u8 *mac_addr)
+{
+   int i;
+   struct lan9303_alr_cache_entry *entr = chip->alr_cache;
+
+   BUILD_BUG_ON_MSG(sizeof(struct lan9303_alr_cache_entry) & 1,
+"ether_addr_equal require u16 alignment");
+
+   for (i = 0; i < LAN9303_NUM_ALR_RECORDS; i++, entr++)
+   if (ether_addr_equal(entr->mac_addr, mac_addr))
+   return entr;
+   return NULL;
+}
+
+/* ALR: Actual register access functions */
+
+/* This function will wait a while until mask & reg == value */
+/* Otherwise, return timeout */
+static int lan9303_csr_reg_wait(struct lan9303 *chip, int regno,
+   int mask, char value)
+{
+   int i;
+
+   for (i = 0; i < 0x1000; i++) {
+   u32 reg;
+
+   lan9303_read_switch_reg(chip, regno, &reg);
+   if ((reg & mask) == value)
+   return 0;
+   }
+   return -ETIMEDOUT;
+}
+
+static int _lan9303_alr_make_entry_raw(struct lan9303 *chip, u32 dat0, u32 
dat1)
+{
+   lan9303_write_switch_reg(
+   chip, LAN9303_SWE_ALR_WR_DAT_0, dat0);
+   lan9303_write_switch_reg(
+   chip, LAN9303_SWE_ALR_WR_DAT_1, dat1);
+   lan9303_write_switch_reg(
+   chip, LAN9303_SWE_ALR_CMD, ALR_CMD_MAKE_ENTRY);
+   lan9303_csr_reg_wait(
+   chip, LAN9303_SWE_ALR_CMD_STS, ALR_STS_MAKE_PEND, 0);
+   lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, 0);
+   return 0;
+}
+
+typedef void alr_loop_cb_t(
+   struct lan9303 *chip, u32 dat0, u32 dat1, int portmap, void *ctx);
+
+static void lan9303_alr_loop(struct lan9303 *chip, alr_loop_cb_t *cb, void 
*ctx)
+{
+   int i;
+
+   lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, ALR_CMD_GET_FIRST);
+   lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, 0);
+
+   for (i = 1; i < LAN9303_NUM_ALR_RECORDS; i++) {
+   u32 dat0, dat1;
+   int alrport, portmap;
+
+   lan9303_read_switch_reg(chip, LAN9303_SWE_ALR_RD_DAT_0, &dat0);
+   lan9303_read_switch_reg(chip, LAN9303_SWE_ALR_RD_DAT_1, &dat1);
+   if (dat1 & ALR_DAT1_END_OF_TABL)
+   break;
+
+   alrport 
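
The flat-table cache described in this commit message can be sketched in user space as follows. The two mapping tables are copied from the patch; everything else (the table size, the cache_* names, the cache_add_port() helper) is hypothetical, and the lookup additionally skips free slots, which the quoted lan9303_alr_cache_find_mac() does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_CACHE_ENTRIES 8	/* hypothetical size, not the driver's value */

/* Tables copied from the patch: ALR port field <-> port bitmap. */
static const int alrport_2_portmap[] = {1, 2, 4, 0, 3, 5, 6, 7};
static const int portmap_2_alrport[] = {3, 0, 1, 4, 2, 5, 6, 7};

struct alr_cache_entry {
	uint8_t mac_addr[6];
	uint8_t port_map;	/* 0 means the slot is free */
};

/* Return the first free slot, or NULL when the table is full. */
struct alr_cache_entry *cache_find_free(struct alr_cache_entry *tbl)
{
	for (size_t i = 0; i < NUM_CACHE_ENTRIES; i++)
		if (tbl[i].port_map == 0)
			return &tbl[i];
	return NULL;
}

/* Return the used slot holding this MAC address, or NULL. */
struct alr_cache_entry *cache_find_mac(struct alr_cache_entry *tbl,
				       const uint8_t *mac)
{
	for (size_t i = 0; i < NUM_CACHE_ENTRIES; i++)
		if (tbl[i].port_map != 0 &&
		    memcmp(tbl[i].mac_addr, mac, 6) == 0)
			return &tbl[i];
	return NULL;
}

/* Add a port to the entry for mac, allocating a slot if needed. */
int cache_add_port(struct alr_cache_entry *tbl, const uint8_t *mac, int port)
{
	struct alr_cache_entry *e;

	assert(port >= 0 && port <= 2);
	e = cache_find_mac(tbl, mac);
	if (!e) {
		e = cache_find_free(tbl);
		if (!e)
			return -1;	/* table full */
		memcpy(e->mac_addr, mac, 6);
	}
	e->port_map |= 1u << port;
	return 0;
}

/* Check that the two mapping tables are inverses of each other. */
int alr_maps_are_inverse(void)
{
	for (int i = 0; i < 8; i++)
		if (portmap_2_alrport[alrport_2_portmap[i]] != i)
			return 0;
	return 1;
}
```

A linear scan keeps the code simple; whether it is fast enough depends on the real table size, which this excerpt does not show.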

[PATCH net-next v2 10/10] net: dsa: lan9303: Only allocate 3 ports

2017-07-25 Thread Egil Hjelmeland
Saving 2628 bytes.

Signed-off-by: Egil Hjelmeland 
Reviewed-by: Florian Fainelli 
---
 drivers/net/dsa/lan9303-core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index dc95973d62ed..ad7a4c72e1fb 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -23,6 +23,8 @@
 
 #include "lan9303.h"
 
+#define LAN9303_NUM_PORTS 3
+
 /* 13.2 System Control and Status Registers
  * Multiply register number by 4 to get address offset.
  */
@@ -1361,7 +1363,7 @@ static struct dsa_switch_ops lan9303_switch_ops = {
 
 static int lan9303_register_switch(struct lan9303 *chip)
 {
-   chip->ds = dsa_switch_alloc(chip->dev, DSA_MAX_PORTS);
+   chip->ds = dsa_switch_alloc(chip->dev, LAN9303_NUM_PORTS);
if (!chip->ds)
return -ENOMEM;
 
-- 
2.11.0



[PATCH net-next v2 09/10] net: dsa: lan9303: Added Documentation/networking/dsa/lan9303.txt

2017-07-25 Thread Egil Hjelmeland
Signed-off-by: Egil Hjelmeland 
---
 Documentation/networking/dsa/lan9303.txt | 63 
 1 file changed, 63 insertions(+)
 create mode 100644 Documentation/networking/dsa/lan9303.txt

diff --git a/Documentation/networking/dsa/lan9303.txt 
b/Documentation/networking/dsa/lan9303.txt
new file mode 100644
index ..ef5b3ca12a29
--- /dev/null
+++ b/Documentation/networking/dsa/lan9303.txt
@@ -0,0 +1,63 @@
+LAN9303 Ethernet switch driver
+==============================
+
+The LAN9303 is a three port 10/100 ethernet switch with integrated phys
+for the two external ethernet ports. The third port is an RMII/MII
+interface to a host master network interface (e.g. fixed link).
+
+
+Driver details
+==============
+
+The driver is implemented as a DSA driver, see
+Documentation/networking/dsa/dsa.txt.
+
+See Documentation/devicetree/bindings/net/dsa/lan9303.txt for device
+tree binding.
+
+The LAN9303 can be managed both via MDIO and I2C, both supported by this
+driver.
+
+At startup the driver configures the device to provide two separate
+network interfaces (which is the default state of a DSA device).
+
+When both user ports are joined to the same bridge, the normal
+HW MAC learning is enabled. This means that unicast traffic is forwarded
+in HW. STP is also supported in this mode.
+
+If one of the user ports leaves the bridge,
+the ports go back to the initial separated operation.
+
+The driver implements the port_fdb_xxx/port_mdb_xxx methods.
+
+
+Sysfs nodes
+===========
+
+When a user port is enabled, the driver creates sysfs directory
+/sys/class/net/xxx/lan9303 with the following files:
+
+ - swe_bcst_throt (RW): Set/get 6.4.7 Broadcast Storm Control
+  Throttle Level for the port. Accesses the corresponding bits of
+  the SWE_BCST_THROT register (13.4.3.23).
+
+
+Driver limitations
+==================
+
+ - No support for VLAN
+
+
+Bridging notes
+==============
+When the user ports are bridged, broadcasts, multicasts and unicast
+frames with unknown destination are flooded by the chip. Therefore SW
+flooding must be disabled by:
+
+   echo 0 > /sys/class/net/p1/brport/broadcast_flood
+   echo 0 > /sys/class/net/p1/brport/multicast_flood
+   echo 0 > /sys/class/net/p1/brport/unicast_flood
+   echo 0 > /sys/class/net/p2/brport/broadcast_flood
+   echo 0 > /sys/class/net/p2/brport/multicast_flood
+   echo 0 > /sys/class/net/p2/brport/unicast_flood
+
-- 
2.11.0



[PATCH net-next v2 07/10] net: dsa: lan9303: Added basic offloading of unicast traffic

2017-07-25 Thread Egil Hjelmeland
When both user ports are joined to the same bridge, the normal
HW MAC learning is enabled. This means that unicast traffic is forwarded
in HW. Support for STP is also added.

If one of the user ports leaves the bridge,
the ports go back to the initial separated operation.

Added bridge methods port_bridge_join, port_bridge_leave and
port_stp_state_set.

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303-core.c | 115 ++---
 drivers/net/dsa/lan9303.h  |   1 +
 2 files changed, 98 insertions(+), 18 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index b70acb73aad6..426a75bd89f4 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "lan9303.h"
 
@@ -143,6 +144,7 @@
 # define LAN9303_SWE_PORT_STATE_FORWARDING_PORT0 (0)
 # define LAN9303_SWE_PORT_STATE_LEARNING_PORT0 BIT(1)
 # define LAN9303_SWE_PORT_STATE_BLOCKING_PORT0 BIT(0)
+# define LAN9303_SWE_PORT_STATE_DISABLED_PORT0 (3)
 #define LAN9303_SWE_PORT_MIRROR 0x1846
 # define LAN9303_SWE_PORT_MIRROR_SNIFF_ALL BIT(8)
 # define LAN9303_SWE_PORT_MIRROR_SNIFFER_PORT2 BIT(7)
@@ -515,11 +517,30 @@ static int lan9303_enable_packet_processing(struct 
lan9303 *chip,
LAN9303_MAC_TX_CFG_X_TX_ENABLE);
 }
 
+/* forward special tagged packets from port 0 to port 1 *or* port 2 */
+static int lan9303_setup_tagging(struct lan9303 *chip)
+{
+   int ret;
+   /* enable defining the destination port via special VLAN tagging
+* for port 0
+*/
+   ret = lan9303_write_switch_reg(chip, LAN9303_SWE_INGRESS_PORT_TYPE,
+  0x03);
+   if (ret)
+   return ret;
+
+   /* tag incoming packets at port 1 and 2 on their way to port 0 to be
+* able to discover their source port
+*/
+   return lan9303_write_switch_reg(
+   chip, LAN9303_BM_EGRSS_PORT_TYPE,
+   LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT0);
+}
+
 /* We want a special working switch:
  * - do not forward packets between port 1 and 2
  * - forward everything from port 1 to port 0
  * - forward everything from port 2 to port 0
- * - forward special tagged packets from port 0 to port 1 *or* port 2
  */
 static int lan9303_separate_ports(struct lan9303 *chip)
 {
@@ -534,22 +555,6 @@ static int lan9303_separate_ports(struct lan9303 *chip)
if (ret)
return ret;
 
-   /* enable defining the destination port via special VLAN tagging
-* for port 0
-*/
-   ret = lan9303_write_switch_reg(chip, LAN9303_SWE_INGRESS_PORT_TYPE,
-  0x03);
-   if (ret)
-   return ret;
-
-   /* tag incoming packets at port 1 and 2 on their way to port 0 to be
-* able to discover their source port
-*/
-   ret = lan9303_write_switch_reg(chip, LAN9303_BM_EGRSS_PORT_TYPE,
-   LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT0);
-   if (ret)
-   return ret;
-
/* prevent port 1 and 2 from forwarding packets by their own */
return lan9303_write_switch_reg(chip, LAN9303_SWE_PORT_STATE,
LAN9303_SWE_PORT_STATE_FORWARDING_PORT0 |
@@ -557,6 +562,12 @@ static int lan9303_separate_ports(struct lan9303 *chip)
LAN9303_SWE_PORT_STATE_BLOCKING_PORT2);
 }
 
+static void lan9303_bridge_ports(struct lan9303 *chip)
+{
+   /* ports bridged: remove mirroring */
+   lan9303_write_switch_reg(chip, LAN9303_SWE_PORT_MIRROR, 0);
+}
+
 static int lan9303_handle_reset(struct lan9303 *chip)
 {
if (!chip->reset_gpio)
@@ -707,6 +718,10 @@ static int lan9303_setup(struct dsa_switch *ds)
return -EINVAL;
}
 
+   ret = lan9303_setup_tagging(chip);
+   if (ret)
+   dev_err(chip->dev, "failed to setup port tagging %d\n", ret);
+
ret = lan9303_separate_ports(chip);
if (ret)
dev_err(chip->dev, "failed to separate ports %d\n", ret);
@@ -898,17 +913,81 @@ static void lan9303_port_disable(struct dsa_switch *ds, 
int port,
}
 }
 
+static int lan9303_port_bridge_join(struct dsa_switch *ds, int port,
+   struct net_device *br)
+{
+   struct lan9303 *chip = ds->priv;
+
+   dev_dbg(chip->dev, "%s(port %d)\n", __func__, port);
+   if (ds->ports[1].bridge_dev ==  ds->ports[2].bridge_dev) {
+   lan9303_bridge_ports(chip);
+   chip->is_bridged = true;  /* unleash stp_state_set() */
+   }
+
+   return 0;
+}
+
+static void lan9303_port_bridge_leave(struct dsa_switch *ds, int port,
+ struct net_device *br)
+{
+   struct lan9303 *chip = ds->priv;
+
+   dev_dbg(chip->dev, "%s(port %d)\n", __func__, port);
+   
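
The join/leave behaviour described in the commit message can be modelled with a small user-space sketch. It assumes, as the quoted lan9303_port_bridge_join() implies, that the bridge device of each port is already recorded before the join callback runs; here the mock records it itself. The leave side follows the commit message only, since the patch text is cut off in this archive. All mock_ names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the switch state relevant to bridging. */
struct mock_switch {
	const void *bridge_dev[3];	/* index 1 and 2 are the user ports */
	bool is_bridged;
};

/* A port joins bridge br; HW bridging starts once both ports share it. */
void mock_bridge_join(struct mock_switch *sw, int port, const void *br)
{
	assert(port == 1 || port == 2);
	assert(br != NULL);
	sw->bridge_dev[port] = br;	/* assumed to be done by the core */
	if (sw->bridge_dev[1] == sw->bridge_dev[2])
		sw->is_bridged = true;
}

/* A port leaves its bridge; both ports fall back to separated mode. */
void mock_bridge_leave(struct mock_switch *sw, int port)
{
	assert(port == 1 || port == 2);
	sw->bridge_dev[port] = NULL;
	sw->is_bridged = false;
}
```

Two ports in two different bridges are not hardware-bridged in this model, which matches the pointer comparison in the quoted join function.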

[PATCH net-next v2 01/10] net: dsa: lan9303: Fixed MDIO interface

2017-07-25 Thread Egil Hjelmeland
Fixes after testing on actual HW:

- lan9303_mdio_write()/_read() must multiply register number
  by 4 to get offset

- Indirect access (PMI) to phy register only work in I2C mode. In
  MDIO mode phy registers must be accessed directly. Introduced
  struct lan9303_phy_ops to handle the two modes. Renamed functions
  to clarify.

- lan9303_detect_phy_setup() : Failed MDIO read return 0x.
  Handle that.

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303-core.c | 42 +++---
 drivers/net/dsa/lan9303.h  | 11 +++
 drivers/net/dsa/lan9303_i2c.c  |  2 ++
 drivers/net/dsa/lan9303_mdio.c | 34 ++
 4 files changed, 74 insertions(+), 15 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index cd76e61f1fca..e622db586c3d 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -20,6 +20,9 @@
 
 #include "lan9303.h"
 
+/* 13.2 System Control and Status Registers
+ * Multiply register number by 4 to get address offset.
+ */
 #define LAN9303_CHIP_REV 0x14
 # define LAN9303_CHIP_ID 0x9303
 #define LAN9303_IRQ_CFG 0x15
@@ -53,6 +56,9 @@
 #define LAN9303_VIRT_PHY_BASE 0x70
 #define LAN9303_VIRT_SPECIAL_CTRL 0x77
 
+/*13.4 Switch Fabric Control and Status Registers
+ * Accessed indirectly via SWITCH_CSR_CMD, SWITCH_CSR_DATA.
+ */
 #define LAN9303_SW_DEV_ID 0x
 #define LAN9303_SW_RESET 0x0001
 #define LAN9303_SW_RESET_RESET BIT(0)
@@ -242,7 +248,7 @@ static int lan9303_virt_phy_reg_write(struct lan9303 *chip, 
int regnum, u16 val)
return regmap_write(chip->regmap, LAN9303_VIRT_PHY_BASE + regnum, val);
 }
 
-static int lan9303_port_phy_reg_wait_for_completion(struct lan9303 *chip)
+static int lan9303_indirect_phy_wait_for_completion(struct lan9303 *chip)
 {
int ret, i;
u32 reg;
@@ -262,7 +268,7 @@ static int lan9303_port_phy_reg_wait_for_completion(struct 
lan9303 *chip)
return -EIO;
 }
 
-static int lan9303_port_phy_reg_read(struct lan9303 *chip, int addr, int 
regnum)
+static int lan9303_indirect_phy_read(struct lan9303 *chip, int addr, int 
regnum)
 {
int ret;
u32 val;
@@ -272,7 +278,7 @@ static int lan9303_port_phy_reg_read(struct lan9303 *chip, 
int addr, int regnum)
 
mutex_lock(&chip->indirect_mutex);
 
-   ret = lan9303_port_phy_reg_wait_for_completion(chip);
+   ret = lan9303_indirect_phy_wait_for_completion(chip);
if (ret)
goto on_error;
 
@@ -281,7 +287,7 @@ static int lan9303_port_phy_reg_read(struct lan9303 *chip, 
int addr, int regnum)
if (ret)
goto on_error;
 
-   ret = lan9303_port_phy_reg_wait_for_completion(chip);
+   ret = lan9303_indirect_phy_wait_for_completion(chip);
if (ret)
goto on_error;
 
@@ -299,8 +305,8 @@ static int lan9303_port_phy_reg_read(struct lan9303 *chip, 
int addr, int regnum)
return ret;
 }
 
-static int lan9303_phy_reg_write(struct lan9303 *chip, int addr, int regnum,
-unsigned int val)
+static int lan9303_indirect_phy_write(struct lan9303 *chip, int addr,
+ int regnum, u16 val)
 {
int ret;
u32 reg;
@@ -311,7 +317,7 @@ static int lan9303_phy_reg_write(struct lan9303 *chip, int 
addr, int regnum,
 
mutex_lock(&chip->indirect_mutex);
 
-   ret = lan9303_port_phy_reg_wait_for_completion(chip);
+   ret = lan9303_indirect_phy_wait_for_completion(chip);
if (ret)
goto on_error;
 
@@ -328,6 +334,11 @@ static int lan9303_phy_reg_write(struct lan9303 *chip, int 
addr, int regnum,
return ret;
 }
 
+const struct lan9303_phy_ops lan9303_indirect_phy_ops = {
+   .phy_read = lan9303_indirect_phy_read,
+   .phy_write = lan9303_indirect_phy_write,
+};
+
 static int lan9303_switch_wait_for_completion(struct lan9303 *chip)
 {
int ret, i;
@@ -427,14 +438,15 @@ static int lan9303_detect_phy_setup(struct lan9303 *chip)
 * Special reg 18 of phy 3 reads as 0x, if 'phy_addr_sel_strap' is 0
 * and the IDs are 0-1-2, else it contains something different from
 * 0x, which means 'phy_addr_sel_strap' is 1 and the IDs are 1-2-3.
+* 0x is returned for failed MDIO access.
 */
-   reg = lan9303_port_phy_reg_read(chip, 3, MII_LAN911X_SPECIAL_MODES);
+   reg = chip->ops->phy_read(chip, 3, MII_LAN911X_SPECIAL_MODES);
if (reg < 0) {
dev_err(chip->dev, "Failed to detect phy config: %d\n", reg);
return reg;
}
 
-   if (reg != 0)
+   if ((reg != 0) && (reg != 0x))
chip->phy_addr_sel_strap = 1;
else
chip->phy_addr_sel_strap = 0;
@@ -719,7 +731,7 @@ static int lan9303_phy_read(struct dsa_switch *ds, int phy, 
int regnum)
if (phy > phy_base + 2)
return -ENODEV;
 
-   return 
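
The ops-table idea behind struct lan9303_phy_ops can be shown with a self-contained mock: the bus-specific code picks one table (indirect access for I2C, direct access for MDIO) and the core only ever calls through chip->ops. The register file, the access counter and all mock_ names are made up for this sketch; the real indirect path issues a PMI command sequence instead of touching an array.

```c
#include <assert.h>
#include <stdint.h>

struct mock_chip;

/* Same shape as the struct lan9303_phy_ops described in the patch. */
struct mock_phy_ops {
	int (*phy_read)(struct mock_chip *chip, int phy, int regnum);
	int (*phy_write)(struct mock_chip *chip, int phy, int regnum,
			 uint16_t val);
};

struct mock_chip {
	const struct mock_phy_ops *ops;	/* chosen once by the bus probe */
	uint16_t phy_regs[4][32];	/* fake PHY register file */
	int indirect_accesses;		/* uses of the indirect path */
};

static int mock_indirect_read(struct mock_chip *chip, int phy, int regnum)
{
	chip->indirect_accesses++;	/* stands in for the PMI sequence */
	return chip->phy_regs[phy][regnum];
}

static int mock_indirect_write(struct mock_chip *chip, int phy, int regnum,
			       uint16_t val)
{
	chip->indirect_accesses++;
	chip->phy_regs[phy][regnum] = val;
	return 0;
}

static int mock_direct_read(struct mock_chip *chip, int phy, int regnum)
{
	return chip->phy_regs[phy][regnum];
}

static int mock_direct_write(struct mock_chip *chip, int phy, int regnum,
			     uint16_t val)
{
	chip->phy_regs[phy][regnum] = val;
	return 0;
}

const struct mock_phy_ops mock_indirect_phy_ops = {
	.phy_read = mock_indirect_read,
	.phy_write = mock_indirect_write,
};

const struct mock_phy_ops mock_direct_phy_ops = {
	.phy_read = mock_direct_read,
	.phy_write = mock_direct_write,
};

/* Core code dispatches through chip->ops and never picks a mode itself. */
int mock_phy_read(struct mock_chip *chip, int phy, int regnum)
{
	assert(phy >= 0 && phy < 4 && regnum >= 0 && regnum < 32);
	return chip->ops->phy_read(chip, phy, regnum);
}

int mock_phy_write(struct mock_chip *chip, int phy, int regnum, uint16_t val)
{
	assert(phy >= 0 && phy < 4 && regnum >= 0 && regnum < 32);
	return chip->ops->phy_write(chip, phy, regnum, val);
}

/* "Multiply register number by 4 to get address offset." */
unsigned int mock_reg_offset(unsigned int regnum)
{
	return regnum * 4;
}
```

Keeping the mode decision in one pointer assignment means the shared core needs no "if I2C ... else MDIO" branches.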

[PATCH net-next v2 03/10] net: dsa: lan9303: Refactor lan9303_enable_packet_processing()

2017-07-25 Thread Egil Hjelmeland
lan9303_enable_packet_processing, lan9303_disable_packet_processing()
Pass port number (0,1,2) as parameter instead of port offset.
Simplify accordingly.

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303-core.c | 66 --
 1 file changed, 32 insertions(+), 34 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index c2b53659f58f..0806a0684d55 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -159,9 +159,7 @@
 # define LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT1 (BIT(9) | BIT(8))
 # define LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT0 (BIT(1) | BIT(0))
 
-#define LAN9303_PORT_0_OFFSET 0x400
-#define LAN9303_PORT_1_OFFSET 0x800
-#define LAN9303_PORT_2_OFFSET 0xc00
+#define LAN9303_SWITCH_PORT_REG(port, reg0) (0x400 * (port) + (reg0))
 
 /* the built-in PHYs are of type LAN911X */
 #define MII_LAN911X_SPECIAL_MODES 0x12
@@ -457,24 +455,25 @@ static int lan9303_detect_phy_setup(struct lan9303 *chip)
return 0;
 }
 
-#define LAN9303_MAC_RX_CFG_OFFS (LAN9303_MAC_RX_CFG_0 - LAN9303_PORT_0_OFFSET)
-#define LAN9303_MAC_TX_CFG_OFFS (LAN9303_MAC_TX_CFG_0 - LAN9303_PORT_0_OFFSET)
-
 static int lan9303_disable_packet_processing(struct lan9303 *chip,
 unsigned int port)
 {
int ret;
 
/* disable RX, but keep register reset default values else */
-   ret = lan9303_write_switch_reg(chip, LAN9303_MAC_RX_CFG_OFFS + port,
-  LAN9303_MAC_RX_CFG_X_REJECT_MAC_TYPES);
+   ret = lan9303_write_switch_reg(
+   chip,
+   LAN9303_SWITCH_PORT_REG(port, LAN9303_MAC_RX_CFG_0),
+   LAN9303_MAC_RX_CFG_X_REJECT_MAC_TYPES);
if (ret)
return ret;
 
/* disable TX, but keep register reset default values else */
-   return lan9303_write_switch_reg(chip, LAN9303_MAC_TX_CFG_OFFS + port,
-   LAN9303_MAC_TX_CFG_X_TX_IFG_CONFIG_DEFAULT |
-   LAN9303_MAC_TX_CFG_X_TX_PAD_ENABLE);
+   return lan9303_write_switch_reg(
+   chip,
+   LAN9303_SWITCH_PORT_REG(port, LAN9303_MAC_TX_CFG_0),
+   LAN9303_MAC_TX_CFG_X_TX_IFG_CONFIG_DEFAULT |
+   LAN9303_MAC_TX_CFG_X_TX_PAD_ENABLE);
 }
 
 static int lan9303_enable_packet_processing(struct lan9303 *chip,
@@ -483,17 +482,21 @@ static int lan9303_enable_packet_processing(struct 
lan9303 *chip,
int ret;
 
/* enable RX and keep register reset default values else */
-   ret = lan9303_write_switch_reg(chip, LAN9303_MAC_RX_CFG_OFFS + port,
-  LAN9303_MAC_RX_CFG_X_REJECT_MAC_TYPES |
-  LAN9303_MAC_RX_CFG_X_RX_ENABLE);
+   ret = lan9303_write_switch_reg(
+   chip,
+   LAN9303_SWITCH_PORT_REG(port, LAN9303_MAC_RX_CFG_0),
+   LAN9303_MAC_RX_CFG_X_REJECT_MAC_TYPES |
+   LAN9303_MAC_RX_CFG_X_RX_ENABLE);
if (ret)
return ret;
 
/* enable TX and keep register reset default values else */
-   return lan9303_write_switch_reg(chip, LAN9303_MAC_TX_CFG_OFFS + port,
-   LAN9303_MAC_TX_CFG_X_TX_IFG_CONFIG_DEFAULT |
-   LAN9303_MAC_TX_CFG_X_TX_PAD_ENABLE |
-   LAN9303_MAC_TX_CFG_X_TX_ENABLE);
+   return lan9303_write_switch_reg(
+   chip,
+   LAN9303_SWITCH_PORT_REG(port, LAN9303_MAC_TX_CFG_0),
+   LAN9303_MAC_TX_CFG_X_TX_IFG_CONFIG_DEFAULT |
+   LAN9303_MAC_TX_CFG_X_TX_PAD_ENABLE |
+   LAN9303_MAC_TX_CFG_X_TX_ENABLE);
 }
 
 /* We want a special working switch:
@@ -555,12 +558,14 @@ static int lan9303_handle_reset(struct lan9303 *chip)
 /* stop processing packets for all ports */
 static int lan9303_disable_processing(struct lan9303 *chip)
 {
-   int ret;
+   int ret, p;
 
-   ret = lan9303_disable_packet_processing(chip, LAN9303_PORT_1_OFFSET);
-   if (ret)
-   return ret;
-   return lan9303_disable_packet_processing(chip, LAN9303_PORT_2_OFFSET);
+   for (p = 1; p <= 2; p++) {
+   ret = lan9303_disable_packet_processing(chip, p);
+   if (ret)
+   return ret;
+   }
+   return 0;
 }
 
 static int lan9303_check_device(struct lan9303 *chip)
@@ -696,7 +701,7 @@ static void lan9303_get_ethtool_stats(struct dsa_switch *ds, int port,
unsigned int u, poff;
int ret;
 
-   poff = port * 0x400;
+   poff = LAN9303_SWITCH_PORT_REG(port, 0);
 
for (u = 0; u < ARRAY_SIZE(lan9303_mib); u++) {
ret = lan9303_read_switch_reg(chip,
@@ -749,11 
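
As a rough illustration of the addressing behind LAN9303_SWITCH_PORT_REG()
in the hunks above: the removed defines place the three port register banks
at 0x400, 0x800 and 0xc00, so the port-N copy of a register can be found by
adding 0x400 * N to its port-0 address. The sketch below is plain user-space
C, not driver code. EXAMPLE_PORT0_REG and example_port_reg() are hypothetical
names, and 0x0401 is only an assumed example address.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the macro introduced by the patch. */
#define LAN9303_SWITCH_PORT_REG(port, reg0) (0x400 * (port) + (reg0))

/* Hypothetical port-0 register address, chosen only for illustration. */
#define EXAMPLE_PORT0_REG 0x0401

/* Return the address of the port-N copy of the example port-0 register. */
static inline uint32_t example_port_reg(unsigned int port)
{
	assert(port <= 2);	/* the switch has ports 0, 1 and 2 */
	return LAN9303_SWITCH_PORT_REG(port, EXAMPLE_PORT0_REG);
}
```

With this layout, passing a plain port number (1 or 2) replaces the old
per-port offset constants, which is what lets lan9303_disable_processing()
become a simple loop.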

[PATCH net-next v2 04/10] net: dsa: lan9303: Added adjust_link() method

2017-07-25 Thread Egil Hjelmeland
This makes the driver react to a device tree "fixed-link" declaration
on the CPU port:

- turn off autonegotiation
- force speed to 10 or 100 Mb/s
- force duplex mode
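
The three steps above amount to rewriting a few bits of the MII Basic Mode
Control Register (BMCR). A minimal user-space sketch follows; the bit values
match the BMCR_* definitions in linux/mii.h, while fixed_link_bmcr() is a
hypothetical helper and not part of the driver. Unlike the patch as posted,
the sketch also clears the speed and duplex bits before setting them, which
is an assumption about the intended behaviour for 10 Mb/s and half duplex.

```c
#include <assert.h>
#include <stdint.h>

/* Bits of the MII Basic Mode Control Register (values as in linux/mii.h). */
#define BMCR_FULLDPLX	0x0100
#define BMCR_ANENABLE	0x1000
#define BMCR_SPEED100	0x2000

/* Hypothetical helper: derive the BMCR value for a fixed link. */
static uint16_t fixed_link_bmcr(uint16_t ctl, int speed_mbps, int full_duplex)
{
	assert(speed_mbps == 10 || speed_mbps == 100);

	/* Turn off autonegotiation and clear the bits that are forced below. */
	ctl &= ~(BMCR_ANENABLE | BMCR_SPEED100 | BMCR_FULLDPLX);
	if (speed_mbps == 100)
		ctl |= BMCR_SPEED100;
	if (full_duplex)
		ctl |= BMCR_FULLDPLX;
	return ctl;
}
```

Bits outside these three (for example the reset bit) are passed through
unchanged, so the helper can be applied to a value read back from the PHY.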

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303-core.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 0806a0684d55..be6d78f45a5f 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "lan9303.h"
 
@@ -746,6 +747,37 @@ static int lan9303_phy_write(struct dsa_switch *ds, int phy, int regnum,
return chip->ops->phy_write(chip, phy, regnum, val);
 }
 
+static void lan9303_adjust_link(struct dsa_switch *ds, int port,
+   struct phy_device *phydev)
+{
+   struct lan9303 *chip = ds->priv;
+
+   int ctl, res;
+
+   ctl = lan9303_phy_read(ds, port, MII_BMCR);
+
+   if (!phy_is_pseudo_fixed_link(phydev))
+   return;
+
+   ctl &= ~BMCR_ANENABLE;
+   if (phydev->speed == SPEED_100)
+   ctl |= BMCR_SPEED100;
+
+   if (phydev->duplex == DUPLEX_FULL)
+   ctl |= BMCR_FULLDPLX;
+
+   res = lan9303_phy_write(ds, port, MII_BMCR, ctl);
+
+   if (port == chip->phy_addr_sel_strap) {
+   /* Virtual Phy: Remove Turbo 200Mbit mode */
+   lan9303_read(chip->regmap, LAN9303_VIRT_SPECIAL_CTRL, &ctl);
+
+   ctl &= ~(1 << 10); /* TURBO bit */
+   res = regmap_write(chip->regmap,
+   LAN9303_VIRT_SPECIAL_CTRL, ctl);
+   }
+}
+
 static int lan9303_port_enable(struct dsa_switch *ds, int port,
   struct phy_device *phy)
 {
@@ -789,6 +821,7 @@ static struct dsa_switch_ops lan9303_switch_ops = {
.get_strings = lan9303_get_strings,
.phy_read = lan9303_phy_read,
.phy_write = lan9303_phy_write,
+   .adjust_link = lan9303_adjust_link,
.get_ethtool_stats = lan9303_get_ethtool_stats,
.get_sset_count = lan9303_get_sset_count,
.port_enable = lan9303_port_enable,
-- 
2.11.0

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next v2 00/10] net: dsa: lan9303: unicast offload, fdb,mdb,STP

2017-07-25 Thread Egil Hjelmeland
This series extends the LAN9303 3-port switch DSA driver. Highlights:
 - Make the MDIO interface work
 - Bridging: Unicast offload
 - Bridging: Added fdb/mdb handling
 - Bridging: STP support
 - Documentation


Changes v1 -> v2:
- Sorted out emailing issues (threading and date), and sent from a private
  account in order to avoid the company disclaimer in emails.
- Removed the last three "work around" patches, but first moved one doc
  paragraph to the documentation patch.

Egil Hjelmeland (10):
  net: dsa: lan9303: Fixed MDIO interface
  net: dsa: lan9303: Do not disable/enable switch fabric port 0 at
startup
  net: dsa: lan9303: Refactor lan9303_enable_packet_processing()
  net: dsa: lan9303: Added adjust_link() method
  net: dsa: added dsa_net_device_to_dsa_port()
  net: dsa: lan9303: added sysfs node swe_bcst_throt
  net: dsa: lan9303: Added basic offloading of unicast traffic
  net: dsa: lan9303: Added ALR/fdb/mdb handling
  net: dsa: lan9303: Added Documentation/networking/dsa/lan9303.txt
  net: dsa: lan9303: Only allocate 3 ports

 Documentation/networking/dsa/lan9303.txt |  63 +++
 drivers/net/dsa/lan9303-core.c   | 709 ---
 drivers/net/dsa/lan9303.h|  23 +
 drivers/net/dsa/lan9303_i2c.c|   2 +
 drivers/net/dsa/lan9303_mdio.c   |  34 ++
 include/net/dsa.h|   1 +
 net/dsa/slave.c  |  10 +
 7 files changed, 772 insertions(+), 70 deletions(-)
 create mode 100644 Documentation/networking/dsa/lan9303.txt

-- 
2.11.0



[PATCH 12/12] ima: added Documentation/security/IMA-digest-lists.txt

2017-07-25 Thread Roberto Sassu
This patch adds the documentation of the new IMA feature, to load
and measure file digest lists.

Signed-off-by: Roberto Sassu 
---
 Documentation/security/IMA-digest-lists.txt | 150 
 1 file changed, 150 insertions(+)
 create mode 100644 Documentation/security/IMA-digest-lists.txt

diff --git a/Documentation/security/IMA-digest-lists.txt b/Documentation/security/IMA-digest-lists.txt
new file mode 100644
index 000..f9eed21
--- /dev/null
+++ b/Documentation/security/IMA-digest-lists.txt
@@ -0,0 +1,150 @@
+File Digest Lists
+
+ INTRODUCTION 
+
+IMA, for each file matching policy rules, calculates a digest, creates
+a new entry in the measurement list and extends a TPM PCR with the digest
+of entry data. The last step causes a noticeable performance reduction.
+
+Since systems likely access the same files, repeating the above tasks at
+every boot can be avoided by replacing individual measurements of likely
+accessed files with only one measurement of their digests: the advantage
+is that system performance improves significantly due to fewer PCR
+extend operations; on the other hand, the information about exactly which
+files have been accessed, and in which sequence, is lost.
+
+If this new measurement reports only good digests (e.g. those of
+files included in a Linux distribution), and if verifiers only check
+that a system executed good software and didn't access malicious data,
+the disadvantages reported earlier would be acceptable.
+
+The Trusted Computing paradigm measure & load is still respected by IMA
+with the proposed optimization. If a file being accessed is not in a
+measured digest list, a measurement will be recorded as before. If it is,
+the list has already been measured, and the verifier must assume that
+files with digest in the list have been accessed.
+
+Measuring digest lists gives the following benefits:
+
+- boot time reduction
+  For a minimal Linux installation with 1400 measurements, the boot time
+  decreases from 1 minute 30 seconds to 15 seconds, after loading to IMA
+  the digest of all files packaged by the distribution (32000). The new
+  list contains 92 entries. Without IMA, the boot time is 8.5 seconds.
+
+- lower network and CPU requirements for remote attestation
+  With the IMA optimization, both the measurement and digest lists
+  must be verified for a complete evaluation. However, since the lists
+  are fixed, they could be sent to and checked by the verifier only once.
+  Then, during a remote attestation, the only remaining task is to verify
+  the short measurement list.
+
+- signature-based remote attestation
+  A digest list signature can be used as proof of provenance for the
+  files whose digests are in the list. Then, if verifiers trust the signer
+  and only check provenance, remote attestation verification would simply
+  consist of checking digest list signatures and that the measurement
+  list only contains list metadata digests (reference measurement databases
+  would no longer be required). An example of a signed digest list
+  that can be parsed with this patch set is the RPM package header.
+
+Digest lists are loaded in two stages by IMA through the new securityfs
+interface called 'digest_lists'. Users supply metadata for the digest
+lists they want to load: path, format, digest, signature and algorithm
+of the digest.
+
+Then, after the metadata digest is added to the measurement list, IMA
+reads the digest lists at the path specified and loads the digests in
+a hash table (digest lists are not measured, since their digest is already
+included in the metadata). With metadata measurement instead of digest list
+measurement, it is possible to avoid a performance reduction that would
+occur by measuring many digest lists (e.g. RPM headers) individually.
+If, alternatively, digest lists are loaded together, their signature
+cannot be verified.
+
+Lastly, when a file is accessed, IMA searches for the calculated digest in
+the hash table. A new entry is added to the measurement list only if the
+digest is not found.
+
+
+
+ FORMAT 
+
+The format of digest list metadata is:
+
+algo[2] digest_len[4] digest[digest_len]
+signature_len[4] signature[signature_len]
+path_len[4] path[path_len]
+ref_id_len[4] ref_id[ref_id_len]
+list_type_len[4] list_type[list_type_len]
+
+algo, list_type and the *_len fields are little endian.
+
+
+algo values are defined in include/uapi/linux/hash_info.h. The algorithm
+in the list metadata must be the same as ima_hash_algo (the algorithm used
+by IMA to calculate the file digest).
+
+list type values:
+
+0: compact digest list
+1: RPM package header
+
+
+The format of the compact digest list is:
+
+entry_id[2] count[4] data_len[4]
+data[data_len]
+[...]
+entry_id[2] count[4] data_len[4]
+data[data_len]
+
+entry_id, count and data_len are little endian.
+
+At the moment, entry_id can have value 0, which 

[PATCH 11/12] ima: don't report measurements if digests are included in the loaded lists

2017-07-25 Thread Roberto Sassu
Don't report measurements if the file digest has been included in
an uploaded digest list.

The advantage of this solution is that the boot time overhead, when
a TPM is available, is very small because a PCR is extended only
for unknown files. The disadvantage is that verifiers no longer know
which files are accessed, or when (they must assume that the worst
case happened, i.e. all files have been accessed).

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima_main.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index c329549..e289b7c 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -253,6 +253,14 @@ static int process_measurement(struct file *file, char *buf, loff_t size,
goto out_digsig;
}
 
+   if (!ima_disable_digest_check) {
+   if (ima_lookup_loaded_digest(iint->ima_hash->digest)) {
+   action &= ~IMA_MEASURE;
+   iint->flags |= IMA_MEASURED;
+   iint->measured_pcrs |= (0x1 << pcr);
+   }
+   }
+
if (!pathbuf)   /* ima_rdwr_violation possibly pre-fetched */
pathname = ima_d_path(&file->f_path, &pathbuf, filename);
 
-- 
2.9.3



[PATCH 10/12] ima: disable digest lookup if digest lists are not measured

2017-07-25 Thread Roberto Sassu
Loading digest lists affects the behavior of IMA, as files whose digest
has been uploaded will not be displayed in the measurement list.
If the digest lists loading event is not reported, verifiers would believe
that the files with uploaded digests have not been accessed.

To prevent this, the DIGEST_LIST_CHECK hook has been defined and a new rule
to measure files accessed by the new hook has been added to the default
policy. If the currently loaded policy does not contain that rule,
digest lookup is disabled.

Digest lookup is also disabled if CONFIG_IMA_DIGEST_LIST is not defined.
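
The rule described above can be sketched in user-space C as follows: once a
digest list is read while the policy does not ask for it to be measured,
lookup stays disabled and every file is measured as usual. All names here
(EX_MEASURE, struct ex_ima_state and the two functions) are hypothetical
stand-ins for illustration, not the kernel's own symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_MEASURE 0x1	/* hypothetical stand-in for IMA_MEASURE */

struct ex_ima_state {
	bool digest_lookup_disabled;	/* never cleared once set */
};

/* Called when a digest list is read; 'action' is what the policy requests. */
static void ex_on_digest_list_read(struct ex_ima_state *st, unsigned int action)
{
	assert(st != NULL);
	if (!(action & EX_MEASURE))
		st->digest_lookup_disabled = true;
}

/* A file measurement may be skipped only while lookup is still enabled. */
static bool ex_skip_measurement(const struct ex_ima_state *st,
				bool digest_in_loaded_list)
{
	assert(st != NULL);
	return !st->digest_lookup_disabled && digest_in_loaded_list;
}
```

The one-way flag is the important part: an unmeasured digest list can never
cause measurements to be silently dropped afterwards.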

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima.h|  1 +
 security/integrity/ima/ima_main.c   | 15 ++-
 security/integrity/ima/ima_policy.c |  1 +
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 77dd4d0..2a558ee 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -199,6 +199,7 @@ static inline unsigned long ima_hash_key(u8 *digest)
hook(KEXEC_KERNEL_CHECK)\
hook(KEXEC_INITRAMFS_CHECK) \
hook(POLICY_CHECK)  \
+   hook(DIGEST_LIST_CHECK) \
hook(MAX_CHECK)
 #define __ima_hook_enumify(ENUM)   ENUM,
 
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 2aebb79..c329549 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -29,6 +29,12 @@
 
 int ima_initialized;
 
+#ifdef CONFIG_IMA_DIGEST_LIST
+static int ima_disable_digest_check;
+#else
+static int ima_disable_digest_check = 1;
+#endif
+
 #ifdef CONFIG_IMA_APPRAISE
 int ima_appraise = IMA_APPRAISE_ENFORCE;
 #else
@@ -171,6 +177,9 @@ static int process_measurement(struct file *file, char *buf, loff_t size,
bool violation_check;
enum hash_algo hash_algo;
 
+   if (func == DIGEST_LIST_CHECK && !ima_policy_flag)
+   ima_disable_digest_check = 1;
+
if (!ima_policy_flag || !S_ISREG(inode->i_mode))
return 0;
 
@@ -181,6 +190,9 @@ static int process_measurement(struct file *file, char *buf, loff_t size,
action = ima_get_action(inode, mask, func, &pcr);
violation_check = ((func == FILE_CHECK || func == MMAP_CHECK) &&
   (ima_policy_flag & IMA_MEASURE));
+   if (func == DIGEST_LIST_CHECK && !(action & IMA_MEASURE))
+   ima_disable_digest_check = 1;
+
if (!action && !violation_check)
return 0;
 
@@ -375,7 +387,8 @@ static int read_idmap[READING_MAX_ID] = {
[READING_MODULE] = MODULE_CHECK,
[READING_KEXEC_IMAGE] = KEXEC_KERNEL_CHECK,
[READING_KEXEC_INITRAMFS] = KEXEC_INITRAMFS_CHECK,
-   [READING_POLICY] = POLICY_CHECK
+   [READING_POLICY] = POLICY_CHECK,
+   [READING_DIGEST_LIST] = DIGEST_LIST_CHECK
 };
 
 /**
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 95209a5..b5c004d 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -127,6 +127,7 @@ static struct ima_rule_entry default_measurement_rules[] __ro_after_init = {
{.action = MEASURE, .func = MODULE_CHECK, .flags = IMA_FUNC},
{.action = MEASURE, .func = FIRMWARE_CHECK, .flags = IMA_FUNC},
{.action = MEASURE, .func = POLICY_CHECK, .flags = IMA_FUNC},
+   {.action = MEASURE, .func = DIGEST_LIST_CHECK, .flags = IMA_FUNC},
 };
 
 static struct ima_rule_entry default_appraise_rules[] __ro_after_init = {
-- 
2.9.3



[PATCH 09/12] ima: introduce securityfs interfaces for digest lists

2017-07-25 Thread Roberto Sassu
This patch introduces the file 'digest_lists' in the securityfs
filesystem, to load digest list metadata. IMA parses the metadata
and loads the digest lists from the path provided.

It also introduces 'digests_count', to show the number of digests
stored in the digest hash table.

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima_fs.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index ad3d674..08174c1 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -34,11 +34,15 @@ static struct dentry *ascii_runtime_measurements;
 static struct dentry *runtime_measurements_count;
 static struct dentry *violations;
 static struct dentry *ima_policy;
+static struct dentry *digest_lists;
+static struct dentry *digests_count;
 
 static enum kernel_read_file_id ima_get_file_id(struct dentry *dentry)
 {
if (dentry == ima_policy)
return READING_POLICY;
+   else if (dentry == digest_lists)
+   return READING_DIGEST_LIST;
 
return READING_UNKNOWN;
 }
@@ -66,6 +70,8 @@ static ssize_t ima_show_htable_value(struct file *filp, char __user *buf,
val = &ima_htable.violations;
else if (filp->f_path.dentry == runtime_measurements_count)
val = &ima_htable.len;
+   else if (filp->f_path.dentry == digests_count)
+   val = &ima_digests_htable.len;
 
len = scnprintf(tmpbuf, TMPBUFLEN, "%li\n", atomic_long_read(val));
return simple_read_from_buffer(buf, count, ppos, tmpbuf, len);
@@ -301,6 +307,9 @@ static ssize_t ima_read_file(char *path, enum kernel_read_file_id file_id)
 
pr_debug("rule: %s\n", p);
rc = ima_parse_add_rule(p);
+   } else if (file_id == READING_DIGEST_LIST) {
+   rc = ima_parse_digest_list_metadata(size, datap);
+   datap += rc;
}
if (rc < 0)
break;
@@ -510,8 +519,22 @@ int __init ima_fs_init(void)
if (IS_ERR(ima_policy))
goto out;
 
+#ifdef CONFIG_IMA_DIGEST_LIST
+   digest_lists = securityfs_create_file("digest_lists", S_IWUSR, ima_dir,
+ NULL, &ima_data_upload_ops);
+   if (IS_ERR(digest_lists))
+   goto out;
+
+   digests_count = securityfs_create_file("digests_count",
+  S_IRUSR | S_IRGRP, ima_dir,
+  NULL, &ima_htable_value_ops);
+   if (IS_ERR(digests_count))
+   goto out;
+#endif
return 0;
 out:
+   securityfs_remove(digests_count);
+   securityfs_remove(digest_lists);
securityfs_remove(violations);
securityfs_remove(runtime_measurements_count);
securityfs_remove(ascii_runtime_measurements);
-- 
2.9.3



[PATCH 06/12] ima: added parser of digest lists metadata

2017-07-25 Thread Roberto Sassu
Userspace applications will be able to load digest lists by supplying
their metadata.

Digest list metadata are:

- DATA_ALGO: algorithm of the digests to be uploaded
- DATA_DIGEST: digest of the file containing the digest list
- DATA_SIGNATURE: signature of the file containing the digest list
- DATA_FILE_PATH: pathname
- DATA_REF_ID: reference ID of the digest list
- DATA_TYPE: type of digest list

The new function ima_parse_digest_list_metadata() parses the metadata
and loads each file individually. Then, it parses the data according
to the data type specified.
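
The documentation patch in this series describes most metadata fields as a
4-byte little-endian length followed by that many bytes of data (for
example digest_len[4] digest[digest_len]). A minimal user-space sketch of
reading one such field with bounds checking is shown here. ex_get_le32()
and ex_read_field() are hypothetical helpers for illustration; the kernel
parser in this patch is structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value byte by byte (host-endian independent). */
static uint32_t ex_get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: read one "len[4] data[len]" field starting at *pos.
 * On success, store the payload location and length, advance *pos and
 * return 0. Return -1 if the field would run past the end of the buffer.
 */
static int ex_read_field(const uint8_t *buf, size_t size, size_t *pos,
			 const uint8_t **data, uint32_t *len)
{
	assert(buf != NULL && pos != NULL && data != NULL && len != NULL);

	if (*pos > size || size - *pos < 4)
		return -1;		/* no room for the length prefix */
	*len = ex_get_le32(buf + *pos);
	if (size - *pos - 4 < *len)
		return -1;		/* payload would be truncated */
	*data = buf + *pos + 4;
	*pos += 4 + (size_t)*len;
	return 0;
}
```

Calling ex_read_field() repeatedly walks the fields in order; checking the
remaining size before every read keeps a malformed length from moving the
cursor outside the buffer.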

Since digest lists are measured, their digest is added to the hash table
so that IMA does not create a measurement entry for them (which would
affect the performance). The only measurement entry created will be
for the metadata.

Signed-off-by: Roberto Sassu 
---
 include/linux/fs.h   |   1 +
 security/integrity/ima/Kconfig   |  11 
 security/integrity/ima/Makefile  |   1 +
 security/integrity/ima/ima.h |   8 +++
 security/integrity/ima/ima_digest_list.c | 105 +++
 5 files changed, 126 insertions(+)
 create mode 100644 security/integrity/ima/ima_digest_list.c

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e1fd5d..2eb6e7c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2751,6 +2751,7 @@ extern int do_pipe_flags(int *, int);
id(KEXEC_IMAGE, kexec-image)\
id(KEXEC_INITRAMFS, kexec-initramfs)\
id(POLICY, security-policy) \
+   id(DIGEST_LIST, security-digest-list)   \
id(MAX_ID, )
 
 #define __fid_enumify(ENUM, dummy) READING_ ## ENUM,
diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index 35ef693..8965dcc 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -227,3 +227,14 @@ config IMA_APPRAISE_SIGNED_INIT
default n
help
   This option requires user-space init to be signed.
+
+config IMA_DIGEST_LIST
+   bool "Measure files depending on uploaded digest lists"
+   depends on IMA
+   default n
+   help
+  This option allows users to load digest lists. If a measured
+  file has the same digest as one from the loaded lists, IMA will
+  not create a new measurement entry. A measurement entry will
+  be created only when digest lists are loaded (this entry
+  contains the digest of digest lists metadata).
diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
index 29f198b..00dbe3a 100644
--- a/security/integrity/ima/Makefile
+++ b/security/integrity/ima/Makefile
@@ -9,4 +9,5 @@ ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
 ima_policy.o ima_template.o ima_template_lib.o
 ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
 ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o
+ima-$(CONFIG_IMA_DIGEST_LIST) += ima_digest_list.o
 obj-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index a0c6808..77dd4d0 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -157,6 +157,14 @@ int ima_restore_measurement_entry(struct ima_template_entry *entry);
 int ima_restore_measurement_list(loff_t bufsize, void *buf);
 struct ima_digest *ima_lookup_loaded_digest(u8 *digest);
 int ima_add_digest_data_entry(u8 *digest);
+#ifdef CONFIG_IMA_DIGEST_LIST
+ssize_t ima_parse_digest_list_metadata(loff_t size, void *buf);
+#else
+static inline ssize_t ima_parse_digest_list_metadata(loff_t size, void *buf)
+{
+   return -EOPNOTSUPP;
+}
+#endif
 int ima_measurements_show(struct seq_file *m, void *v);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
diff --git a/security/integrity/ima/ima_digest_list.c b/security/integrity/ima/ima_digest_list.c
new file mode 100644
index 000..3e1ff69b
--- /dev/null
+++ b/security/integrity/ima/ima_digest_list.c
@@ -0,0 +1,105 @@
+/*
+ * Copyright (C) 2017 Huawei Technologies Co. Ltd.
+ *
+ * Author: Roberto Sassu 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ *
+ * File: ima_digest_list.c
+ *  Functions to manage digest lists.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+
+#include "ima.h"
+#include "ima_template_lib.h"
+
+enum digest_metadata_fields {DATA_ALGO, DATA_DIGEST, DATA_SIGNATURE,
+DATA_FILE_PATH, DATA_REF_ID, DATA_TYPE,
+DATA__LAST};
+
+static int ima_parse_digest_list_data(struct ima_field_data *data)
+{
+   void *digest_list;
+   loff_t digest_list_size;
+   u16 data_algo = le16_to_cpu(*(u16 *)data[DATA_ALGO].data);
+   u16 data_type = 

[PATCH 08/12] ima: added parser for RPM data type

2017-07-25 Thread Roberto Sassu
This patch introduces a parser for RPM packages. It extracts the digests
from the RPMTAG_FILEDIGESTS header section and converts them to binary data
before adding them to the hash table.

The advantage of this data type is that verifiers can determine who
produced that data, as headers are signed by Linux distribution vendors.
RPM header signatures can be provided as digest list metadata.
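
The conversion step mentioned above turns each hex string from the header
into raw digest bytes. The patch calls the kernel's hex2bin() for this and
does not check its return value. A user-space sketch of an equivalent
conversion that rejects invalid characters might look like this; ex_hex_val()
and ex_hex2bin() are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map one hex character to its value, or -1 if it is not a hex digit. */
static int ex_hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Convert 2 * count hex characters from src into count bytes in dst.
 * Returns 0 on success, -1 at the first invalid character (including an
 * early NUL terminator, so a short string is never read past its end).
 */
static int ex_hex2bin(uint8_t *dst, const char *src, size_t count)
{
	assert(dst != NULL && src != NULL);

	for (size_t i = 0; i < count; i++) {
		int hi = ex_hex_val(src[2 * i]);
		int lo;

		if (hi < 0)
			return -1;
		lo = ex_hex_val(src[2 * i + 1]);
		if (lo < 0)
			return -1;
		dst[i] = (uint8_t)((hi << 4) | lo);
	}
	return 0;
}
```

Checking the result before adding the digest to the hash table avoids
storing partly converted data when a header entry is malformed.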

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima_digest_list.c | 84 +++-
 1 file changed, 83 insertions(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima_digest_list.c b/security/integrity/ima/ima_digest_list.c
index c1ef79a..11ee77e 100644
--- a/security/integrity/ima/ima_digest_list.c
+++ b/security/integrity/ima/ima_digest_list.c
@@ -19,11 +19,13 @@
 #include "ima.h"
 #include "ima_template_lib.h"
 
+#define RPMTAG_FILEDIGESTS 1035
+
 enum digest_metadata_fields {DATA_ALGO, DATA_DIGEST, DATA_SIGNATURE,
 DATA_FILE_PATH, DATA_REF_ID, DATA_TYPE,
 DATA__LAST};
 
-enum digest_data_types {DATA_TYPE_COMPACT_LIST};
+enum digest_data_types {DATA_TYPE_COMPACT_LIST, DATA_TYPE_RPM};
 
 enum compact_list_entry_ids {COMPACT_LIST_ID_DIGEST};
 
@@ -33,6 +35,20 @@ struct compact_list_hdr {
u32 datalen;
 } __packed;
 
+struct rpm_hdr {
+   u32 magic;
+   u32 reserved;
+   u32 tags;
+   u32 datasize;
+} __packed;
+
+struct rpm_entryinfo {
+   int32_t tag;
+   u32 type;
+   int32_t offset;
+   u32 count;
+} __packed;
+
 static int ima_parse_compact_list(loff_t size, void *buf)
 {
void *bufp = buf, *bufendp = buf + size;
@@ -80,6 +96,69 @@ static int ima_parse_compact_list(loff_t size, void *buf)
return 0;
 }
 
+static int ima_parse_rpm(loff_t size, void *buf)
+{
+   void *bufp = buf, *bufendp = buf + size;
+   struct rpm_hdr *hdr = bufp;
+   u32 tags = be32_to_cpu(hdr->tags);
+   struct rpm_entryinfo *entry;
+   void *datap = bufp + sizeof(*hdr) + tags * sizeof(struct rpm_entryinfo);
+   int digest_len = hash_digest_size[ima_hash_algo];
+   u8 digest[digest_len];
+   int ret, i, j;
+
+   const unsigned char rpm_header_magic[8] = {
+   0x8e, 0xad, 0xe8, 0x01, 0x00, 0x00, 0x00, 0x00
+   };
+
+   if (size < sizeof(*hdr)) {
+   pr_err("Missing RPM header\n");
+   return -EINVAL;
+   }
+
+   if (memcmp(bufp, rpm_header_magic, sizeof(rpm_header_magic))) {
+   pr_err("Invalid RPM header\n");
+   return -EINVAL;
+   }
+
+   bufp += sizeof(*hdr);
+
+   for (i = 0; i < tags && (bufp + sizeof(*entry)) <= bufendp;
+i++, bufp += sizeof(*entry)) {
+   entry = bufp;
+
+   if (be32_to_cpu(entry->tag) != RPMTAG_FILEDIGESTS)
+   continue;
+
+   datap += be32_to_cpu(entry->offset);
+
+   for (j = 0; j < be32_to_cpu(entry->count) &&
+datap < bufendp; j++) {
+   if (strlen(datap) == 0) {
+   datap++;
+   continue;
+   }
+
+   if (datap + digest_len * 2 + 1 > bufendp) {
+   pr_err("RPM header read at invalid offset\n");
+   return -EINVAL;
+   }
+
+   hex2bin(digest, datap, digest_len);
+
+   ret = ima_add_digest_data_entry(digest);
+   if (ret < 0 && ret != -EEXIST)
+   return ret;
+
+   datap += digest_len * 2 + 1;
+   }
+
+   break;
+   }
+
+   return 0;
+}
+
 static int ima_parse_digest_list_data(struct ima_field_data *data)
 {
void *digest_list;
@@ -107,6 +186,9 @@ static int ima_parse_digest_list_data(struct ima_field_data *data)
case DATA_TYPE_COMPACT_LIST:
ret = ima_parse_compact_list(digest_list_size, digest_list);
break;
+   case DATA_TYPE_RPM:
+   ret = ima_parse_rpm(digest_list_size, digest_list);
+   break;
default:
pr_err("Parser for data type %d not implemented\n", data_type);
ret = -EINVAL;
-- 
2.9.3



[PATCH 00/12] ima: measure digest lists instead of individual files

2017-07-25 Thread Roberto Sassu
This patch set applies on top of kernel v4.13-rc2.

IMA, for each file matching policy rules, calculates a digest, creates
a new entry in the measurement list and extends a TPM PCR with the digest
of entry data. The last step causes a noticeable performance reduction.

Since systems likely access the same files, repeating the above tasks at
every boot can be avoided by replacing individual measurements of likely
accessed files with only one measurement of their digests: the advantage
is that system performance improves significantly due to fewer PCR
extend operations; on the other hand, the information about exactly which
files have been accessed, and in which sequence, is lost.

If this new measurement reports only good digests (e.g. those of
files included in a Linux distribution), and if verifiers only check
that a system executed good software and didn't access malicious data,
the disadvantages reported earlier would be acceptable.

The Trusted Computing paradigm measure & load is still respected by IMA
with the proposed optimization. If a file being accessed is not in a
measured digest list, a measurement will be recorded as before. If it is,
the list has already been measured, and the verifier must assume that
files with digest in the list have been accessed.

Measuring digest lists gives the following benefits:

- boot time reduction
  For a minimal Linux installation with 1400 measurements, the boot time
  decreases from 1 minute 30 seconds to 15 seconds, after loading to IMA
  the digest of all files packaged by the distribution (32000). The new
  list contains 92 entries. Without IMA, the boot time is 8.5 seconds.

- lower network and CPU requirements for remote attestation
  With the IMA optimization, both the measurement and digest lists
  must be verified for a complete evaluation. However, since the lists
  are fixed, they could be sent to and checked by the verifier only once.
  Then, during a remote attestation, the only remaining task is to verify
  the short measurement list.

- signature-based remote attestation
  A digest list signature can be used as proof of provenance for the
  files whose digests are in the list. Then, if verifiers trust the signer
  and only check provenance, remote attestation verification would simply
  consist of checking digest list signatures and that the measurement
  list only contains list metadata digests (reference measurement databases
  would no longer be required). An example of a signed digest list
  that can be parsed with this patch set is the RPM package header.

Digest lists are loaded in two stages by IMA through the new securityfs
interface called 'digest_lists'. Users supply metadata for the digest
lists they want to load: path, format, digest, signature and algorithm
of the digest.

Then, after the metadata digest is added to the measurement list, IMA
reads the digest lists at the path specified and loads the digests in
a hash table (digest lists are not measured, since their digest is already
included in the metadata). With metadata measurement instead of digest list
measurement, it is possible to avoid a performance reduction that would
occur by measuring many digest lists (e.g. RPM headers) individually.
If, alternatively, digest lists are loaded together, their signature
cannot be verified.

Lastly, when a file is accessed, IMA searches for the calculated digest in
the hash table. A new entry is added to the measurement list only if the
digest is not found.


Roberto Sassu (12):
  ima: generalize ima_read_policy()
  ima: generalize ima_write_policy()
  ima: generalize policy file operations
  ima: use ima_show_htable_value to show hash table data
  ima: add functions to manage digest lists
  ima: added parser of digest lists metadata
  ima: added parser for compact digest list
  ima: added parser for RPM data type
  ima: introduce securityfs interfaces for digest lists
  ima: disable digest lookup if digest lists are not measured
  ima: don't report measurements if digests are included in the loaded
lists
  ima: added Documentation/security/IMA-digest-lists.txt

 Documentation/security/IMA-digest-lists.txt | 150 +
 include/linux/fs.h  |   1 +
 security/integrity/ima/Kconfig  |  11 ++
 security/integrity/ima/Makefile |   1 +
 security/integrity/ima/ima.h|  17 ++
 security/integrity/ima/ima_digest_list.c| 247 
 security/integrity/ima/ima_fs.c | 178 
 security/integrity/ima/ima_main.c   |  23 ++-
 security/integrity/ima/ima_policy.c |   1 +
 security/integrity/ima/ima_queue.c  |  39 +
 10 files changed, 602 insertions(+), 66 deletions(-)
 create mode 100644 Documentation/security/IMA-digest-lists.txt
 create mode 100644 security/integrity/ima/ima_digest_list.c

-- 
2.9.3


[PATCH 05/12] ima: add functions to manage digest lists

2017-07-25 Thread Roberto Sassu
This patch first introduces a new structure called ima_digest, which will
contain a digest parsed from the digest list. It has been preferred to
ima_queue_entry, as the existing structure includes an additional member
(a list head), which is not necessary for digest lookup.

Then, this patch introduces functions to look up a digest in, and add a
digest to, a hash table. These functions will be used by the parsers.
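
As a rough illustration of the idea above, here is a userspace sketch of a
chained hash table whose entries store the digest inline in a flexible
array member. It is not the kernel code: the names (digest_entry,
lookup_digest, add_digest), the fixed DIGEST_LEN, the table size and the
first-byte key function are all assumptions made for the example, and
there is no RCU or locking.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DIGEST_LEN 20u          /* assumption: SHA-1 sized digests */
#define TABLE_SIZE 512u         /* assumption: bucket count for the sketch */

/* Stand-in for struct ima_digest: one link plus the digest stored inline. */
struct digest_entry {
	struct digest_entry *next;
	unsigned char digest[];   /* flexible array member */
};

static struct digest_entry *table[TABLE_SIZE];
static long table_len;

/* Hypothetical key function: the first digest byte selects the bucket. */
static unsigned int digest_key(const unsigned char *digest)
{
	return digest[0] % TABLE_SIZE;
}

/* Return the matching entry, or NULL when the digest is not loaded. */
struct digest_entry *lookup_digest(const unsigned char *digest)
{
	struct digest_entry *d;

	assert(digest != NULL);
	for (d = table[digest_key(digest)]; d != NULL; d = d->next)
		if (memcmp(d->digest, digest, DIGEST_LEN) == 0)
			return d;
	return NULL;
}

/* Add a digest once; duplicates are reported with -EEXIST. */
int add_digest(const unsigned char *digest)
{
	unsigned int key = digest_key(digest);
	struct digest_entry *d;

	if (lookup_digest(digest) != NULL)
		return -EEXIST;

	/* One allocation covers the header and the inline digest bytes. */
	d = malloc(sizeof(*d) + DIGEST_LEN);
	if (d == NULL)
		return -ENOMEM;

	memcpy(d->digest, digest, DIGEST_LEN);
	d->next = table[key];
	table[key] = d;
	table_len++;
	return 0;
}
```

Because the entry carries only a link and the digest bytes, each loaded
digest costs a single small allocation.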

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima.h   |  8 
 security/integrity/ima/ima_queue.c | 39 ++
 2 files changed, 47 insertions(+)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index d52b487..a0c6808 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -107,6 +107,11 @@ struct ima_queue_entry {
 };
 extern struct list_head ima_measurements;  /* list of all measurements */
 
+struct ima_digest {
+   struct hlist_node hnext;
+   u8 digest[0];
+};
+
 /* Some details preceding the binary serialized measurement list */
 struct ima_kexec_hdr {
u16 version;
@@ -150,6 +155,8 @@ void ima_print_digest(struct seq_file *m, u8 *digest, u32 size);
 struct ima_template_desc *ima_template_desc_current(void);
 int ima_restore_measurement_entry(struct ima_template_entry *entry);
 int ima_restore_measurement_list(loff_t bufsize, void *buf);
+struct ima_digest *ima_lookup_loaded_digest(u8 *digest);
+int ima_add_digest_data_entry(u8 *digest);
 int ima_measurements_show(struct seq_file *m, void *v);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
@@ -166,6 +173,7 @@ struct ima_h_table {
struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
 };
 extern struct ima_h_table ima_htable;
+extern struct ima_h_table ima_digests_htable;
 
 static inline unsigned long ima_hash_key(u8 *digest)
 {
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index a02a86d..d1a3d3f 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -42,6 +42,11 @@ struct ima_h_table ima_htable = {
.queue[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
 };
 
+struct ima_h_table ima_digests_htable = {
+   .len = ATOMIC_LONG_INIT(0),
+   .queue[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
+};
+
 /* mutex protects atomicity of extending measurement list
  * and extending the TPM PCR aggregate. Since tpm_extend can take
  * long (and the tpm driver uses a mutex), we can't use the spinlock.
@@ -212,3 +217,37 @@ int ima_restore_measurement_entry(struct ima_template_entry *entry)
mutex_unlock(&ima_extend_list_mutex);
return result;
 }
+
+struct ima_digest *ima_lookup_loaded_digest(u8 *digest)
+{
+   struct ima_digest *d = NULL;
+   int digest_len = hash_digest_size[ima_hash_algo];
+   unsigned int key = ima_hash_key(digest);
+
+   rcu_read_lock();
+   hlist_for_each_entry_rcu(d, &ima_digests_htable.queue[key], hnext) {
+   if (memcmp(d->digest, digest, digest_len) == 0)
+   break;
+   }
+   rcu_read_unlock();
+   return d;
+}
+
+int ima_add_digest_data_entry(u8 *digest)
+{
+   struct ima_digest *d = ima_lookup_loaded_digest(digest);
+   int digest_len = hash_digest_size[ima_hash_algo];
+   unsigned int key = ima_hash_key(digest);
+
+   if (d)
+   return -EEXIST;
+
+   d = kmalloc(sizeof(*d) + digest_len, GFP_KERNEL);
+   if (d == NULL)
+   return -ENOMEM;
+
+   memcpy(d->digest, digest, digest_len);
+   hlist_add_head_rcu(&d->hnext, &ima_digests_htable.queue[key]);
+   atomic_long_inc(&ima_digests_htable.len);
+   return 0;
+}
-- 
2.9.3



[PATCH 04/12] ima: use ima_show_htable_value to show hash table data

2017-07-25 Thread Roberto Sassu
This patch removes ima_show_htable_violations() and
ima_show_measurements_count(). ima_show_htable_value(), called
by those functions, determines which hash table data should be
copied to the buffer depending on the dentry of the file passed
as argument.
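
The dispatch described above can be sketched in userspace C as follows.
Everything here is a stand-in chosen for the example: the two file
handles only model dentry identity, show_value() is a hypothetical
helper, and the early return for an unknown handle is an addition of the
sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the two counters kept in the hash table structure. */
struct counters {
	long len;
	long violations;
};

static struct counters htable;

/* Stand-ins for the two securityfs dentries; only their address matters. */
static int violations_file;
static int count_file;

/*
 * One read handler for both files: the handle passed in selects the
 * counter to print. Returns the number of characters written, or -1 for
 * a handle that matches neither file.
 */
int show_value(const void *file, char *out, size_t outlen)
{
	const long *val = NULL;

	assert(out != NULL);
	if (file == &violations_file)
		val = &htable.violations;
	else if (file == &count_file)
		val = &htable.len;

	if (val == NULL)
		return -1;        /* defensive check added by this sketch */

	return snprintf(out, outlen, "%ld\n", *val);
}
```

Sharing one handler removes the two thin wrappers at the cost of a
pointer comparison on each read.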

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima_fs.c | 38 --
 1 file changed, 12 insertions(+), 26 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index f4199f2..ad3d674 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -55,38 +55,24 @@ __setup("ima_canonical_fmt", default_canonical_fmt_setup);
 
 static int valid_policy = 1;
 #define TMPBUFLEN 12
-static ssize_t ima_show_htable_value(char __user *buf, size_t count,
-loff_t *ppos, atomic_long_t *val)
+static ssize_t ima_show_htable_value(struct file *filp, char __user *buf,
+size_t count, loff_t *ppos)
 {
+   atomic_long_t *val = NULL;
char tmpbuf[TMPBUFLEN];
ssize_t len;
 
+   if (filp->f_path.dentry == violations)
+   val = &ima_htable.violations;
+   else if (filp->f_path.dentry == runtime_measurements_count)
+   val = &ima_htable.len;
+
len = scnprintf(tmpbuf, TMPBUFLEN, "%li\n", atomic_long_read(val));
return simple_read_from_buffer(buf, count, ppos, tmpbuf, len);
 }
 
-static ssize_t ima_show_htable_violations(struct file *filp,
- char __user *buf,
- size_t count, loff_t *ppos)
-{
-   return ima_show_htable_value(buf, count, ppos, &ima_htable.violations);
-}
-
-static const struct file_operations ima_htable_violations_ops = {
-   .read = ima_show_htable_violations,
-   .llseek = generic_file_llseek,
-};
-
-static ssize_t ima_show_measurements_count(struct file *filp,
-  char __user *buf,
-  size_t count, loff_t *ppos)
-{
-   return ima_show_htable_value(buf, count, ppos, &ima_htable.len);
-
-}
-
-static const struct file_operations ima_measurements_count_ops = {
-   .read = ima_show_measurements_count,
+static const struct file_operations ima_htable_value_ops = {
+   .read = ima_show_htable_value,
.llseek = generic_file_llseek,
 };
 
@@ -508,13 +494,13 @@ int __init ima_fs_init(void)
runtime_measurements_count =
securityfs_create_file("runtime_measurements_count",
   S_IRUSR | S_IRGRP, ima_dir, NULL,
-  &ima_measurements_count_ops);
+  &ima_htable_value_ops);
if (IS_ERR(runtime_measurements_count))
goto out;
 
violations =
securityfs_create_file("violations", S_IRUSR | S_IRGRP,
-  ima_dir, NULL, &ima_htable_violations_ops);
+  ima_dir, NULL, &ima_htable_value_ops);
if (IS_ERR(violations))
goto out;
 
-- 
2.9.3



[PATCH 03/12] ima: generalize policy file operations

2017-07-25 Thread Roberto Sassu
This patch renames ima_open_policy() and ima_release_policy() to
ima_open_data_upload() and ima_release_data_upload(), respectively. They
will be used to implement file operations for interfaces that allow data
to be uploaded and read back.

Also, the new flag IMA_POLICY_BUSY has been defined specifically for
the policy, as it might not be cleared at file release. With a single
shared flag, this would prevent userspace applications from uploading
files after a policy has been loaded.
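
To illustrate why a separate policy flag helps, here is a userspace
sketch using C11 atomics in place of the kernel bit operations. The
function names (open_upload, release_upload) and the
policy_stays_locked argument are hypothetical and only model the case
where the policy flag is deliberately left set at release.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

enum fs_flag {
	POLICY_BUSY,   /* held while the policy interface is in use */
	FS_BUSY,       /* held while any other upload interface is in use */
};

static atomic_ulong fs_flags;

/* Atomically set one bit and report whether it was already set. */
static int test_and_set_flag(enum fs_flag flag)
{
	unsigned long mask = 1UL << flag;

	assert(flag == POLICY_BUSY || flag == FS_BUSY);
	return (atomic_fetch_or(&fs_flags, mask) & mask) != 0;
}

static void clear_flag(enum fs_flag flag)
{
	atomic_fetch_and(&fs_flags, ~(1UL << flag));
}

/* Open for writing: each kind of interface serializes on its own bit. */
int open_upload(int is_policy)
{
	enum fs_flag flag = is_policy ? POLICY_BUSY : FS_BUSY;

	return test_and_set_flag(flag) ? -EBUSY : 0;
}

/* Release: the policy bit may stay set, the generic bit is always cleared. */
void release_upload(int is_policy, int policy_stays_locked)
{
	if (!is_policy) {
		clear_flag(FS_BUSY);
		return;
	}
	if (!policy_stays_locked)
		clear_flag(POLICY_BUSY);
}
```

With two bits, leaving the policy interface locked after a load does not
block later uploads through the other interfaces.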

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima_fs.c | 46 -
 1 file changed, 32 insertions(+), 14 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index e375206..f4199f2 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -384,6 +384,7 @@ static ssize_t ima_write_data(struct file *file, const char __user *buf,
 }
 
 enum ima_fs_flags {
+   IMA_POLICY_BUSY,
IMA_FS_BUSY,
 };
 
@@ -399,22 +400,33 @@ static const struct seq_operations ima_policy_seqops = {
 #endif
 
 /*
- * ima_open_policy: sequentialize access to the policy file
+ * ima_open_data_upload: sequentialize access to the data upload interface
  */
-static int ima_open_policy(struct inode *inode, struct file *filp)
+static int ima_open_data_upload(struct inode *inode, struct file *filp)
 {
+   enum kernel_read_file_id file_id = ima_get_file_id(filp->f_path.dentry);
+   const struct seq_operations *seq_ops = NULL;
+   enum ima_fs_flags flag = IMA_FS_BUSY;
+   bool read_allowed = false;
+
+   if (file_id == READING_POLICY) {
+   flag = IMA_POLICY_BUSY;
+#ifdef CONFIG_IMA_READ_POLICY
+   read_allowed = true;
+   seq_ops = &ima_policy_seqops;
+#endif
+   }
+
if (!(filp->f_flags & O_WRONLY)) {
-#ifndef CONFIG_IMA_READ_POLICY
-   return -EACCES;
-#else
+   if (!read_allowed)
+   return -EACCES;
if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
return -EACCES;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
-   return seq_open(filp, &ima_policy_seqops);
-#endif
+   return seq_open(filp, seq_ops);
}
-   if (test_and_set_bit(IMA_FS_BUSY, &ima_fs_flags))
+   if (test_and_set_bit(flag, &ima_fs_flags))
return -EBUSY;
return 0;
 }
@@ -426,13 +438,19 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
  * point to the new policy rules, and remove the securityfs policy file,
  * assuming a valid policy.
  */
-static int ima_release_policy(struct inode *inode, struct file *file)
+static int ima_release_data_upload(struct inode *inode, struct file *file)
 {
+   enum kernel_read_file_id file_id = ima_get_file_id(file->f_path.dentry);
const char *cause = valid_policy ? "completed" : "failed";
 
if ((file->f_flags & O_ACCMODE) == O_RDONLY)
return seq_release(inode, file);
 
+   if (file_id != READING_POLICY) {
+   clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+   return 0;
+   }
+
if (valid_policy && ima_check_policy() < 0) {
cause = "failed";
valid_policy = 0;
@@ -454,16 +472,16 @@ static int ima_release_policy(struct inode *inode, struct file *file)
securityfs_remove(ima_policy);
ima_policy = NULL;
 #else
-   clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+   clear_bit(IMA_POLICY_BUSY, &ima_fs_flags);
 #endif
return 0;
 }
 
-static const struct file_operations ima_measure_policy_ops = {
-   .open = ima_open_policy,
+static const struct file_operations ima_data_upload_ops = {
+   .open = ima_open_data_upload,
.write = ima_write_data,
.read = seq_read,
-   .release = ima_release_policy,
+   .release = ima_release_data_upload,
.llseek = generic_file_llseek,
 };
 
@@ -502,7 +520,7 @@ int __init ima_fs_init(void)
 
ima_policy = securityfs_create_file("policy", POLICY_FILE_FLAGS,
ima_dir, NULL,
-   &ima_measure_policy_ops);
+   &ima_data_upload_ops);
if (IS_ERR(ima_policy))
goto out;
 
-- 
2.9.3



[PATCH 02/12] ima: generalize ima_write_policy()

2017-07-25 Thread Roberto Sassu
This patch renames ima_write_policy() to ima_write_data(). It also
determines the kernel_read_file_id from the dentry associated with
the file, and passes it to ima_read_file().

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima_fs.c | 55 ++---
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 058d3c1..e375206 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -28,6 +28,21 @@
 
 static DEFINE_MUTEX(ima_write_mutex);
 
+static struct dentry *ima_dir;
+static struct dentry *binary_runtime_measurements;
+static struct dentry *ascii_runtime_measurements;
+static struct dentry *runtime_measurements_count;
+static struct dentry *violations;
+static struct dentry *ima_policy;
+
+static enum kernel_read_file_id ima_get_file_id(struct dentry *dentry)
+{
+   if (dentry == ima_policy)
+   return READING_POLICY;
+
+   return READING_UNKNOWN;
+}
+
 bool ima_canonical_fmt;
 static int __init default_canonical_fmt_setup(char *str)
 {
@@ -315,11 +330,12 @@ static ssize_t ima_read_file(char *path, enum kernel_read_file_id file_id)
return pathlen;
 }
 
-static ssize_t ima_write_policy(struct file *file, const char __user *buf,
-   size_t datalen, loff_t *ppos)
+static ssize_t ima_write_data(struct file *file, const char __user *buf,
+ size_t datalen, loff_t *ppos)
 {
char *data;
ssize_t result;
+   enum kernel_read_file_id file_id = ima_get_file_id(file->f_path.dentry);
 
if (datalen >= PAGE_SIZE)
datalen = PAGE_SIZE - 1;
@@ -340,34 +356,33 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
goto out_free;
 
if (data[0] == '/') {
-   result = ima_read_file(data, READING_POLICY);
-   } else if (ima_appraise & IMA_APPRAISE_POLICY) {
-   pr_err("IMA: signed policy file (specified as an absolute pathname) required\n");
-   integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
-   "policy_update", "signed policy required",
-   1, 0);
-   if (ima_appraise & IMA_APPRAISE_ENFORCE)
-   result = -EACCES;
+   result = ima_read_file(data, file_id);
+   } else if (file_id == READING_POLICY) {
+   if (ima_appraise & IMA_APPRAISE_POLICY) {
+   pr_err("IMA: signed policy file (specified "
+  "as an absolute pathname) required\n");
+   integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
+   "policy_update", "signed policy required",
+   1, 0);
+   if (ima_appraise & IMA_APPRAISE_ENFORCE)
+   result = -EACCES;
+   } else {
+   result = ima_parse_add_rule(data);
+   }
} else {
-   result = ima_parse_add_rule(data);
+   pr_err("Unknown data type\n");
+   result = -EINVAL;
}
mutex_unlock(&ima_write_mutex);
 out_free:
kfree(data);
 out:
-   if (result < 0)
+   if (file_id == READING_POLICY && result < 0)
valid_policy = 0;
 
return result;
 }
 
-static struct dentry *ima_dir;
-static struct dentry *binary_runtime_measurements;
-static struct dentry *ascii_runtime_measurements;
-static struct dentry *runtime_measurements_count;
-static struct dentry *violations;
-static struct dentry *ima_policy;
-
 enum ima_fs_flags {
IMA_FS_BUSY,
 };
@@ -446,7 +461,7 @@ static int ima_release_policy(struct inode *inode, struct file *file)
 
 static const struct file_operations ima_measure_policy_ops = {
.open = ima_open_policy,
-   .write = ima_write_policy,
+   .write = ima_write_data,
.read = seq_read,
.release = ima_release_policy,
.llseek = generic_file_llseek,
-- 
2.9.3



[PATCH 01/12] ima: generalize ima_read_policy()

2017-07-25 Thread Roberto Sassu
Rename ima_read_policy() to ima_read_file(), and add file_id as a new
parameter. If file_id is equal to READING_POLICY, ima_read_file()
behaves exactly as it did before this patch.

ima_read_file() will be used to read digest lists, to avoid reporting
measurements when the file digest is known.
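
A small userspace sketch of this kind of generalization follows: the
line-by-line rule parsing is only applied when the data is a policy, and
other data types are left for their own parser. The enum, read_data()
and the toy parse_rule() callback are hypothetical names for this
example; strsep() is the BSD/glibc function, used here the same way the
kernel helper of that name is used in the diff.

```c
#define _DEFAULT_SOURCE         /* for strsep() on glibc */
#include <assert.h>
#include <string.h>

/* Hypothetical data types standing in for kernel_read_file_id values. */
enum data_type {
	DATA_POLICY,
	DATA_DIGEST_LIST,
};

static int rules_seen;

/* Toy rule handler: counts non-empty lines, rejects lines starting with '!'. */
static int parse_rule(const char *rule)
{
	if (rule[0] == '!')
		return -1;
	if (rule[0] != '\0')
		rules_seen++;
	return 0;
}

/*
 * Process a NUL-terminated buffer in place. Policy data is split on
 * newlines and handed to parse_rule(); any other type is skipped here.
 * Returns 0 on success or the first negative error from parse_rule().
 */
int read_data(char *buf, enum data_type type)
{
	char *cursor = buf;
	char *line;
	int rc = 0;

	assert(buf != NULL);
	if (type != DATA_POLICY)
		return 0;         /* a dedicated parser would run instead */

	while ((line = strsep(&cursor, "\n")) != NULL) {
		rc = parse_rule(line);
		if (rc < 0)
			break;
	}
	return rc;
}
```

Note that strsep() modifies the buffer, so the caller must pass writable
memory rather than a string literal.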

Signed-off-by: Roberto Sassu 
---
 security/integrity/ima/ima_fs.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index ad491c5..058d3c1 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -272,7 +272,7 @@ static const struct file_operations ima_ascii_measurements_ops = {
.release = seq_release,
 };
 
-static ssize_t ima_read_policy(char *path)
+static ssize_t ima_read_file(char *path, enum kernel_read_file_id file_id)
 {
void *data;
char *datap;
@@ -285,16 +285,22 @@ static ssize_t ima_read_policy(char *path)
datap = path;
strsep(&datap, "\n");
 
-   rc = kernel_read_file_from_path(path, &data, &size, 0, READING_POLICY);
+   rc = kernel_read_file_from_path(path, &data, &size, 0, file_id);
if (rc < 0) {
pr_err("Unable to open file: %s (%d)", path, rc);
return rc;
}
 
datap = data;
-   while (size > 0 && (p = strsep(&datap, "\n"))) {
-   pr_debug("rule: %s\n", p);
-   rc = ima_parse_add_rule(p);
+   while (size > 0) {
+   if (file_id == READING_POLICY) {
+   p = strsep(&datap, "\n");
+   if (p == NULL)
+   break;
+
+   pr_debug("rule: %s\n", p);
+   rc = ima_parse_add_rule(p);
+   }
if (rc < 0)
break;
size -= rc;
@@ -334,7 +340,7 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
goto out_free;
 
if (data[0] == '/') {
-   result = ima_read_policy(data);
+   result = ima_read_file(data, READING_POLICY);
} else if (ima_appraise & IMA_APPRAISE_POLICY) {
pr_err("IMA: signed policy file (specified as an absolute pathname) required\n");
integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
-- 
2.9.3



Re: [PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-25 Thread Kirill A. Shutemov
On Tue, Jul 25, 2017 at 02:50:37PM +0200, Jan Kara wrote:
> On Tue 25-07-17 14:15:22, Christoph Hellwig wrote:
> > On Tue, Jul 25, 2017 at 11:35:08AM +0200, Jan Kara wrote:
> > > On Tue 25-07-17 10:01:58, Christoph Hellwig wrote:
> > > > On Tue, Jul 25, 2017 at 01:14:00AM +0300, Kirill A. Shutemov wrote:
> > > > > I guess it's up to filesystem if it wants to reuse the same spot to write
> > > > > data or not. I think your assumptions works for ext4 and xfs. I wouldn't
> > > > > be that sure for btrfs or other filesystems with CoW support.
> > > > 
> > > > Or XFS with reflinks for that matter.  Which currently can't be
> > > > combined with DAX, but I had a somewhat working version a few month
> > > > ago.
> > > 
> > > But in cases like COW when the block mapping changes, the process
> > > must run unmap_mapping_range() before installing the new PTE so that all
> > > processes mapping this file offset actually refault and see the new
> > > mapping. So this would go through pte_none() case. Am I missing something?
> > 
> > Yes, for DAX COW mappings we'd probably need something like this, unlike
> > the pagecache COW handling for which only the underlying block change,
> > but not the page.
> 
> Right. So again nothing where the WARN_ON should trigger.

Yes. I was confused on how COW is handled.

Acked-by: Kirill A. Shutemov 

-- 
 Kirill A. Shutemov


Re: [PATCH 2/2] printk: Add boottime and real timestamps

2017-07-25 Thread Peter Zijlstra
On Tue, Jul 25, 2017 at 08:17:27AM -0400, Prarit Bhargava wrote:
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 5b1662ec546f..6cd38a25f8ea 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1,8 +1,8 @@
>  menu "printk and dmesg options"
>  
>  config PRINTK_TIME
> - int "Show timing information on printks (0-1)"
> - range 0 1
> + int "Show timing information on printks (0-3)"
> + range 0 3
>   default "0"
>   depends on PRINTK
>   help
> @@ -13,7 +13,8 @@ config PRINTK_TIME
> The timestamp is always recorded internally, and exported
> to /dev/kmsg. This flag just specifies if the timestamp should
> be included, not that the timestamp is recorded. 0 disables the
> -   timestamp and 1 uses the local clock.
> +   timestamp and 1 uses the local clock, 2 uses the monotonic clock, and
> +   3 uses real clock.
>  
> The behavior is also controlled by the kernel command line
> parameter printk.time=1. See 
> Documentation/admin-guide/kernel-parameters.rst


choice
prompt "printk default clock"
default PRINTK_TIME_DISABLE
help
 goes here

config PRINTK_TIME_DISABLE
bool "Disabled"
help
 goes here

config PRINTK_TIME_LOCAL
bool "local clock"
help
 goes here

config PRINTK_TIME_MONO
bool "CLOCK_MONOTONIC"
help
 goes here

config PRINTK_TIME_REAL
bool "CLOCK_REALTIME"
help
 goes here

endchoice

config PRINTK_TIME
int
default 0 if PRINTK_TIME_DISABLE
default 1 if PRINTK_TIME_LOCAL
default 2 if PRINTK_TIME_MONO
default 3 if PRINTK_TIME_REAL


Although I must strongly discourage using REALTIME, DST will make
untangling your logs an absolute nightmare. I would simply not provide
it.


Re: [PATCH 1/2] printk: Make CONFIG_PRINTK_TIME an int

2017-07-25 Thread Luis R. Rodriguez
On Tue, Jul 25, 2017 at 08:17:26AM -0400, Prarit Bhargava wrote:
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index fc47863f629c..26cf6cadd267 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1202,8 +1202,40 @@ static inline void boot_delay_msec(int level)
>  }
>  #endif
>  
> -static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
> -module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
> +static int printk_time = CONFIG_PRINTK_TIME;

You could just use unsigned int but is the reason you went with int to
enable backward compatibility with the old bool =y or =n?

  Luis


Re: [PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-25 Thread Jan Kara
On Tue 25-07-17 14:15:22, Christoph Hellwig wrote:
> On Tue, Jul 25, 2017 at 11:35:08AM +0200, Jan Kara wrote:
> > On Tue 25-07-17 10:01:58, Christoph Hellwig wrote:
> > > On Tue, Jul 25, 2017 at 01:14:00AM +0300, Kirill A. Shutemov wrote:
> > > > I guess it's up to filesystem if it wants to reuse the same spot to write
> > > > data or not. I think your assumptions works for ext4 and xfs. I wouldn't
> > > > be that sure for btrfs or other filesystems with CoW support.
> > > 
> > > Or XFS with reflinks for that matter.  Which currently can't be
> > > combined with DAX, but I had a somewhat working version a few month
> > > ago.
> > 
> > But in cases like COW when the block mapping changes, the process
> > must run unmap_mapping_range() before installing the new PTE so that all
> > processes mapping this file offset actually refault and see the new
> > mapping. So this would go through pte_none() case. Am I missing something?
> 
> Yes, for DAX COW mappings we'd probably need something like this, unlike
> the pagecache COW handling for which only the underlying block change,
> but not the page.

Right. So again nothing where the WARN_ON should trigger. That being said I
don't care about the WARN_ON too deeply but it can help to catch DAX bugs
so if we can keep it I'd prefer to do so...

Honza
-- 
Jan Kara 
SUSE Labs, CR


[PATCH 2/2] printk: Add boottime and real timestamps

2017-07-25 Thread Prarit Bhargava
printk.time=1/CONFIG_PRINTK_TIME=Y timestamps printks with an unmodified
hardware clock timestamp.  This clock loses time each day making it
difficult to determine when an issue has occurred in the kernel log.

Modify printk.time to output local, monotonic, or a real timestamp.
Modify the output of /sys/module/printk/parameters/time to output the type
of clock so userspace programs can interpret the timestamp.
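
For reference, the accepted values listed in the kernel-parameters.txt
hunk of this patch (0/N/n, 1/Y/y, 2, 3) could be parsed along these
lines. This is only a userspace sketch: parse_time_param() and
mode_name() are hypothetical helpers, and the strings returned by
mode_name() are illustrative, not the actual sysfs output.

```c
#include <assert.h>
#include <string.h>

enum ts_mode {
	TS_DISABLED,    /* 0, N, n */
	TS_LOCAL,       /* 1, Y, y: local/unadjusted hardware clock */
	TS_MONOTONIC,   /* 2 */
	TS_REAL,        /* 3 */
};

/* Map a one-character setting to a mode; return -1 for anything else. */
int parse_time_param(const char *param)
{
	assert(param != NULL);
	if (strlen(param) != 1)
		return -1;

	switch (param[0]) {
	case '0': case 'n': case 'N':
		return TS_DISABLED;
	case '1': case 'y': case 'Y':
		return TS_LOCAL;
	case '2':
		return TS_MONOTONIC;
	case '3':
		return TS_REAL;
	default:
		return -1;
	}
}

/* Illustrative names a reader of the parameter might be shown. */
const char *mode_name(int mode)
{
	static const char *const names[] = {
		"disabled", "local", "monotonic", "real",
	};

	if (mode < TS_DISABLED || mode > TS_REAL)
		return "unknown";
	return names[mode];
}
```

Accepting Y/y and N/n alongside the digits keeps the old boolean
spellings working.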

Real clock & 32-bit systems:  Selecting the real clock printk timestamp
may lead to unlikely situations where a timestamp is wrong because the
real time offset is read without the protection of a sequence lock in the
call to ktime_get_log_ts() in printk_get_ts().

Signed-off-by: Prarit Bhargava 
Cc: Mark Salyzyn 
Cc: Jonathan Corbet 
Cc: Petr Mladek 
Cc: Sergey Senozhatsky 
Cc: Steven Rostedt 
Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Andrew Morton 
Cc: Greg Kroah-Hartman 
Cc: "Paul E. McKenney" 
Cc: Christoffer Dall 
Cc: Deepa Dinamani 
Cc: Ingo Molnar 
Cc: Joel Fernandes 
Cc: Kees Cook 
Cc: Peter Zijlstra 
Cc: Geert Uytterhoeven 
Cc: "Luis R. Rodriguez" 
Cc: Nicholas Piggin 
Cc: "Jason A. Donenfeld" 
Cc: Olof Johansson 
Cc: "Theodore Ts'o" 
Cc: Josh Poimboeuf 
Cc: linux-doc@vger.kernel.org


---
 Documentation/admin-guide/kernel-parameters.txt |  6 +-
 include/linux/timekeeping.h |  1 +
 kernel/printk/printk.c  | 77 +
 kernel/time/timekeeping.c   | 14 +
 lib/Kconfig.debug   |  7 ++-
 5 files changed, 89 insertions(+), 16 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c3b14abf9da4..c03240d057b1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3188,8 +3188,10 @@
ratelimit - ratelimit the logging
Default: ratelimit
 
-   printk.time=Show timing data prefixed to each printk message line
-   Format:   (1/Y/y=enable, 0/N/n=disable)
+   printk.time=Show timestamp prefixed to each printk message line
+   Format: 
+   (0/N/n = disable, 1/Y/y = local/unadjusted HW,
+2 = monotonic, 3 = real)
 
processor.max_cstate=   [HW,ACPI]
Limit processor to maximum C-state
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index ddc229ff6d1e..adb84af42deb 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -239,6 +239,7 @@ static inline u64 ktime_get_raw_ns(void)
 extern u64 ktime_get_mono_fast_ns(void);
 extern u64 ktime_get_raw_fast_ns(void);
 extern u64 ktime_get_boot_fast_ns(void);
+extern u64 ktime_get_log_ts(u64 *offset_real);
 
 /*
  * Timespec interfaces utilizing the ktime based ones
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 26cf6cadd267..35536369a56d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -576,6 +576,8 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
 }
 
+static u64 printk_get_ts(void);
+
 /* insert record into the buffer, discard old ones, update heads */
 static int log_store(int facility, int level,
 enum log_flags flags, u64 ts_nsec,
@@ -624,7 +626,7 @@ static int log_store(int facility, int level,
if (ts_nsec > 0)
msg->ts_nsec = ts_nsec;
else
-   msg->ts_nsec = local_clock();
+   msg->ts_nsec = printk_get_ts();
memset(log_dict(msg) + dict_len, 0, pad_len);
msg->len = size;
 
@@ -1203,26 +1205,60 @@ static inline void boot_delay_msec(int level)
 #endif
 
 static int printk_time = CONFIG_PRINTK_TIME;
+static int printk_time_setting; /* initial setting */
 
+/*
+ * Real clock & 32-bit systems:  Selecting the real clock printk timestamp may
+ * lead to unlikely situations where a timestamp is wrong because the real time
+ * offset is read without the protection of a sequence lock in the call to
+ * ktime_get_log_ts() in printk_get_ts() below.
+ */
 static int printk_time_set(const char *val, const struct kernel_param *kp)
 {
char *param = strstrip((char *)val);
+   int _printk_time;
 
if (strlen(param) != 1)
   

[PATCH 0/2] printk: allow different timestamps for printk.time

2017-07-25 Thread Prarit Bhargava
Over the past years I've seen many reports of bugs that include
time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or
printk.time=1 is specified as a kernel parameter) that do not align
with either external time stamped logs or /var/log/messages.  This
also makes determining the time of a failure difficult in cases where
/var/log/messages is unavailable.

For example,

[root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date
Thu Jul 20 11:38:22 EST 2017
Thu Jul 20 11:38:22 EST 2017

which displays

[83973.768912] Hello!

on the serial console.

Running a script to convert this to the stamped time,

[root@intel-wildcatpass-06 ~]# ./human.sh  | tail -1
[Thu July 17 11:39:45 2017] Hello!

which is already off by 1 minute and 23 seconds after ~24 hours of
uptime.
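
To put that skew in perspective, the following sketch turns the numbers
quoted above into a drift rate. It assumes only the time of day matters
(11:39:45 versus 11:38:22) and that the printk timestamp 83973.768912 is
the uptime in seconds; tod_seconds() and drift_ppm() are hypothetical
helpers.

```c
#include <assert.h>

/* Seconds since midnight for a wall-clock time of day. */
long tod_seconds(int hours, int minutes, int seconds)
{
	return hours * 3600L + minutes * 60L + seconds;
}

/* Accumulated skew expressed in parts per million of the elapsed time. */
double drift_ppm(double skew_seconds, double uptime_seconds)
{
	assert(uptime_seconds > 0.0);
	return skew_seconds / uptime_seconds * 1e6;
}
```

With the values above, the skew is 83 seconds and the rate comes out
near 988 ppm, or roughly one second lost every 17 minutes.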

This occurs because the printk time stamp is obtained from a call to
local_clock() which (on x86) is a direct call to the hardware.  These
hardware clock reads are not adjusted by the standard NTP or PTP
protocols.  The other timestamps are, and as a result external time
sources drift further and further from the kernel log timestamps.

Implement printk.time settings to allow a user to specify the monotonic
or real clocks.  The default is the local clock (hardware clock).

Real clock & 32-bit systems:  Selecting the real clock printk timestamp may
lead to unlikely situations where a timestamp is wrong because the real time
offset is read without the protection of a sequence lock in the call to
ktime_get_log_ts() in printk_get_ts().

Signed-off-by: Prarit Bhargava 
Cc: Mark Salyzyn 
Cc: Jonathan Corbet 
Cc: Petr Mladek 
Cc: Sergey Senozhatsky 
Cc: Steven Rostedt 
Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Andrew Morton 
Cc: Greg Kroah-Hartman 
Cc: "Paul E. McKenney" 
Cc: Christoffer Dall 
Cc: Deepa Dinamani 
Cc: Ingo Molnar 
Cc: Joel Fernandes 
Cc: Kees Cook 
Cc: Peter Zijlstra 
Cc: Geert Uytterhoeven 
Cc: "Luis R. Rodriguez" 
Cc: Nicholas Piggin 
Cc: "Jason A. Donenfeld" 
Cc: Olof Johansson 
Cc: "Theodore Ts'o" 
Cc: Josh Poimboeuf 
Cc: linux-doc@vger.kernel.org

Prarit Bhargava (2):
  printk: Make CONFIG_PRINTK_TIME an int
  printk: Add boottime and real timestamps

 Documentation/admin-guide/kernel-parameters.txt|  6 +-
 arch/arm/configs/aspeed_g4_defconfig   |  2 +-
 arch/arm/configs/aspeed_g5_defconfig   |  2 +-
 arch/arm/configs/axm55xx_defconfig |  2 +-
 arch/arm/configs/bcm2835_defconfig |  2 +-
 arch/arm/configs/colibri_pxa270_defconfig  |  2 +-
 arch/arm/configs/colibri_pxa300_defconfig  |  2 +-
 arch/arm/configs/dove_defconfig|  2 +-
 arch/arm/configs/efm32_defconfig   |  2 +-
 arch/arm/configs/exynos_defconfig  |  2 +-
 arch/arm/configs/ezx_defconfig |  2 +-
 arch/arm/configs/h5000_defconfig   |  2 +-
 arch/arm/configs/hisi_defconfig|  2 +-
 arch/arm/configs/imote2_defconfig  |  2 +-
 arch/arm/configs/imx_v6_v7_defconfig   |  2 +-
 arch/arm/configs/keystone_defconfig|  2 +-
 arch/arm/configs/lpc18xx_defconfig |  2 +-
 arch/arm/configs/magician_defconfig|  2 +-
 arch/arm/configs/mmp2_defconfig|  2 +-
 arch/arm/configs/moxart_defconfig  |  2 +-
 arch/arm/configs/mps2_defconfig|  2 +-
 arch/arm/configs/multi_v7_defconfig|  2 +-
 arch/arm/configs/mvebu_v7_defconfig|  2 +-
 arch/arm/configs/mxs_defconfig |  2 +-
 arch/arm/configs/omap2plus_defconfig   |  2 +-
 arch/arm/configs/pxa168_defconfig  |  2 +-
 arch/arm/configs/pxa3xx_defconfig  |  2 +-
 arch/arm/configs/pxa910_defconfig  |  2 +-
 arch/arm/configs/pxa_defconfig |  2 +-
 arch/arm/configs/qcom_defconfig|  2 +-
 arch/arm/configs/raumfeld_defconfig|  2 +-
 arch/arm/configs/shmobile_defconfig|  2 +-
 arch/arm/configs/socfpga_defconfig |  2 +-
 arch/arm/configs/stm32_defconfig   |  2 +-
 arch/arm/configs/sunxi_defconfig   |  2 +-
 arch/arm/configs/tango4_defconfig  |  2 +-
 arch/arm/configs/tegra_defconfig   |  2 +-
 

[PATCH 1/2] printk: Make CONFIG_PRINTK_TIME an int

2017-07-25 Thread Prarit Bhargava
CONFIG_PRINTK_TIME is a bool and in order to add timestamp options for
the monotonic and real time clock it must be expanded to an int.

Signed-off-by: Prarit Bhargava 
Cc: Mark Salyzyn 
Cc: Jonathan Corbet 
Cc: Petr Mladek 
Cc: Sergey Senozhatsky 
Cc: Steven Rostedt 
Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Andrew Morton 
Cc: Greg Kroah-Hartman 
Cc: "Paul E. McKenney" 
Cc: Christoffer Dall 
Cc: Deepa Dinamani 
Cc: Ingo Molnar 
Cc: Joel Fernandes 
Cc: Kees Cook 
Cc: Peter Zijlstra 
Cc: Geert Uytterhoeven 
Cc: "Luis R. Rodriguez" 
Cc: Nicholas Piggin 
Cc: "Jason A. Donenfeld" 
Cc: Olof Johansson 
Cc: "Theodore Ts'o" 
Cc: Josh Poimboeuf 
Cc: linux-doc@vger.kernel.org


---
 Documentation/admin-guide/kernel-parameters.txt|  2 +-
 arch/arm/configs/aspeed_g4_defconfig   |  2 +-
 arch/arm/configs/aspeed_g5_defconfig   |  2 +-
 arch/arm/configs/axm55xx_defconfig |  2 +-
 arch/arm/configs/bcm2835_defconfig |  2 +-
 arch/arm/configs/colibri_pxa270_defconfig  |  2 +-
 arch/arm/configs/colibri_pxa300_defconfig  |  2 +-
 arch/arm/configs/dove_defconfig|  2 +-
 arch/arm/configs/efm32_defconfig   |  2 +-
 arch/arm/configs/exynos_defconfig  |  2 +-
 arch/arm/configs/ezx_defconfig |  2 +-
 arch/arm/configs/h5000_defconfig   |  2 +-
 arch/arm/configs/hisi_defconfig|  2 +-
 arch/arm/configs/imote2_defconfig  |  2 +-
 arch/arm/configs/imx_v6_v7_defconfig   |  2 +-
 arch/arm/configs/keystone_defconfig|  2 +-
 arch/arm/configs/lpc18xx_defconfig |  2 +-
 arch/arm/configs/magician_defconfig|  2 +-
 arch/arm/configs/mmp2_defconfig|  2 +-
 arch/arm/configs/moxart_defconfig  |  2 +-
 arch/arm/configs/mps2_defconfig|  2 +-
 arch/arm/configs/multi_v7_defconfig|  2 +-
 arch/arm/configs/mvebu_v7_defconfig|  2 +-
 arch/arm/configs/mxs_defconfig |  2 +-
 arch/arm/configs/omap2plus_defconfig   |  2 +-
 arch/arm/configs/pxa168_defconfig  |  2 +-
 arch/arm/configs/pxa3xx_defconfig  |  2 +-
 arch/arm/configs/pxa910_defconfig  |  2 +-
 arch/arm/configs/pxa_defconfig |  2 +-
 arch/arm/configs/qcom_defconfig|  2 +-
 arch/arm/configs/raumfeld_defconfig|  2 +-
 arch/arm/configs/shmobile_defconfig|  2 +-
 arch/arm/configs/socfpga_defconfig |  2 +-
 arch/arm/configs/stm32_defconfig   |  2 +-
 arch/arm/configs/sunxi_defconfig   |  2 +-
 arch/arm/configs/tango4_defconfig  |  2 +-
 arch/arm/configs/tegra_defconfig   |  2 +-
 arch/arm/configs/u300_defconfig|  2 +-
 arch/arm/configs/u8500_defconfig   |  2 +-
 arch/arm/configs/vt8500_v6_v7_defconfig|  2 +-
 arch/arm/configs/xcep_defconfig|  2 +-
 arch/arm/configs/zx_defconfig  |  2 +-
 arch/arm64/configs/defconfig   |  2 +-
 arch/m68k/configs/amcore_defconfig |  2 +-
 arch/mips/configs/ath25_defconfig  |  2 +-
 arch/mips/configs/bcm47xx_defconfig|  2 +-
 arch/mips/configs/bmips_be_defconfig   |  2 +-
 arch/mips/configs/bmips_stb_defconfig  |  2 +-
 arch/mips/configs/ci20_defconfig   |  2 +-
 arch/mips/configs/generic_defconfig|  2 +-
 arch/mips/configs/lemote2f_defconfig   |  2 +-
 arch/mips/configs/loongson3_defconfig  |  2 +-
 arch/mips/configs/nlm_xlp_defconfig|  2 +-
 arch/mips/configs/nlm_xlr_defconfig|  2 +-
 arch/mips/configs/pistachio_defconfig  |  2 +-
 arch/mips/configs/qi_lb60_defconfig|  2 +-
 arch/mips/configs/rt305x_defconfig |  2 +-
 arch/mips/configs/xway_defconfig   |  2 +-
 arch/parisc/configs/generic-64bit_defconfig|  2 +-
 arch/powerpc/configs/40x/virtex_defconfig  |  2 +-
 arch/powerpc/configs/44x/fsp2_defconfig|  2 +-
 arch/powerpc/configs/44x/virtex5_defconfig |  2 +-
 arch/powerpc/configs/44x/warp_defconfig  

[PATCH v4 0/6] Add HiSilicon SoC uncore Performance Monitoring Unit driver

2017-07-25 Thread Shaokun Zhang
This patchset adds a driver for the HiSilicon SoC uncore PMUs. It
covers the L3C, Hydra Home Agent (HHA) and DDRC PMUs.

Changes in v4:
* remove redundant code and comments
* reverse the functions order in exit function
* remove some GPL information
* revise including header file
* fix Jonathan's other comments

Changes in v3:
* rebase to 4.13-rc1
* add dev_err if ioremap fails for PMUs
 
Changes in v2:
* fix kbuild test robot error
* make hisi_uncore_ops static

Shaokun Zhang (6):
  Documentation: perf: hisi: Documentation for HiSilicon SoC PMU driver
  perf: hisi: Add support for HiSilicon SoC uncore PMU driver
  perf: hisi: Add support for HiSilicon SoC L3C PMU driver
  perf: hisi: Add support for HiSilicon SoC HHA PMU driver
  perf: hisi: Add support for HiSilicon SoC DDRC PMU driver
  arm64: MAINTAINERS: hisi: Add HiSilicon SoC PMU support

 Documentation/perf/hisi-pmu.txt   |  52 +++
 MAINTAINERS   |   7 +
 drivers/perf/Kconfig  |   7 +
 drivers/perf/Makefile |   1 +
 drivers/perf/hisilicon/Makefile   |   1 +
 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 420 
 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c  | 436 +
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c  | 538 ++
 drivers/perf/hisilicon/hisi_uncore_pmu.c  | 398 +++
 drivers/perf/hisilicon/hisi_uncore_pmu.h  | 103 +
 include/linux/cpuhotplug.h|   1 +
 11 files changed, 1964 insertions(+)
 create mode 100644 Documentation/perf/hisi-pmu.txt
 create mode 100644 drivers/perf/hisilicon/Makefile
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_pmu.h

-- 
1.9.1



[PATCH v4 2/6] perf: hisi: Add support for HiSilicon SoC uncore PMU driver

2017-07-25 Thread Shaokun Zhang
This patch adds the HiSilicon SoC uncore PMU driver framework and
interfaces.

Reviewed-by: Jonathan Cameron 
Signed-off-by: Shaokun Zhang 
Signed-off-by: Anurup M 
---
 drivers/perf/Kconfig |   7 +
 drivers/perf/Makefile|   1 +
 drivers/perf/hisilicon/Makefile  |   1 +
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 398 +++
 drivers/perf/hisilicon/hisi_uncore_pmu.h | 103 
 5 files changed, 510 insertions(+)
 create mode 100644 drivers/perf/hisilicon/Makefile
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_pmu.h

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index e5197ff..78fc4bc 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -17,6 +17,13 @@ config ARM_PMU_ACPI
depends on ARM_PMU && ACPI
def_bool y
 
+config HISI_PMU
+   bool "HiSilicon SoC PMU"
+   depends on ARM64 && ACPI
+   help
+ Support for HiSilicon SoC uncore performance monitoring
+ unit (PMU), such as: L3C, HHA and DDRC.
+
 config QCOM_L2_PMU
bool "Qualcomm Technologies L2-cache PMU"
depends on ARCH_QCOM && ARM64 && ACPI
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 6420bd4..41d3342 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
 obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
+obj-$(CONFIG_HISI_PMU) += hisilicon/
 obj-$(CONFIG_QCOM_L2_PMU)  += qcom_l2_pmu.o
 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
 obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile
new file mode 100644
index 000..2783bb3
--- /dev/null
+++ b/drivers/perf/hisilicon/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
new file mode 100644
index 000..d868447
--- /dev/null
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -0,0 +1,398 @@
+/*
+ * HiSilicon SoC Hardware event counters support
+ *
+ * Copyright (C) 2017 Hisilicon Limited
+ * Author: Anurup M 
+ * Shaokun Zhang 
+ *
+ * This code is based on the uncore PMUs like arm-cci and arm-ccn.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hisi_uncore_pmu.h"
+
+#define HISI_GET_EVENTID(ev) (ev->hw.config_base & 0xff)
+#define HISI_MAX_PERIOD(nr) (BIT_ULL(nr) - 1)
+
+/*
+ * PMU format attributes
+ */
+ssize_t hisi_format_sysfs_show(struct device *dev,
+  struct device_attribute *attr, char *buf)
+{
+   struct dev_ext_attribute *eattr;
+
+   eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+   return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+/*
+ * PMU event attributes
+ */
+ssize_t hisi_event_sysfs_show(struct device *dev,
+ struct device_attribute *attr, char *page)
+{
+   struct dev_ext_attribute *eattr;
+
+   eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+   return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+/*
+ * sysfs cpumask attributes
+ */
+ssize_t hisi_cpumask_sysfs_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct hisi_pmu *hisi_pmu = to_hisi_pmu(dev_get_drvdata(dev));
+
+   return cpumap_print_to_pagebuf(true, buf, &hisi_pmu->cpus);
+}
+
+/* Read Super CPU cluster and CPU cluster ID from MPIDR_EL1 */
+void hisi_read_sccl_and_ccl_id(u32 *sccl_id, u32 *ccl_id)
+{
+   u64 mpidr;
+
+   mpidr = read_cpuid_mpidr();
+   if (mpidr & MPIDR_MT_BITMASK) {
+   if (sccl_id)
+   *sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 3);
+   if (ccl_id)
+   *ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+   } else {
+   if (sccl_id)
+   *sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+   if (ccl_id)
+   *ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+   }
+}
+
+static bool hisi_validate_event_group(struct perf_event *event)
+{
+   struct perf_event *sibling, *leader = event->group_leader;
+   struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu);
+   /* Include count for the event */
+   int counters = 1;
+
+   /*
+* We must NOT create groups containing mixed PMUs, although
+* software events are acceptable
+*/
+   if (leader->pmu != event->pmu && !is_software_event(leader))
+   
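
The MPIDR decoding done by hisi_read_sccl_and_ccl_id() above can be tried out
in user space. The sketch below assumes the architectural MPIDR_EL1 layout
(Aff0 in bits [7:0], Aff1 in [15:8], Aff2 in [23:16], MT in bit 24, Aff3 in
[39:32]). The helper and struct names are hypothetical stand-ins for the
kernel macros, not the kernel's own definitions.

```c
#include <stdint.h>

/* MT (multithreading) bit of MPIDR_EL1. */
#define MT_BIT (UINT64_C(1) << 24)

/* Extract affinity field 0..3 from an MPIDR value. */
static uint32_t mpidr_aff(uint64_t mpidr, int level)
{
	static const unsigned int shift[] = { 0, 8, 16, 32 };

	return (uint32_t)((mpidr >> shift[level]) & 0xff);
}

struct cluster_ids {
	uint32_t sccl_id;	/* Super CPU cluster ID */
	uint32_t ccl_id;	/* CPU cluster ID */
};

/*
 * With MT set, Aff0 identifies the thread, so the cluster fields move up
 * one level: SCCL is Aff3 and CCL is Aff2. Otherwise SCCL is Aff2 and
 * CCL is Aff1, matching the two branches in the patch.
 */
static struct cluster_ids decode_cluster_ids(uint64_t mpidr)
{
	struct cluster_ids ids;

	if (mpidr & MT_BIT) {
		ids.sccl_id = mpidr_aff(mpidr, 3);
		ids.ccl_id = mpidr_aff(mpidr, 2);
	} else {
		ids.sccl_id = mpidr_aff(mpidr, 2);
		ids.ccl_id = mpidr_aff(mpidr, 1);
	}
	return ids;
}
```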

[PATCH v4 4/6] perf: hisi: Add support for HiSilicon SoC HHA PMU driver

2017-07-25 Thread Shaokun Zhang
L3 cache coherence is maintained by the Hydra Home Agent (HHA) in HiSilicon
SoCs. This patch adds support for the HHA PMU driver. Each HHA has its own
control, counter and interrupt registers and is a separate PMU. Each HHA
PMU has 16 programmable counters and supports 0x50 events; the event code
is 8 bits wide and every counter is free-running. An interrupt is
supported to handle overflow of the 48-bit counters.
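
Since each event code is 8 bits wide, four codes fit into one 32-bit event
select register, and 16 counters need four such registers. The sketch below
mirrors that packing on a plain array instead of MMIO. The array and function
names are hypothetical, and masking the event code to 8 bits is a defensive
choice of this sketch rather than something the patch does.

```c
#include <stdint.h>

#define NR_COUNTERS	16
#define EVTYPE_MASK	0xffu

/* Hypothetical stand-in for the four 32-bit event select registers. */
static uint32_t evtype_regs[NR_COUNTERS / 4];

/* Store an 8-bit event code in the byte lane belonging to counter idx. */
static void write_evtype(unsigned int idx, uint32_t type)
{
	unsigned int reg = idx / 4;		/* which 32-bit register */
	unsigned int shift = 8 * (idx % 4);	/* byte lane inside it */

	evtype_regs[reg] &= ~(EVTYPE_MASK << shift);
	evtype_regs[reg] |= (type & EVTYPE_MASK) << shift;
}

/* Read back the event code currently programmed for counter idx. */
static uint32_t read_evtype(unsigned int idx)
{
	return (evtype_regs[idx / 4] >> (8 * (idx % 4))) & EVTYPE_MASK;
}
```

Counters 0-3 land in the first register, 4-7 in the second, and so on, so
programming one counter leaves its three neighbours untouched.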

Reviewed-by: Jonathan Cameron 
Signed-off-by: Shaokun Zhang 
Signed-off-by: Anurup M 
---
 drivers/perf/hisilicon/Makefile  |   2 +-
 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 436 +++
 2 files changed, 437 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c

diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile
index 4a3d3e6..a72afe8 100644
--- a/drivers/perf/hisilicon/Makefile
+++ b/drivers/perf/hisilicon/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o
+obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o hisi_uncore_hha_pmu.o
diff --git a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
new file mode 100644
index 000..6798d5f
--- /dev/null
+++ b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
@@ -0,0 +1,436 @@
+/*
+ * HiSilicon SoC HHA uncore Hardware event counters support
+ *
+ * Copyright (C) 2017 Hisilicon Limited
+ * Author: Shaokun Zhang 
+ * Anurup M 
+ *
+ * This code is based on the uncore PMUs like arm-cci and arm-ccn.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hisi_uncore_pmu.h"
+
+/* HHA register definition */
+#define HHA_INT_MASK   0x0804
+#define HHA_INT_STATUS 0x0808
+#define HHA_INT_CLEAR  0x080C
+#define HHA_PERF_CTRL  0x1E00
+#define HHA_EVENT_CTRL 0x1E04
+#define HHA_EVENT_TYPE0 0x1E80
+#define HHA_CNT0_LOWER 0x1F00
+
+/* HHA has 16-counters and supports 0x50 events */
+#define HHA_NR_COUNTERS 0x10
+#define HHA_NR_EVENTS  0x50
+
+#define HHA_PERF_CTRL_EN   0x1
+#define HHA_EVTYPE_NONE 0xff
+
+#define HHA_EVTYPE_REG(idx) (HHA_EVENT_TYPE0 + 4 * ((idx) / 4))
+
+/*
+ * Select the counter register offset using the counter index
+ * every counter is 48-bits and [48:63] is reserved.
+ */
+static u32 get_counter_reg_off(int cntr_idx)
+{
+   return (HHA_CNT0_LOWER + (cntr_idx * 8));
+}
+
+static u64 hisi_hha_pmu_read_counter(struct hisi_pmu *hha_pmu,
+struct hw_perf_event *hwc)
+{
+   u32 idx = hwc->idx;
+   u32 reg;
+
+   if (!hisi_uncore_pmu_counter_valid(hha_pmu, idx)) {
+   dev_err(hha_pmu->dev, "Unsupported event index:%d!\n", idx);
+   return 0;
+   }
+
+   reg = get_counter_reg_off(idx);
+
+   /* Read 64 bits and like L3C, top 16 bits are RAZ */
+   return readq(hha_pmu->base + reg);
+}
+
+static void hisi_hha_pmu_write_counter(struct hisi_pmu *hha_pmu,
+  struct hw_perf_event *hwc, u64 val)
+{
+   u32 idx = hwc->idx;
+   u32 reg;
+
+   if (!hisi_uncore_pmu_counter_valid(hha_pmu, idx)) {
+   dev_err(hha_pmu->dev, "Unsupported event index:%d!\n", idx);
+   return;
+   }
+
+   reg = get_counter_reg_off(idx);
+   /* Write 64 bits and like L3C, top 16 bits are WI */
+   writeq(val, hha_pmu->base + reg);
+}
+
+static void hisi_hha_pmu_write_evtype(struct hisi_pmu *hha_pmu, int idx,
+ u32 type)
+{
+   u32 reg, reg_idx, shift, val;
+
+   /*
+* Select the appropriate event select register(HHA_EVENT_TYPEx).
+* There are 4 event select registers for the 16 hardware counters.
+* Event code is 8-bits and for the first 4 hardware counters,
+* HHA_EVENT_TYPE0 is chosen. For the next 4 hardware counters,
+* HHA_EVENT_TYPE1 is chosen and so on.
+*/
+   reg = HHA_EVTYPE_REG(idx);
+   reg_idx = idx % 4;
+   shift = 8 * reg_idx;
+
+   /* Write event code to HHA_EVENT_TYPEx register */
+   val = readl(hha_pmu->base + reg);
+   val &= ~(HHA_EVTYPE_NONE << shift);
+   val |= (type << shift);
+   writel(val, hha_pmu->base + reg);
+}
+
+static void hisi_hha_pmu_start_counters(struct hisi_pmu *hha_pmu)
+{
+   u32 val;
+
+   /*
+* Set perf_enable bit in HHA_PERF_CTRL to start event
+* counting for all enabled counters.
+*/
+   val = readl(hha_pmu->base + HHA_PERF_CTRL);
+   val |= HHA_PERF_CTRL_EN;
+   writel(val, 

[PATCH v4 1/6] Documentation: perf: hisi: Documentation for HiSilicon SoC PMU driver

2017-07-25 Thread Shaokun Zhang
This patch adds documentation for the uncore PMUs on HiSilicon SoC.

Reviewed-by: Jonathan Cameron 
Signed-off-by: Shaokun Zhang 
Signed-off-by: Anurup M 
---
 Documentation/perf/hisi-pmu.txt | 52 +
 1 file changed, 52 insertions(+)
 create mode 100644 Documentation/perf/hisi-pmu.txt

diff --git a/Documentation/perf/hisi-pmu.txt b/Documentation/perf/hisi-pmu.txt
new file mode 100644
index 000..f45a03d
--- /dev/null
+++ b/Documentation/perf/hisi-pmu.txt
@@ -0,0 +1,52 @@
+HiSilicon SoC uncore Performance Monitoring Unit (PMU)
+======================================================
+The HiSilicon SoC chip comprises various independent system device PMUs
+such as L3 cache (L3C), Hydra Home Agent (HHA) and DDRC. These PMUs are
+independent and have hardware logic to gather statistics and performance
+information.
+
+HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster
+(CCL) is made up of 4 CPU cores sharing one L3 cache; each CPU die is
+called a Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has
+two HHAs (0 - 1) and four DDRCs (0 - 3).
+
+HiSilicon SoC uncore PMU driver
+-------------------------------
+Each device PMU has separate registers for event counting, control and
+interrupt, and the PMU driver shall register perf PMU drivers like L3C,
+HHA and DDRC etc. The available events and configuration options shall
+be described in the sysfs, see /sys/devices/hisi_* or /sys/bus/
+event_source/devices/hisi_*.
+The "perf list" command shall list the available events from sysfs.
+
+Each L3C, HHA and DDRC in one SCCL is registered as a separate PMU with perf.
+The PMU name will appear in the event listing as hisi_<module><index-id>_<sccl-id>,
+where "index-id" is the index of the module and "sccl-id" is the identifier of
+the SCCL.
+e.g. hisi_l3c0_1/rd_hit_cpipe is READ_HIT_CPIPE event of L3C index #0 and SCCL
+ID #1.
+e.g. hisi_hha0_1/rx_operations is RX_OPERATIONS event of HHA index #0 and SCCL
+ID #1.
+
+The driver also provides a "cpumask" sysfs attribute, which shows the CPU core
+ID used to count the uncore PMU event.
+
+Example usage of perf:
+$# perf list
+hisi_l3c0_3/rd_hit_cpipe/ [kernel PMU event]
+--
+hisi_l3c0_3/wr_hit_cpipe/ [kernel PMU event]
+--
+hisi_l3c0_1/rd_hit_cpipe/ [kernel PMU event]
+--
+hisi_l3c0_1/wr_hit_cpipe/ [kernel PMU event]
+--
+
+$# perf stat -a -e hisi_l3c0_1/rd_hit_cpipe/ sleep 5
+$# perf stat -a -e hisi_l3c0_1/config=0x02/ sleep 5
+
+The current driver does not support sampling, so "perf record" is unsupported.
+Attaching to a task is also unsupported, as all the events are uncore events.
+
+Note: Please contact the maintainer for a complete list of events supported for
+the PMU devices in the SoC and its information if needed.
-- 
1.9.1
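
The naming scheme described in the documentation patch above (for example,
hisi_l3c0_1 for L3C index #0 on SCCL ID #1) can be sketched as a small
formatting helper. The function name and signature are hypothetical and are
not part of the driver; only the resulting name layout follows the
documentation.

```c
#include <stdio.h>

/*
 * Build a PMU name of the form hisi_<module><index-id>_<sccl-id>.
 * Returns the snprintf() result: the length the full name needs,
 * excluding the terminating NUL.
 */
static int hisi_pmu_name(char *buf, size_t len, const char *module,
			 unsigned int index_id, unsigned int sccl_id)
{
	return snprintf(buf, len, "hisi_%s%u_%u", module, index_id, sccl_id);
}
```

The resulting string is what would be placed before the event name in a
"perf stat -e" invocation such as the ones shown in the documentation.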



[PATCH v4 3/6] perf: hisi: Add support for HiSilicon SoC L3C PMU driver

2017-07-25 Thread Shaokun Zhang
This patch adds support for the L3C PMU driver in the HiSilicon SoC. Each
L3C has its own control, counter and interrupt registers and is a separate
PMU. Each L3C PMU has 8 programmable counters and supports 0x60 events;
the event code is 8 bits wide and every counter is free-running. An
interrupt is supported to handle overflow of the 48-bit counters.
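
Because the counters are 48 bits wide, a difference between two raw reads has
to be taken modulo 2^48 to stay correct across a wrap. The sketch below shows
one common way to compute such a delta. It is an illustration under that
assumption, with hypothetical names, and is not claimed to be how this driver
accumulates counts.

```c
#include <stdint.h>

#define COUNTER_BITS	48
#define COUNTER_MASK	((UINT64_C(1) << COUNTER_BITS) - 1)

/*
 * Number of events between two raw reads of a 48-bit counter.
 * Unsigned subtraction followed by masking gives the right answer
 * even when the counter wrapped once between prev and now.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t now)
{
	return (now - prev) & COUNTER_MASK;
}
```

A counter that wrapped more than once between the two reads cannot be
recovered this way, which is why an overflow interrupt is useful.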

Reviewed-by: Jonathan Cameron 
Signed-off-by: Shaokun Zhang 
Signed-off-by: Anurup M 
---
 drivers/perf/hisilicon/Makefile  |   2 +-
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 538 +++
 include/linux/cpuhotplug.h   |   1 +
 3 files changed, 540 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c

diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile
index 2783bb3..4a3d3e6 100644
--- a/drivers/perf/hisilicon/Makefile
+++ b/drivers/perf/hisilicon/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o
+obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o
diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
new file mode 100644
index 000..33146bb
--- /dev/null
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -0,0 +1,538 @@
+/*
+ * HiSilicon SoC L3C uncore Hardware event counters support
+ *
+ * Copyright (C) 2017 Hisilicon Limited
+ * Author: Anurup M 
+ * Shaokun Zhang 
+ *
+ * This code is based on the uncore PMUs like arm-cci and arm-ccn.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hisi_uncore_pmu.h"
+
+/* L3C register definition */
+#define L3C_PERF_CTRL  0x0408
+#define L3C_INT_MASK   0x0800
+#define L3C_INT_STATUS 0x0808
+#define L3C_INT_CLEAR  0x080c
+#define L3C_EVENT_CTRL 0x1c00
+#define L3C_EVENT_TYPE0 0x1d00
+#define L3C_CNTR0_LOWER 0x1e00
+
+/* L3C has 8-counters and supports 0x60 events */
+#define L3C_NR_COUNTERS 0x8
+#define L3C_NR_EVENTS  0x60
+
+#define L3C_PERF_CTRL_EN   0x2
+#define L3C_EVTYPE_NONE 0xff
+
+/*
+ * Select the counter register offset using the counter index
+ * every counter is 48-bits and [48:63] is reserved.
+ */
+static u32 get_counter_reg_off(int cntr_idx)
+{
+   return (L3C_CNTR0_LOWER + (cntr_idx * 8));
+}
+
+static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu,
+struct hw_perf_event *hwc)
+{
+   u32 idx = hwc->idx;
+   u32 reg;
+
+   if (!hisi_uncore_pmu_counter_valid(l3c_pmu, idx)) {
+   dev_err(l3c_pmu->dev, "Unsupported event index:%d!\n", idx);
+   return 0;
+   }
+
+   reg = get_counter_reg_off(idx);
+
+   /* Read 64-bits and the upper 16 bits are Read-As-Zero */
+   return readq(l3c_pmu->base + reg);
+}
+
+static void hisi_l3c_pmu_write_counter(struct hisi_pmu *l3c_pmu,
+  struct hw_perf_event *hwc, u64 val)
+{
+   u32 idx = hwc->idx;
+   u32 reg;
+
+   if (!hisi_uncore_pmu_counter_valid(l3c_pmu, idx)) {
+   dev_err(l3c_pmu->dev, "Unsupported event index:%d!\n", idx);
+   return;
+   }
+
+   reg = get_counter_reg_off(idx);
+   /* Write 64-bits and the upper 16 bits are Writes-Ignored */
+   writeq(val, l3c_pmu->base + reg);
+}
+
+static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
+ u32 type)
+{
+   u32 reg, reg_idx, shift, val;
+
+   /*
+* Select the appropriate event select register(L3C_EVENT_TYPE0/1).
+* There are 2 event select registers for the 8 hardware counters.
+* Event code is 8-bits and for the former 4 hardware counters,
+* L3C_EVENT_TYPE0 is chosen. For the latter 4 hardware counters,
+* L3C_EVENT_TYPE1 is chosen.
+*/
+   reg = L3C_EVENT_TYPE0 + (idx / 4) * 4;
+   reg_idx = idx % 4;
+   shift = 8 * reg_idx;
+
+   /* Write event code to L3C_EVENT_TYPEx Register */
+   val = readl(l3c_pmu->base + reg);
+   val &= ~(L3C_EVTYPE_NONE << shift);
+   val |= (type << shift);
+   writel(val, l3c_pmu->base + reg);
+}
+
+static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu)
+{
+   u32 val;
+
+   /*
+* Set perf_enable bit in L3C_PERF_CTRL register to start counting
+* for all enabled counters.
+*/
+   val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+   val |= L3C_PERF_CTRL_EN;
+   writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+}
+
+static void 

[PATCH v4 5/6] perf: hisi: Add support for HiSilicon SoC DDRC PMU driver

2017-07-25 Thread Shaokun Zhang
This patch adds support for the DDRC PMU driver in the HiSilicon SoC. Each
DDRC has its own control, counter and interrupt registers and is a separate
PMU. Each DDRC PMU has 8 fixed-purpose counters, which the hardware maps
to 8 events; the driver therefore assumes that the counter index equals
the event code (0 - 7). An interrupt is supported to handle overflow of
the 32-bit counters.
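
With fixed-purpose counters, the event code doubles as an index into a table
of counter register offsets. The sketch below shows that lookup with a bounds
check. The offsets are the DDRC_FLUX_* and DDRC_*_CMD/CHG values defined in
the patch below; the function name and its return convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Counter register offsets, indexed by event code 0..7. */
static const uint32_t ddrc_reg_off[] = {
	0x380, 0x384, 0x388, 0x38c,	/* FLUX_WR, FLUX_RD, FLUX_WCMD, FLUX_RCMD */
	0x3c0, 0x3c4, 0x3c8, 0x3cc,	/* PRE_CMD, ACT_CMD, BNK_CHG, RNK_CHG */
};

#define DDRC_NR_EVENTS (sizeof(ddrc_reg_off) / sizeof(ddrc_reg_off[0]))

/*
 * Look up the counter register offset for an event code.
 * Returns 0 and stores the offset in *off on success, or -1 if the
 * event code is out of range (*off is left untouched in that case).
 */
static int ddrc_counter_offset(unsigned int event, uint32_t *off)
{
	if (event >= DDRC_NR_EVENTS)
		return -1;

	*off = ddrc_reg_off[event];
	return 0;
}
```

Checking the index before the table access keeps an invalid event code from
reading past the end of the array.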

Reviewed-by: Jonathan Cameron 
Signed-off-by: Shaokun Zhang 
Signed-off-by: Anurup M 
---
 drivers/perf/hisilicon/Makefile   |   2 +-
 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 420 ++
 2 files changed, 421 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c

diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile
index a72afe8..2621d51 100644
--- a/drivers/perf/hisilicon/Makefile
+++ b/drivers/perf/hisilicon/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o hisi_uncore_hha_pmu.o
+obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o hisi_uncore_hha_pmu.o hisi_uncore_ddrc_pmu.o
diff --git a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
new file mode 100644
index 000..e178a09
--- /dev/null
+++ b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
@@ -0,0 +1,420 @@
+/*
+ * HiSilicon SoC DDRC uncore Hardware event counters support
+ *
+ * Copyright (C) 2017 Hisilicon Limited
+ * Author: Shaokun Zhang 
+ * Anurup M 
+ *
+ * This code is based on the uncore PMUs like arm-cci and arm-ccn.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hisi_uncore_pmu.h"
+
+/* DDRC register definition */
+#define DDRC_PERF_CTRL 0x010
+#define DDRC_FLUX_WR   0x380
+#define DDRC_FLUX_RD   0x384
+#define DDRC_FLUX_WCMD  0x388
+#define DDRC_FLUX_RCMD  0x38c
+#define DDRC_PRE_CMD 0x3c0
+#define DDRC_ACT_CMD 0x3c4
+#define DDRC_BNK_CHG 0x3c8
+#define DDRC_RNK_CHG 0x3cc
+#define DDRC_EVENT_CTRL 0x6C0
+#define DDRC_INT_MASK  0x6c8
+#define DDRC_INT_STATUS 0x6cc
+#define DDRC_INT_CLEAR 0x6d0
+
+/* DDRC supports 8-events and counter is fixed-purpose */
+#define DDRC_NR_COUNTERS   0x8
+#define DDRC_NR_EVENTS DDRC_NR_COUNTERS
+
+#define DDRC_PERF_CTRL_EN  0x2
+
+/*
+ * For DDRC PMU, there are eight events and every event is mapped to a
+ * fixed-purpose counter whose register offset is not contiguous.
+ * Therefore there is no event type register to write and we assume that
+ * event code (0 to 7) is equal to counter index in PMU driver.
+ */
+#define GET_DDRC_EVENTID(hwc)  (hwc->config_base & 0x7)
+
+static const u32 ddrc_reg_off[] = {
+   DDRC_FLUX_WR, DDRC_FLUX_RD, DDRC_FLUX_WCMD, DDRC_FLUX_RCMD,
+   DDRC_PRE_CMD, DDRC_ACT_CMD, DDRC_BNK_CHG, DDRC_RNK_CHG
+};
+
+/*
+ * Select the counter register offset using the counter index.
+ * In DDRC there are no programmable counters; the count
+ * is read from the statistics counter register itself.
+ */
+static u32 get_counter_reg_off(int cntr_idx)
+{
+   return ddrc_reg_off[cntr_idx];
+}
+
+static u64 hisi_ddrc_pmu_read_counter(struct hisi_pmu *ddrc_pmu,
+ struct hw_perf_event *hwc)
+{
+   /* Use event code as counter index */
+   u32 idx = GET_DDRC_EVENTID(hwc);
+   u32 reg;
+
+   if (!hisi_uncore_pmu_counter_valid(ddrc_pmu, idx)) {
+   dev_err(ddrc_pmu->dev, "Unsupported event index:%d!\n", idx);
+   return 0;
+   }
+
+   reg = get_counter_reg_off(idx);
+
+   return readl(ddrc_pmu->base + reg);
+}
+
+static void hisi_ddrc_pmu_write_counter(struct hisi_pmu *ddrc_pmu,
+   struct hw_perf_event *hwc, u64 val)
+{
+   u32 idx = GET_DDRC_EVENTID(hwc);
+   u32 reg;
+
+   if (!hisi_uncore_pmu_counter_valid(ddrc_pmu, idx)) {
+   dev_err(ddrc_pmu->dev, "Unsupported event index:%d!\n", idx);
+   return;
+   }
+
+   reg = get_counter_reg_off(idx);
+   writel((u32)val, ddrc_pmu->base + reg);
+}
+
+static void hisi_ddrc_pmu_start_counters(struct hisi_pmu *ddrc_pmu)
+{
+   u32 val;
+
+   /* Set perf_enable in DDRC_PERF_CTRL to start event counting */
+   val = readl(ddrc_pmu->base + DDRC_PERF_CTRL);
+   val |= DDRC_PERF_CTRL_EN;
+   writel(val, ddrc_pmu->base + DDRC_PERF_CTRL);
+}
+
+static void hisi_ddrc_pmu_stop_counters(struct hisi_pmu *ddrc_pmu)
+{
+   u32 val;
+
+   /* Clear perf_enable in DDRC_PERF_CTRL to stop event 

[PATCH v4 6/6] arm64: MAINTAINERS: hisi: Add HiSilicon SoC PMU support

2017-07-25 Thread Shaokun Zhang
Add a MAINTAINERS entry for the HiSilicon SoC uncore PMU driver.

Signed-off-by: Shaokun Zhang 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 205d397..649b144 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6197,6 +6197,13 @@ S:   Maintained
 F: drivers/net/ethernet/hisilicon/
 F: Documentation/devicetree/bindings/net/hisilicon*.txt
 
+HISILICON PMU DRIVER
+M: Shaokun Zhang 
+W: http://www.hisilicon.com
+S: Supported
+F: drivers/perf/hisilicon
+F: Documentation/perf/hisi-pmu.txt
+
 HISILICON ROCE DRIVER
 M: Lijun Ou 
 M: Wei Hu(Xavier) 
-- 
1.9.1



Re: [PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-25 Thread Jan Kara
On Tue 25-07-17 10:01:58, Christoph Hellwig wrote:
> On Tue, Jul 25, 2017 at 01:14:00AM +0300, Kirill A. Shutemov wrote:
> > I guess it's up to filesystem if it wants to reuse the same spot to write
> > data or not. I think your assumptions works for ext4 and xfs. I wouldn't
> > be that sure for btrfs or other filesystems with CoW support.
> 
> Or XFS with reflinks for that matter.  Which currently can't be
> combined with DAX, but I had a somewhat working version a few months
> ago.

But in cases like COW when the block mapping changes, the process
must run unmap_mapping_range() before installing the new PTE so that all
processes mapping this file offset actually refault and see the new
mapping. So this would go through pte_none() case. Am I missing something?

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 12/13] net: dsa: lan9303: Added "stp_enable" sysfs attribute

2017-07-25 Thread Egil Hjelmeland

On 24 July 2017 18:55, Florian Fainelli wrote:
> On 07/20/2017 06:42 AM, Egil Hjelmeland wrote:
> > Must be set to 1 by user space when STP is used on the lan9303.
> > If bridging without local STP, leave at 0, so external STP BPDUs
> > are forwarded.
> >
> > Hopefully the kernel can be improved so the driver can handle this
> > without user intervention, and this control can be removed.
>
> Same here, we can't have a driver-specific sysfs attribute just for
> this, either we find a way to have the bridge's STP settings propagate
> correctly to the switch driver, or you have to make better decisions
> based on hints/calls you are getting from switchdev -> dsa -> driver.

I can't see that the driver gets enough information now. But please
correct me if I am wrong. The problem is that when multicast_flood is
disabled, the BPDUs are not forwarded by the SW bridge, so I cannot
keep the 01:80:c2:00:00:00 entry installed permanently.

Perhaps the kernel could do port_fdb_add/del on 01:80:c2:00:00:00
when STP is turned on/off? Or could that break other DSA chips?

While we are at it, it would be good if the driver could return some
capability information to the kernel, so it can adapt accordingly.
It does not feel right that user space has to disable the _flood
flags; the kernel should be able to figure that out by itself.

DISCLAIMER:
This e-mail may contain confidential and privileged material for the sole use 
of the intended recipient. Any review, use, distribution or disclosure by 
others is strictly prohibited. If you are not the intended recipient (or 
authorized to receive for the recipient), please contact the sender by reply 
e-mail and delete all copies of this message.


Re: [PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-25 Thread Christoph Hellwig
On Tue, Jul 25, 2017 at 01:14:00AM +0300, Kirill A. Shutemov wrote:
> I guess it's up to filesystem if it wants to reuse the same spot to write
> data or not. I think your assumptions works for ext4 and xfs. I wouldn't
> be that sure for btrfs or other filesystems with CoW support.

Or XFS with reflinks for that matter.  Which currently can't be
combined with DAX, but I had a somewhat working version a few months
ago.


Re: [PATCH 13/13] net: dsa: lan9303: lan9303_port_mdb_del remove port 0

2017-07-25 Thread Egil Hjelmeland

On 24 July 2017 18:57, Florian Fainelli wrote:
> On 07/20/2017 06:57 AM, Egil Hjelmeland wrote:
> > Workaround for dsa_switch_mdb_add adding CPU port to group,
> > but forgetting to remove it:
>
> Should not we move this logic one layer above into DSA then such that
> insertions and removals are strictly symmetrical in which and how many
> ports are targeted?

Agreed. I included the patch more as a bug report. I will remove it in
patch v2. I don't really feel competent to fix the issue in DSA. It is
probably better if you DSA people look at it. I suspect DSA needs
to do more bookkeeping?

Egil



Re: [PATCH 00/13] net: dsa: lan9303: unicast offload, fdb,mdb,STP

2017-07-25 Thread Egil Hjelmeland

On 24 July 2017 22:32, David Miller wrote:
> They are all over the place, over a period of 3 days.

I will do "git rebase --ignore-date master" from now on.

> You must also say in your subject line which of my two GIT networking
> trees ('net' or 'net-next') your changes are targeting.  If you don't
> know, you need to figure that out before submitting.

Makes sense. I just found Documentation/networking/netdev-FAQ.txt;
reading that made it even clearer.

> I'm not applying this series until you fix your process up.

No problem, I did not expect the first version to go through anyway.

Egil





Re: [PATCH 00/13] net: dsa: lan9303: unicast offload, fdb,mdb,STP

2017-07-25 Thread Egil Hjelmeland
On 24 July 2017 18:54, Florian Fainelli wrote:
> 
> First thing would be to get your patch submissions square, because the
> patches do not appear to have been sent as a reply to this cover letter,
> and worse yet, they are all appearing with their commit date, which is
> highly confusing since that makes them go back in time for some of them.
> 

Hi all!

I am very sorry for the email-thread mess. Once the emails showed up on
the spinics mirror I realized I had made a fool of myself. I see now
that I have to add --thread to "git format-patch" when _not_ using
"git send-email" as the backend. (I could not get "git send-email" to work
with the company email server.)

I had noted that "git format-patch" preserved commit dates, but I
wrongly thought that was "a feature, not a bug". From now on I will
make sure to "git rebase --ignore-date master".

Egil
