Re: [PATCH 2/2] printk: Add boottime and real timestamps
On 07/25/2017 06:00 AM, Peter Zijlstra wrote:
> On Tue, Jul 25, 2017 at 08:17:27AM -0400, Prarit Bhargava wrote:
>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>> index 5b1662ec546f..6cd38a25f8ea 100644
>> --- a/lib/Kconfig.debug
>> +++ b/lib/Kconfig.debug
>> @@ -1,8 +1,8 @@
>>  menu "printk and dmesg options"
>>
>>  config PRINTK_TIME
>> -	int "Show timing information on printks (0-1)"
>> -	range 0 1
>> +	int "Show timing information on printks (0-3)"
>> +	range 0 3
>>  	default "0"
>>  	depends on PRINTK
>>  	help
>> @@ -13,7 +13,8 @@ config PRINTK_TIME
>>  	  The timestamp is always recorded internally, and exported
>>  	  to /dev/kmsg. This flag just specifies if the timestamp should
>>  	  be included, not that the timestamp is recorded. 0 disables the
>> -	  timestamp and 1 uses the local clock.
>> +	  timestamp and 1 uses the local clock, 2 uses the monotonic clock, and
>> +	  3 uses real clock.
>>
>>  	  The behavior is also controlled by the kernel command line
>>  	  parameter printk.time=1. See Documentation/admin-guide/kernel-parameters.rst
>
> choice
> 	prompt "printk default clock"
> 	default PRINTK_TIME_DISABLE
> 	help
> 	 goes here
>
> config PRINTK_TIME_DISABLE
> 	bool "Disabled"
> 	help
> 	 goes here
>
> config PRINTK_TIME_LOCAL
> 	bool "local clock"
> 	help
> 	 goes here
>
> config PRINTK_TIME_MONO
> 	bool "CLOCK_MONOTONIC"
> 	help
> 	 goes here
>
> config PRINTK_TIME_REAL
> 	bool "CLOCK_REALTIME"
> 	help
> 	 goes here
>
> endchoice
>
> config PRINTK_TIME
> 	int
> 	default 0 if PRINTK_TIME_DISABLE
> 	default 1 if PRINTK_TIME_LOCAL
> 	default 2 if PRINTK_TIME_MONO
> 	default 3 if PRINTK_TIME_REAL
>
> Although I must strongly discourage using REALTIME, DST will make
> untangling your logs an absolute nightmare. I would simply not provide it.

I agree with using select; it ensures that only valid values land. It does
mean that CONFIG_PRINTK_TIME in effect gets deprecated.

REALTIME is always UTC in the kernel.

What about BOOTTIME?

-- Mark
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
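The 0-3 numbering discussed above can be illustrated with a small userspace sketch. It is not kernel code: the enum names simply mirror the Kconfig symbols proposed in the thread, and printk_time_mode_name() and format_stamp() are hypothetical helpers invented for this illustration. The "[%5llu.%06llu] " layout is assumed to match the familiar dmesg-style prefix.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names mirroring the 0-3 range proposed in the thread. */
enum printk_time_mode {
	PRINTK_TIME_DISABLE = 0,
	PRINTK_TIME_LOCAL   = 1,
	PRINTK_TIME_MONO    = 2,
	PRINTK_TIME_REAL    = 3,
};

/* Return a readable clock name for a mode, or NULL if it is out of range. */
static const char *printk_time_mode_name(int mode)
{
	switch (mode) {
	case PRINTK_TIME_DISABLE: return "disabled";
	case PRINTK_TIME_LOCAL:   return "local clock";
	case PRINTK_TIME_MONO:    return "CLOCK_MONOTONIC";
	case PRINTK_TIME_REAL:    return "CLOCK_REALTIME";
	default:                  return NULL;
	}
}

/* Format a nanosecond count as "[sssss.uuuuuu] "; returns chars written. */
static int format_stamp(uint64_t ns, char *buf, size_t len)
{
	unsigned long long sec = ns / 1000000000ULL;
	unsigned long long usec = (ns % 1000000000ULL) / 1000ULL;

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "[%5llu.%06llu] ", sec, usec);
}
```

A choice block as suggested by Peter guarantees that only one of the four symbols is set, so the resulting integer can never fall into the default branch of a mapping like this one.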
Re: [PATCH net-next v2 01/10] net: dsa: lan9303: Fixed MDIO interface
Hi Egil,

Egil Hjelmeland writes:

> Fixes after testing on actual HW:
>
> - lan9303_mdio_write()/_read() must multiply register number
>   by 4 to get offset
>
> - Indirect access (PMI) to phy register only work in I2C mode. In
>   MDIO mode phy registers must be accessed directly. Introduced
>   struct lan9303_phy_ops to handle the two modes. Renamed functions
>   to clarify.
>
> - lan9303_detect_phy_setup() : Failed MDIO read return 0xffff.
>   Handle that.

Small patch series are better when possible. Bullet points in a commit
message usually describe how a patch or series may be split up ;-)

This patch seems to be the only patch of the series that resolves what
the cover letter describes as "Make the MDIO interface work".

I'd suggest splitting this one commit into several *atomic* and easy to
review patches and sending them separately as one thread named
"net: dsa: lan9303: fix MDIO interface" (also note that the imperative
is preferred for subject lines, see:
https://chris.beams.io/posts/git-commit/#imperative)

<...>

> -static int lan9303_port_phy_reg_wait_for_completion(struct lan9303 *chip)
> +static int lan9303_indirect_phy_wait_for_completion(struct lan9303 *chip)

For instance, you can have a first commit that only renames the
functions. Separating the functional changes from the cosmetic ones
makes the series easier to review.

<...>

> -	if (reg != 0)
> +	if ((reg != 0) && (reg != 0xffff))

if (reg && reg != 0xffff) should be enough.

>  		chip->phy_addr_sel_strap = 1;
>  	else
>  		chip->phy_addr_sel_strap = 0;

<...>

> +struct lan9303;
> +
> +struct lan9303_phy_ops {
> +	/* PHY 1 &2 access*/

The spacing is weird in the comment. "/* PHY 1 & 2 access */" maybe?

<...>

> +int lan9303_mdio_phy_write(struct lan9303 *chip, int phy, int regnum, u16 val)
> +{
> +	struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev);
> +	struct mdio_device *mdio = sw_dev->device;
> +
> +	mutex_lock(&mdio->bus->mdio_lock);
> +	mdio->bus->write(mdio->bus, phy, regnum, val);
> +	mutex_unlock(&mdio->bus->mdio_lock);

This is exactly what mdiobus_write(mdio->bus, phy, regnum, val) does.
There are very few valid reasons to go play in the mii_bus structure;
using the generic APIs is strongly preferred. Plus you get checks and
traces for free!

> +
> +	return 0;
> +}
> +
> +int lan9303_mdio_phy_read(struct lan9303 *chip, int phy, int reg)
> +{
> +	struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev);
> +	struct mdio_device *mdio = sw_dev->device;
> +	int val;
> +
> +	mutex_lock(&mdio->bus->mdio_lock);
> +	val = mdio->bus->read(mdio->bus, phy, reg);
> +	mutex_unlock(&mdio->bus->mdio_lock);

Same here, mdiobus_read().

Thanks,

Vivien
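The two points reviewed above, an ops table that hides the access mode and the simplified strap test, can be tried out in userspace with a sketch like the following. Everything here is a stand-in: struct chip, struct phy_ops and the fake_* backend are invented for illustration and only loosely mirror the driver's struct lan9303 and struct lan9303_phy_ops; the fake bus returns 0xffff on a failed read, as the commit message reports for real MDIO hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct chip;

/* One interface, several possible backends (indirect/I2C, direct/MDIO). */
struct phy_ops {
	int (*phy_read)(struct chip *chip, int phy, int regnum);
	int (*phy_write)(struct chip *chip, int phy, int regnum, uint16_t val);
};

/* Illustrative stand-in for the driver state, not the real struct lan9303. */
struct chip {
	const struct phy_ops *ops;
	uint16_t regs[4][32];	/* fake register file: 4 PHYs x 32 registers */
	int bus_dead;		/* when set, every access fails */
	int phy_addr_sel_strap;
};

static int fake_access_ok(const struct chip *chip, int phy, int regnum)
{
	return !chip->bus_dead && phy >= 0 && phy < 4 &&
	       regnum >= 0 && regnum < 32;
}

static int fake_read(struct chip *chip, int phy, int regnum)
{
	if (!fake_access_ok(chip, phy, regnum))
		return 0xffff;	/* failed read, as described in the patch */
	return chip->regs[phy][regnum];
}

static int fake_write(struct chip *chip, int phy, int regnum, uint16_t val)
{
	if (!fake_access_ok(chip, phy, regnum))
		return -1;
	chip->regs[phy][regnum] = val;
	return 0;
}

static const struct phy_ops fake_direct_ops = {
	.phy_read = fake_read,
	.phy_write = fake_write,
};

/* Register 0x12 of PHY 3: non-zero and not 0xffff means strap == 1. */
static void detect_phy_setup(struct chip *chip)
{
	int reg;

	assert(chip != NULL && chip->ops != NULL);
	reg = chip->ops->phy_read(chip, 3, 0x12);
	chip->phy_addr_sel_strap = (reg && reg != 0xffff) ? 1 : 0;
}
```

The caller never needs to know which backend is installed, which is the reason for introducing the ops structure in the first place.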
[PATCH net-next v2 06/10] net: dsa: lan9303: added sysfs node swe_bcst_throt
Allow per-port access to the Switch Engine Broadcast Throttling Register.

Also add lan9303_write_switch_reg_mask().

Signed-off-by: Egil Hjelmeland
---
 drivers/net/dsa/lan9303-core.c | 83 ++
 1 file changed, 83 insertions(+)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index be6d78f45a5f..b70acb73aad6 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -154,6 +154,7 @@
 # define LAN9303_SWE_PORT_MIRROR_ENABLE_RX_MIRRORING BIT(1)
 # define LAN9303_SWE_PORT_MIRROR_ENABLE_TX_MIRRORING BIT(0)
 #define LAN9303_SWE_INGRESS_PORT_TYPE 0x1847
+#define LAN9303_SWE_BCST_THROT 0x1848
 #define LAN9303_BM_CFG 0x1c00
 #define LAN9303_BM_EGRSS_PORT_TYPE 0x1c0c
 # define LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT2 (BIT(17) | BIT(16))
@@ -426,6 +427,20 @@ static int lan9303_read_switch_reg(struct lan9303 *chip, u16 regnum, u32 *val)
 	return ret;
 }
 
+static int lan9303_write_switch_reg_mask(
+	struct lan9303 *chip, u16 regnum, u32 val, u32 mask)
+{
+	int ret;
+	u32 reg;
+
+	ret = lan9303_read_switch_reg(chip, regnum, &reg);
+	if (ret)
+		return ret;
+	reg = (reg & ~mask) | val;
+
+	return lan9303_write_switch_reg(chip, regnum, reg);
+}
+
 static int lan9303_detect_phy_setup(struct lan9303 *chip)
 {
 	int reg;
@@ -614,6 +629,66 @@ static int lan9303_check_device(struct lan9303 *chip)
 	return 0;
 }
 
+/* -- Sysfs on slave port --*/
+/*13.4.3.23 Switch Engine Broadcast Throttling Register (SWE_BCST_THROT)*/
+static ssize_t
+swe_bcst_throt_show(struct device *dev, struct device_attribute *attr,
+		    char *buf)
+{
+	struct dsa_port *dp = dsa_net_device_to_dsa_port(to_net_dev(dev));
+	struct lan9303 *chip = dp->ds->priv;
+	int port = dp->index;
+	int reg;
+
+	if (lan9303_read_switch_reg(chip, LAN9303_SWE_BCST_THROT, &reg))
+		return 0;
+
+	reg = (reg >> (9 * port)) & 0x1ff; /*extract port N*/
+	if (reg & 0x100)
+		reg &= 0xff; /* remove enable bit */
+	else
+		reg = 0; /* not enabled*/
+
+	return scnprintf(buf, PAGE_SIZE, "%d\n", reg);
+}
+
+static ssize_t
+swe_bcst_throt_store(struct device *dev, struct device_attribute *attr,
+		     const char *buf, size_t len)
+{
+	struct dsa_port *dp = dsa_net_device_to_dsa_port(to_net_dev(dev));
+	struct lan9303 *chip = dp->ds->priv;
+	int port = dp->index;
+	int ret;
+	unsigned long level;
+
+	ret = kstrtoul(buf, 0, &level);
+	if (ret)
+		return ret;
+	level &= 0xff; /* ensure valid range */
+	if (level)
+		level |= 0x100; /* Set enable bit */
+
+	ret = lan9303_write_switch_reg_mask(chip, LAN9303_SWE_BCST_THROT,
+					    level << (9 * port),
+					    0x1ff << (9 * port));
+	if (ret)
+		return ret;
+	return len;
+}
+
+static DEVICE_ATTR_RW(swe_bcst_throt);
+
+static struct attribute *lan9303_attrs[] = {
+	&dev_attr_swe_bcst_throt.attr,
+	NULL
+};
+
+static struct attribute_group lan9303_group = {
+	.name = "lan9303",
+	.attrs = lan9303_attrs,
+};
+
 /* DSA ---*/
 
 static enum dsa_tag_protocol lan9303_get_tag_protocol(struct dsa_switch *ds)
@@ -787,6 +862,11 @@ static int lan9303_port_enable(struct dsa_switch *ds, int port,
 	switch (port) {
 	case 1:
 	case 2:
+		/* lan9303_setup is too early to attach sysfs nodes... */
+		if (sysfs_create_group(
+				&ds->ports[port].netdev->dev.kobj,
+				&lan9303_group))
+			dev_dbg(chip->dev, "cannot create sysfs group\n");
 		return lan9303_enable_packet_processing(chip, port);
 	default:
 		dev_dbg(chip->dev,
@@ -805,6 +885,9 @@ static void lan9303_port_disable(struct dsa_switch *ds, int port,
 	switch (port) {
 	case 1:
 	case 2:
+		sysfs_remove_group(&ds->ports[port].netdev->dev.kobj,
+				   &lan9303_group);
+
 		lan9303_disable_packet_processing(chip, port);
 		lan9303_phy_write(ds, chip->phy_addr_sel_strap + port,
 				  MII_BMCR, BMCR_PDOWN);
--
2.11.0
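The per-port field layout used by the show/store handlers above (9 bits per port, bit 8 as enable, bits 0-7 as level) can be exercised in isolation with a userspace sketch. The helper names below are hypothetical. One deliberate deviation from the patch: update_bits() masks the new value before merging, whereas lan9303_write_switch_reg_mask() as posted ORs val in unmasked.

```c
#include <assert.h>
#include <stdint.h>

#define BCST_THROT_BITS_PER_PORT 9
#define BCST_THROT_ENABLE        0x100u
#define BCST_THROT_FIELD         0x1ffu

/* Read-modify-write merge; val is masked so stray bits cannot leak. */
static uint32_t update_bits(uint32_t reg, uint32_t val, uint32_t mask)
{
	return (reg & ~mask) | (val & mask);
}

/* Store a throttle level for one port; level 0 clears the enable bit. */
static uint32_t bcst_throt_set(uint32_t reg, int port, unsigned long level)
{
	uint32_t field;
	int shift;

	assert(port >= 0 && port <= 2);
	field = (uint32_t)(level & 0xff);
	if (field)
		field |= BCST_THROT_ENABLE;
	shift = BCST_THROT_BITS_PER_PORT * port;

	return update_bits(reg, field << shift, BCST_THROT_FIELD << shift);
}

/* Extract the level for one port; a disabled port reads back as 0. */
static unsigned int bcst_throt_get(uint32_t reg, int port)
{
	uint32_t field;

	assert(port >= 0 && port <= 2);
	field = (reg >> (BCST_THROT_BITS_PER_PORT * port)) & BCST_THROT_FIELD;

	return (field & BCST_THROT_ENABLE) ? (field & 0xff) : 0;
}
```

Because each port owns its own 9-bit slice, updating one port leaves the other two untouched, which is what the mask argument in the patch is there to guarantee.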
[PATCH net-next v2 02/10] net: dsa: lan9303: Do not disable/enable switch fabric port 0 at startup
For some unknown reason, enabling switch fabric port 0 TX fails when TX
has previously been disabled.

Work around this by not disabling and re-enabling switch fabric port 0
at startup. Ports 1 and 2 are still disabled in early init.

Signed-off-by: Egil Hjelmeland
---
 drivers/net/dsa/lan9303-core.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index e622db586c3d..c2b53659f58f 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -557,9 +557,6 @@ static int lan9303_disable_processing(struct lan9303 *chip)
 {
 	int ret;
 
-	ret = lan9303_disable_packet_processing(chip, LAN9303_PORT_0_OFFSET);
-	if (ret)
-		return ret;
 	ret = lan9303_disable_packet_processing(chip, LAN9303_PORT_1_OFFSET);
 	if (ret)
 		return ret;
@@ -633,10 +630,6 @@ static int lan9303_setup(struct dsa_switch *ds)
 	if (ret)
 		dev_err(chip->dev, "failed to separate ports %d\n", ret);
 
-	ret = lan9303_enable_packet_processing(chip, LAN9303_PORT_0_OFFSET);
-	if (ret)
-		dev_err(chip->dev, "failed to re-enable switching %d\n", ret);
-
 	return 0;
 }
 
--
2.11.0
[PATCH net-next v2 05/10] net: dsa: added dsa_net_device_to_dsa_port()
Allow DSA drivers to attach sysfs nodes.

Signed-off-by: Egil Hjelmeland
---
 include/net/dsa.h | 1 +
 net/dsa/slave.c | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 88da272d20d0..a71c0a2401ee 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -450,6 +450,7 @@ void unregister_switch_driver(struct dsa_switch_driver *type);
 struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev);
 
 struct net_device *dsa_dev_to_net_device(struct device *dev);
+struct dsa_port *dsa_net_device_to_dsa_port(struct net_device *dev);
 
 /* Keep inline for faster access in hot path */
 static inline bool netdev_uses_dsa(struct net_device *dev)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 9507bd38cf04..40410f1740de 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -209,6 +209,16 @@ static int dsa_slave_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	return -EOPNOTSUPP;
 }
 
+struct dsa_port *dsa_net_device_to_dsa_port(struct net_device *dev)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+
+	if (!dsa_slave_dev_check(dev))
+		return NULL;
+	return p->dp;
+}
+EXPORT_SYMBOL_GPL(dsa_net_device_to_dsa_port);
+
 static int dsa_slave_port_attr_set(struct net_device *dev,
 				   const struct switchdev_attr *attr,
 				   struct switchdev_trans *trans)
--
2.11.0
[PATCH net-next v2 08/10] net: dsa: lan9303: Added ALR/fdb/mdb handling
Add functions for accessing and managing the lan9303 ALR (Address Logic
Resolution).

Implement the DSA methods set_addr, port_fast_age, port_fdb_prepare,
port_fdb_add, port_fdb_del, port_fdb_dump, port_mdb_prepare,
port_mdb_add and port_mdb_del.

Since the lan9303 does not offer a way to read a specific ALR entry, the
driver caches all static entries in a flat table.

Signed-off-by: Egil Hjelmeland
---
 drivers/net/dsa/lan9303-core.c | 369 +
 drivers/net/dsa/lan9303.h | 11 ++
 2 files changed, 380 insertions(+)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 426a75bd89f4..dc95973d62ed 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 
 #include "lan9303.h"
 
@@ -121,6 +122,21 @@
 #define LAN9303_MAC_RX_CFG_2 0x0c01
 #define LAN9303_MAC_TX_CFG_2 0x0c40
 #define LAN9303_SWE_ALR_CMD 0x1800
+# define ALR_CMD_MAKE_ENTRY BIT(2)
+# define ALR_CMD_GET_FIRST BIT(1)
+# define ALR_CMD_GET_NEXT BIT(0)
+#define LAN9303_SWE_ALR_WR_DAT_0 0x1801
+#define LAN9303_SWE_ALR_WR_DAT_1 0x1802
+# define ALR_DAT1_VALID BIT(26)
+# define ALR_DAT1_END_OF_TABL BIT(25)
+# define ALR_DAT1_AGE_OVERRID BIT(25)
+# define ALR_DAT1_STATIC BIT(24)
+# define ALR_DAT1_PORT_BITOFFS 16
+# define ALR_DAT1_PORT_MASK (7 << ALR_DAT1_PORT_BITOFFS)
+#define LAN9303_SWE_ALR_RD_DAT_0 0x1805
+#define LAN9303_SWE_ALR_RD_DAT_1 0x1806
+#define LAN9303_SWE_ALR_CMD_STS 0x1808
+# define ALR_STS_MAKE_PEND BIT(0)
 #define LAN9303_SWE_VLAN_CMD 0x180b
 # define LAN9303_SWE_VLAN_CMD_RNW BIT(5)
 # define LAN9303_SWE_VLAN_CMD_PVIDNVLAN BIT(4)
@@ -473,6 +489,229 @@ static int lan9303_detect_phy_setup(struct lan9303 *chip)
 	return 0;
 }
 
+/* - Address Logic Resolution (ALR)--*/
+
+/* Map ALR-port bits to port bitmap, and back*/
+static const int alrport_2_portmap[] = {1, 2, 4, 0, 3, 5, 6, 7 };
+static const int portmap_2_alrport[] = {3, 0, 1, 4, 2, 5, 6, 7 };
+
+/* ALR: Cache static entries: mac address + port bitmap */
+
+/* Return pointer to first free ALR cache entry, return NULL if none */
+static struct lan9303_alr_cache_entry *lan9303_alr_cache_find_free(
+	struct lan9303 *chip)
+{
+	int i;
+	struct lan9303_alr_cache_entry *entr = chip->alr_cache;
+
+	for (i = 0; i < LAN9303_NUM_ALR_RECORDS; i++, entr++)
+		if (entr->port_map == 0)
+			return entr;
+	return NULL;
+}
+
+/* Return pointer to ALR cache entry matching MAC address */
+static struct lan9303_alr_cache_entry *lan9303_alr_cache_find_mac(
+	struct lan9303 *chip,
+	const u8 *mac_addr)
+{
+	int i;
+	struct lan9303_alr_cache_entry *entr = chip->alr_cache;
+
+	BUILD_BUG_ON_MSG(sizeof(struct lan9303_alr_cache_entry) & 1,
+			 "ether_addr_equal require u16 alignment");
+
+	for (i = 0; i < LAN9303_NUM_ALR_RECORDS; i++, entr++)
+		if (ether_addr_equal(entr->mac_addr, mac_addr))
+			return entr;
+	return NULL;
+}
+
+/* ALR: Actual register access functions */
+
+/* This function will wait a while until mask & reg == value */
+/* Otherwise, return timeout */
+static int lan9303_csr_reg_wait(struct lan9303 *chip, int regno,
+				int mask, char value)
+{
+	int i;
+
+	for (i = 0; i < 0x1000; i++) {
+		u32 reg;
+
+		lan9303_read_switch_reg(chip, regno, &reg);
+		if ((reg & mask) == value)
+			return 0;
+	}
+	return -ETIMEDOUT;
+}
+
+static int _lan9303_alr_make_entry_raw(struct lan9303 *chip, u32 dat0, u32 dat1)
+{
+	lan9303_write_switch_reg(
+		chip, LAN9303_SWE_ALR_WR_DAT_0, dat0);
+	lan9303_write_switch_reg(
+		chip, LAN9303_SWE_ALR_WR_DAT_1, dat1);
+	lan9303_write_switch_reg(
+		chip, LAN9303_SWE_ALR_CMD, ALR_CMD_MAKE_ENTRY);
+	lan9303_csr_reg_wait(
+		chip, LAN9303_SWE_ALR_CMD_STS, ALR_STS_MAKE_PEND, 0);
+	lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, 0);
+	return 0;
+}
+
+typedef void alr_loop_cb_t(
+	struct lan9303 *chip, u32 dat0, u32 dat1, int portmap, void *ctx);
+
+static void lan9303_alr_loop(struct lan9303 *chip, alr_loop_cb_t *cb, void *ctx)
+{
+	int i;
+
+	lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, ALR_CMD_GET_FIRST);
+	lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, 0);
+
+	for (i = 1; i < LAN9303_NUM_ALR_RECORDS; i++) {
+		u32 dat0, dat1;
+		int alrport, portmap;
+
+		lan9303_read_switch_reg(chip, LAN9303_SWE_ALR_RD_DAT_0, &dat0);
+		lan9303_read_switch_reg(chip, LAN9303_SWE_ALR_RD_DAT_1, &dat1);
+		if (dat1 & ALR_DAT1_END_OF_TABL)
+			break;
+
+		alrport
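The two lookup tables in this patch, alrport_2_portmap[] and portmap_2_alrport[], are meant to translate in opposite directions, so each should undo the other. A small userspace check of that property is sketched below; the table contents are copied from the patch, while the helper functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Table contents copied from the patch. */
static const int alrport_2_portmap[] = {1, 2, 4, 0, 3, 5, 6, 7};
static const int portmap_2_alrport[] = {3, 0, 1, 4, 2, 5, 6, 7};

/* Translate a 3-bit ALR port code into a port bitmap (bit N = port N). */
static int alrport_to_portmap(int alrport)
{
	assert(alrport >= 0 && alrport < 8);
	return alrport_2_portmap[alrport];
}

/* Translate a port bitmap back into the 3-bit ALR port code. */
static int portmap_to_alrport(int portmap)
{
	assert(portmap >= 0 && portmap < 8);
	return portmap_2_alrport[portmap];
}

/* Return 1 if the tables are mutual inverses over 0..7, else 0. */
static int alr_tables_are_inverse(void)
{
	int i;

	for (i = 0; i < 8; i++) {
		if (portmap_to_alrport(alrport_to_portmap(i)) != i)
			return 0;
		if (alrport_to_portmap(portmap_to_alrport(i)) != i)
			return 0;
	}
	return 1;
}
```

Judging only by the values, codes 0 to 2 select a single port and code 4 selects ports 0 and 1 together; the datasheet remains the authority on what each code means.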
[PATCH net-next v2 10/10] net: dsa: lan9303: Only allocate 3 ports
Saving 2628 bytes.

Signed-off-by: Egil Hjelmeland
Reviewed-by: Florian Fainelli
---
 drivers/net/dsa/lan9303-core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index dc95973d62ed..ad7a4c72e1fb 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -23,6 +23,8 @@
 
 #include "lan9303.h"
 
+#define LAN9303_NUM_PORTS 3
+
 /* 13.2 System Control and Status Registers
  * Multiply register number by 4 to get address offset.
  */
@@ -1361,7 +1363,7 @@ static struct dsa_switch_ops lan9303_switch_ops = {
 
 static int lan9303_register_switch(struct lan9303 *chip)
 {
-	chip->ds = dsa_switch_alloc(chip->dev, DSA_MAX_PORTS);
+	chip->ds = dsa_switch_alloc(chip->dev, LAN9303_NUM_PORTS);
 	if (!chip->ds)
 		return -ENOMEM;
 
--
2.11.0
[PATCH net-next v2 09/10] net: dsa: lan9303: Added Documentation/networking/dsa/lan9303.txt
Signed-off-by: Egil Hjelmeland
---
 Documentation/networking/dsa/lan9303.txt | 63
 1 file changed, 63 insertions(+)
 create mode 100644 Documentation/networking/dsa/lan9303.txt

diff --git a/Documentation/networking/dsa/lan9303.txt b/Documentation/networking/dsa/lan9303.txt
new file mode 100644
index 000000000000..ef5b3ca12a29
--- /dev/null
+++ b/Documentation/networking/dsa/lan9303.txt
@@ -0,0 +1,63 @@
+LAN9303 Ethernet switch driver
+==============================
+
+The LAN9303 is a three port 10/100 ethernet switch with integrated phys
+for the two external ethernet ports. The third port is an RMII/MII
+interface to a host master network interface (e.g. fixed link).
+
+
+Driver details
+==============
+
+The driver is implemented as a DSA driver, see
+Documentation/networking/dsa/dsa.txt.
+
+See Documentation/devicetree/bindings/net/dsa/lan9303.txt for device
+tree binding.
+
+The LAN9303 can be managed via both MDIO and I2C; both are supported by
+this driver.
+
+At startup the driver configures the device to provide two separate
+network interfaces (which is the default state of a DSA device).
+
+When both user ports are joined to the same bridge, the normal
+HW MAC learning is enabled. This means that unicast traffic is forwarded
+in HW. STP is also supported in this mode.
+
+If one of the user ports leaves the bridge,
+the ports go back to the initial separated operation.
+
+The driver implements the port_fdb_xxx/port_mdb_xxx methods.
+
+
+Sysfs nodes
+===========
+
+When a user port is enabled, the driver creates the sysfs directory
+/sys/class/net/xxx/lan9303 with the following files:
+
+ - swe_bcst_throt (RW): Set/get 6.4.7 Broadcast Storm Control
+   Throttle Level for the port. Accesses the corresponding bits of
+   the SWE_BCST_THROT register (13.4.3.23).
+
+
+Driver limitations
+==================
+
+ - No support for VLAN
+
+
+Bridging notes
+==============
+When the user ports are bridged, broadcasts, multicasts and unicast
+frames with unknown destination are flooded by the chip. Therefore SW
+flooding must be disabled by:
+
+  echo 0 > /sys/class/net/p1/brport/broadcast_flood
+  echo 0 > /sys/class/net/p1/brport/multicast_flood
+  echo 0 > /sys/class/net/p1/brport/unicast_flood
+  echo 0 > /sys/class/net/p2/brport/broadcast_flood
+  echo 0 > /sys/class/net/p2/brport/multicast_flood
+  echo 0 > /sys/class/net/p2/brport/unicast_flood
+
--
2.11.0
[PATCH net-next v2 07/10] net: dsa: lan9303: Added basic offloading of unicast traffic
When both user ports are joined to the same bridge, the normal HW MAC
learning is enabled. This means that unicast traffic is forwarded in HW.
Support for STP is also added.

If one of the user ports leaves the bridge, the ports go back to the
initial separated operation.

Added bridge methods port_bridge_join, port_bridge_leave and
port_stp_state_set.

Signed-off-by: Egil Hjelmeland
---
 drivers/net/dsa/lan9303-core.c | 115 ++---
 drivers/net/dsa/lan9303.h | 1 +
 2 files changed, 98 insertions(+), 18 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index b70acb73aad6..426a75bd89f4 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 
 #include "lan9303.h"
 
@@ -143,6 +144,7 @@
 # define LAN9303_SWE_PORT_STATE_FORWARDING_PORT0 (0)
 # define LAN9303_SWE_PORT_STATE_LEARNING_PORT0 BIT(1)
 # define LAN9303_SWE_PORT_STATE_BLOCKING_PORT0 BIT(0)
+# define LAN9303_SWE_PORT_STATE_DISABLED_PORT0 (3)
 #define LAN9303_SWE_PORT_MIRROR 0x1846
 # define LAN9303_SWE_PORT_MIRROR_SNIFF_ALL BIT(8)
 # define LAN9303_SWE_PORT_MIRROR_SNIFFER_PORT2 BIT(7)
@@ -515,11 +517,30 @@ static int lan9303_enable_packet_processing(struct lan9303 *chip,
 				LAN9303_MAC_TX_CFG_X_TX_ENABLE);
 }
 
+/* forward special tagged packets from port 0 to port 1 *or* port 2 */
+static int lan9303_setup_tagging(struct lan9303 *chip)
+{
+	int ret;
+	/* enable defining the destination port via special VLAN tagging
+	 * for port 0
+	 */
+	ret = lan9303_write_switch_reg(chip, LAN9303_SWE_INGRESS_PORT_TYPE,
+				       0x03);
+	if (ret)
+		return ret;
+
+	/* tag incoming packets at port 1 and 2 on their way to port 0 to be
+	 * able to discover their source port
+	 */
+	return lan9303_write_switch_reg(
+		chip, LAN9303_BM_EGRSS_PORT_TYPE,
+		LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT0);
+}
+
 /* We want a special working switch:
  * - do not forward packets between port 1 and 2
  * - forward everything from port 1 to port 0
  * - forward everything from port 2 to port 0
- * - forward special tagged packets from port 0 to port 1 *or* port 2
  */
 static int lan9303_separate_ports(struct lan9303 *chip)
 {
@@ -534,22 +555,6 @@ static int lan9303_separate_ports(struct lan9303 *chip)
 	if (ret)
 		return ret;
 
-	/* enable defining the destination port via special VLAN tagging
-	 * for port 0
-	 */
-	ret = lan9303_write_switch_reg(chip, LAN9303_SWE_INGRESS_PORT_TYPE,
-				       0x03);
-	if (ret)
-		return ret;
-
-	/* tag incoming packets at port 1 and 2 on their way to port 0 to be
-	 * able to discover their source port
-	 */
-	ret = lan9303_write_switch_reg(chip, LAN9303_BM_EGRSS_PORT_TYPE,
-			LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT0);
-	if (ret)
-		return ret;
-
 	/* prevent port 1 and 2 from forwarding packets by their own */
 	return lan9303_write_switch_reg(chip, LAN9303_SWE_PORT_STATE,
 				LAN9303_SWE_PORT_STATE_FORWARDING_PORT0 |
@@ -557,6 +562,12 @@ static int lan9303_separate_ports(struct lan9303 *chip)
 				LAN9303_SWE_PORT_STATE_BLOCKING_PORT2);
 }
 
+static void lan9303_bridge_ports(struct lan9303 *chip)
+{
+	/* ports bridged: remove mirroring */
+	lan9303_write_switch_reg(chip, LAN9303_SWE_PORT_MIRROR, 0);
+}
+
 static int lan9303_handle_reset(struct lan9303 *chip)
 {
 	if (!chip->reset_gpio)
@@ -707,6 +718,10 @@ static int lan9303_setup(struct dsa_switch *ds)
 		return -EINVAL;
 	}
 
+	ret = lan9303_setup_tagging(chip);
+	if (ret)
+		dev_err(chip->dev, "failed to setup port tagging %d\n", ret);
+
 	ret = lan9303_separate_ports(chip);
 	if (ret)
 		dev_err(chip->dev, "failed to separate ports %d\n", ret);
@@ -898,17 +913,81 @@ static void lan9303_port_disable(struct dsa_switch *ds, int port,
 	}
 }
 
+static int lan9303_port_bridge_join(struct dsa_switch *ds, int port,
+				    struct net_device *br)
+{
+	struct lan9303 *chip = ds->priv;
+
+	dev_dbg(chip->dev, "%s(port %d)\n", __func__, port);
+	if (ds->ports[1].bridge_dev == ds->ports[2].bridge_dev) {
+		lan9303_bridge_ports(chip);
+		chip->is_bridged = true; /* unleash stp_state_set() */
+	}
+
+	return 0;
+}
+
+static void lan9303_port_bridge_leave(struct dsa_switch *ds, int port,
+				      struct net_device *br)
+{
+	struct lan9303 *chip = ds->priv;
+
+	dev_dbg(chip->dev, "%s(port %d)\n", __func__, port);
+
[PATCH net-next v2 01/10] net: dsa: lan9303: Fixed MDIO interface
Fixes after testing on actual HW:

- lan9303_mdio_write()/_read() must multiply register number
  by 4 to get offset

- Indirect access (PMI) to phy register only work in I2C mode. In
  MDIO mode phy registers must be accessed directly. Introduced
  struct lan9303_phy_ops to handle the two modes. Renamed functions
  to clarify.

- lan9303_detect_phy_setup() : Failed MDIO read return 0xffff.
  Handle that.

Signed-off-by: Egil Hjelmeland
---
 drivers/net/dsa/lan9303-core.c | 42 +++---
 drivers/net/dsa/lan9303.h | 11 +++
 drivers/net/dsa/lan9303_i2c.c | 2 ++
 drivers/net/dsa/lan9303_mdio.c | 34 ++
 4 files changed, 74 insertions(+), 15 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index cd76e61f1fca..e622db586c3d 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -20,6 +20,9 @@
 
 #include "lan9303.h"
 
+/* 13.2 System Control and Status Registers
+ * Multiply register number by 4 to get address offset.
+ */
 #define LAN9303_CHIP_REV 0x14
 # define LAN9303_CHIP_ID 0x9303
 #define LAN9303_IRQ_CFG 0x15
@@ -53,6 +56,9 @@
 #define LAN9303_VIRT_PHY_BASE 0x70
 #define LAN9303_VIRT_SPECIAL_CTRL 0x77
 
+/*13.4 Switch Fabric Control and Status Registers
+ * Accessed indirectly via SWITCH_CSR_CMD, SWITCH_CSR_DATA.
+ */
 #define LAN9303_SW_DEV_ID 0x0000
 #define LAN9303_SW_RESET 0x0001
 #define LAN9303_SW_RESET_RESET BIT(0)
@@ -242,7 +248,7 @@ static int lan9303_virt_phy_reg_write(struct lan9303 *chip, int regnum, u16 val)
 	return regmap_write(chip->regmap, LAN9303_VIRT_PHY_BASE + regnum, val);
 }
 
-static int lan9303_port_phy_reg_wait_for_completion(struct lan9303 *chip)
+static int lan9303_indirect_phy_wait_for_completion(struct lan9303 *chip)
 {
 	int ret, i;
 	u32 reg;
@@ -262,7 +268,7 @@ static int lan9303_port_phy_reg_wait_for_completion(struct lan9303 *chip)
 	return -EIO;
 }
 
-static int lan9303_port_phy_reg_read(struct lan9303 *chip, int addr, int regnum)
+static int lan9303_indirect_phy_read(struct lan9303 *chip, int addr, int regnum)
 {
 	int ret;
 	u32 val;
@@ -272,7 +278,7 @@ static int lan9303_port_phy_reg_read(struct lan9303 *chip, int addr, int regnum)
 
 	mutex_lock(&chip->indirect_mutex);
 
-	ret = lan9303_port_phy_reg_wait_for_completion(chip);
+	ret = lan9303_indirect_phy_wait_for_completion(chip);
 	if (ret)
 		goto on_error;
 
@@ -281,7 +287,7 @@ static int lan9303_port_phy_reg_read(struct lan9303 *chip, int addr, int regnum)
 	if (ret)
 		goto on_error;
 
-	ret = lan9303_port_phy_reg_wait_for_completion(chip);
+	ret = lan9303_indirect_phy_wait_for_completion(chip);
 	if (ret)
 		goto on_error;
 
@@ -299,8 +305,8 @@ static int lan9303_port_phy_reg_read(struct lan9303 *chip, int addr, int regnum)
 	return ret;
 }
 
-static int lan9303_phy_reg_write(struct lan9303 *chip, int addr, int regnum,
-				 unsigned int val)
+static int lan9303_indirect_phy_write(struct lan9303 *chip, int addr,
+				      int regnum, u16 val)
 {
 	int ret;
 	u32 reg;
@@ -311,7 +317,7 @@ static int lan9303_phy_reg_write(struct lan9303 *chip, int addr, int regnum,
 
 	mutex_lock(&chip->indirect_mutex);
 
-	ret = lan9303_port_phy_reg_wait_for_completion(chip);
+	ret = lan9303_indirect_phy_wait_for_completion(chip);
 	if (ret)
 		goto on_error;
 
@@ -328,6 +334,11 @@ static int lan9303_phy_reg_write(struct lan9303 *chip, int addr, int regnum,
 	return ret;
 }
 
+const struct lan9303_phy_ops lan9303_indirect_phy_ops = {
+	.phy_read = lan9303_indirect_phy_read,
+	.phy_write = lan9303_indirect_phy_write,
+};
+
 static int lan9303_switch_wait_for_completion(struct lan9303 *chip)
 {
 	int ret, i;
@@ -427,14 +438,15 @@ static int lan9303_detect_phy_setup(struct lan9303 *chip)
 	 * Special reg 18 of phy 3 reads as 0x0000, if 'phy_addr_sel_strap' is 0
 	 * and the IDs are 0-1-2, else it contains something different from
 	 * 0x0000, which means 'phy_addr_sel_strap' is 1 and the IDs are 1-2-3.
+	 * 0xffff is returned for failed MDIO access.
 	 */
-	reg = lan9303_port_phy_reg_read(chip, 3, MII_LAN911X_SPECIAL_MODES);
+	reg = chip->ops->phy_read(chip, 3, MII_LAN911X_SPECIAL_MODES);
 	if (reg < 0) {
 		dev_err(chip->dev, "Failed to detect phy config: %d\n", reg);
 		return reg;
 	}
 
-	if (reg != 0)
+	if ((reg != 0) && (reg != 0xffff))
 		chip->phy_addr_sel_strap = 1;
 	else
 		chip->phy_addr_sel_strap = 0;
@@ -719,7 +731,7 @@ static int lan9303_phy_read(struct dsa_switch *ds, int phy, int regnum)
 	if (phy > phy_base + 2)
 		return -ENODEV;
 
-	return
[PATCH net-next v2 03/10] net: dsa: lan9303: Refactor lan9303_enable_packet_processing()
lan9303_enable_packet_processing() and
lan9303_disable_packet_processing(): pass the port number (0, 1, 2) as
parameter instead of the port offset.

Simplify accordingly.

Signed-off-by: Egil Hjelmeland
---
 drivers/net/dsa/lan9303-core.c | 66 --
 1 file changed, 32 insertions(+), 34 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index c2b53659f58f..0806a0684d55 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -159,9 +159,7 @@
 # define LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT1 (BIT(9) | BIT(8))
 # define LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT0 (BIT(1) | BIT(0))
 
-#define LAN9303_PORT_0_OFFSET 0x400
-#define LAN9303_PORT_1_OFFSET 0x800
-#define LAN9303_PORT_2_OFFSET 0xc00
+#define LAN9303_SWITCH_PORT_REG(port, reg0) (0x400 * (port) + (reg0))
 
 /* the built-in PHYs are of type LAN911X */
 #define MII_LAN911X_SPECIAL_MODES 0x12
@@ -457,24 +455,25 @@ static int lan9303_detect_phy_setup(struct lan9303 *chip)
 	return 0;
 }
 
-#define LAN9303_MAC_RX_CFG_OFFS (LAN9303_MAC_RX_CFG_0 - LAN9303_PORT_0_OFFSET)
-#define LAN9303_MAC_TX_CFG_OFFS (LAN9303_MAC_TX_CFG_0 - LAN9303_PORT_0_OFFSET)
-
 static int lan9303_disable_packet_processing(struct lan9303 *chip,
 					     unsigned int port)
 {
 	int ret;
 
 	/* disable RX, but keep register reset default values else */
-	ret = lan9303_write_switch_reg(chip, LAN9303_MAC_RX_CFG_OFFS + port,
-				       LAN9303_MAC_RX_CFG_X_REJECT_MAC_TYPES);
+	ret = lan9303_write_switch_reg(
+		chip,
+		LAN9303_SWITCH_PORT_REG(port, LAN9303_MAC_RX_CFG_0),
+		LAN9303_MAC_RX_CFG_X_REJECT_MAC_TYPES);
 	if (ret)
 		return ret;
 
 	/* disable TX, but keep register reset default values else */
-	return lan9303_write_switch_reg(chip, LAN9303_MAC_TX_CFG_OFFS + port,
-				LAN9303_MAC_TX_CFG_X_TX_IFG_CONFIG_DEFAULT |
-				LAN9303_MAC_TX_CFG_X_TX_PAD_ENABLE);
+	return lan9303_write_switch_reg(
+		chip,
+		LAN9303_SWITCH_PORT_REG(port, LAN9303_MAC_TX_CFG_0),
+		LAN9303_MAC_TX_CFG_X_TX_IFG_CONFIG_DEFAULT |
+		LAN9303_MAC_TX_CFG_X_TX_PAD_ENABLE);
 }
 
 static int lan9303_enable_packet_processing(struct lan9303 *chip,
@@ -483,17 +482,21 @@ static int lan9303_enable_packet_processing(struct lan9303 *chip,
 	int ret;
 
 	/* enable RX and keep register reset default values else */
-	ret = lan9303_write_switch_reg(chip, LAN9303_MAC_RX_CFG_OFFS + port,
-				       LAN9303_MAC_RX_CFG_X_REJECT_MAC_TYPES |
-				       LAN9303_MAC_RX_CFG_X_RX_ENABLE);
+	ret = lan9303_write_switch_reg(
+		chip,
+		LAN9303_SWITCH_PORT_REG(port, LAN9303_MAC_RX_CFG_0),
+		LAN9303_MAC_RX_CFG_X_REJECT_MAC_TYPES |
+		LAN9303_MAC_RX_CFG_X_RX_ENABLE);
 	if (ret)
 		return ret;
 
 	/* enable TX and keep register reset default values else */
-	return lan9303_write_switch_reg(chip, LAN9303_MAC_TX_CFG_OFFS + port,
-				LAN9303_MAC_TX_CFG_X_TX_IFG_CONFIG_DEFAULT |
-				LAN9303_MAC_TX_CFG_X_TX_PAD_ENABLE |
-				LAN9303_MAC_TX_CFG_X_TX_ENABLE);
+	return lan9303_write_switch_reg(
+		chip,
+		LAN9303_SWITCH_PORT_REG(port, LAN9303_MAC_TX_CFG_0),
+		LAN9303_MAC_TX_CFG_X_TX_IFG_CONFIG_DEFAULT |
+		LAN9303_MAC_TX_CFG_X_TX_PAD_ENABLE |
+		LAN9303_MAC_TX_CFG_X_TX_ENABLE);
 }
 
 /* We want a special working switch:
@@ -555,12 +558,14 @@ static int lan9303_handle_reset(struct lan9303 *chip)
 /* stop processing packets for all ports */
 static int lan9303_disable_processing(struct lan9303 *chip)
 {
-	int ret;
+	int ret, p;
 
-	ret = lan9303_disable_packet_processing(chip, LAN9303_PORT_1_OFFSET);
-	if (ret)
-		return ret;
-	return lan9303_disable_packet_processing(chip, LAN9303_PORT_2_OFFSET);
+	for (p = 1; p <= 2; p++) {
+		ret = lan9303_disable_packet_processing(chip, p);
+		if (ret)
+			return ret;
+	}
+	return 0;
 }
 
 static int lan9303_check_device(struct lan9303 *chip)
@@ -696,7 +701,7 @@ static void lan9303_get_ethtool_stats(struct dsa_switch *ds, int port,
 	unsigned int u, poff;
 	int ret;
 
-	poff = port * 0x400;
+	poff = LAN9303_SWITCH_PORT_REG(port, 0);
 
 	for (u = 0; u < ARRAY_SIZE(lan9303_mib); u++) {
 		ret = lan9303_read_switch_reg(chip,
@@ -749,11
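The refactoring in this patch rests on the old and new address computations agreeing. A quick userspace check of that arithmetic is sketched below. The macro is copied from the patch; the value 0x0401 for LAN9303_MAC_RX_CFG_0 is an assumption, inferred from LAN9303_MAC_RX_CFG_2 being 0x0c01 elsewhere in the series, and the two helper functions are hypothetical.

```c
#include <assert.h>

/* Copied from the patch. */
#define LAN9303_SWITCH_PORT_REG(port, reg0) (0x400 * (port) + (reg0))

/* Old per-port offsets removed by the patch. */
#define OLD_PORT_0_OFFSET 0x400
#define OLD_PORT_1_OFFSET 0x800
#define OLD_PORT_2_OFFSET 0xc00

/* Assumed value, inferred from LAN9303_MAC_RX_CFG_2 == 0x0c01. */
#define MAC_RX_CFG_0 0x0401

/* New scheme: callers pass the port number 0, 1 or 2. */
static int new_rx_cfg_reg(int port)
{
	assert(port >= 0 && port <= 2);
	return LAN9303_SWITCH_PORT_REG(port, MAC_RX_CFG_0);
}

/* Old scheme: callers passed one of the OLD_PORT_x_OFFSET values. */
static int old_rx_cfg_reg(int port_offset)
{
	return (MAC_RX_CFG_0 - OLD_PORT_0_OFFSET) + port_offset;
}
```

With the port number as the parameter, lan9303_disable_processing() can loop over ports 1 and 2 instead of spelling out each offset constant.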
[PATCH net-next v2 04/10] net: dsa: lan9303: Added adjust_link() method
This makes the driver react to a device tree "fixed-link" declaration on
the CPU port.

- turn off autonegotiation
- force speed 10 or 100 Mb/s
- force duplex mode

Signed-off-by: Egil Hjelmeland
---
 drivers/net/dsa/lan9303-core.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 0806a0684d55..be6d78f45a5f 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include

 #include "lan9303.h"

@@ -746,6 +747,37 @@ static int lan9303_phy_write(struct dsa_switch *ds, int phy, int regnum,
 	return chip->ops->phy_write(chip, phy, regnum, val);
 }

+static void lan9303_adjust_link(struct dsa_switch *ds, int port,
+				struct phy_device *phydev)
+{
+	struct lan9303 *chip = ds->priv;
+
+	int ctl, res;
+
+	ctl = lan9303_phy_read(ds, port, MII_BMCR);
+
+	if (!phy_is_pseudo_fixed_link(phydev))
+		return;
+
+	ctl &= ~BMCR_ANENABLE;
+	if (phydev->speed == SPEED_100)
+		ctl |= BMCR_SPEED100;
+
+	if (phydev->duplex == DUPLEX_FULL)
+		ctl |= BMCR_FULLDPLX;
+
+	res = lan9303_phy_write(ds, port, MII_BMCR, ctl);
+
+	if (port == chip->phy_addr_sel_strap) {
+		/* Virtual Phy: Remove Turbo 200Mbit mode */
+		lan9303_read(chip->regmap, LAN9303_VIRT_SPECIAL_CTRL, &ctl);
+
+		ctl &= ~(1 << 10); // TURBO BIT
+		res = regmap_write(chip->regmap,
+				   LAN9303_VIRT_SPECIAL_CTRL, ctl);
+	}
+}
+
 static int lan9303_port_enable(struct dsa_switch *ds, int port,
 			       struct phy_device *phy)
 {
@@ -789,6 +821,7 @@ static struct dsa_switch_ops lan9303_switch_ops = {
 	.get_strings = lan9303_get_strings,
 	.phy_read = lan9303_phy_read,
 	.phy_write = lan9303_phy_write,
+	.adjust_link = lan9303_adjust_link,
 	.get_ethtool_stats = lan9303_get_ethtool_stats,
 	.get_sset_count = lan9303_get_sset_count,
 	.port_enable = lan9303_port_enable,
--
2.11.0
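The BMCR handling in lan9303_adjust_link() amounts to a few bit operations.
The sketch below is a userspace illustration, not the driver's code. The
EX_BMCR_* constants mirror the standard MII basic mode control register bits,
and example_force_link() is a hypothetical helper. It deliberately differs
from the patch in one place, marked in the comments: it clears the old speed
and duplex bits before setting the new ones, whereas the patch only ever sets
them.

```c
#include <assert.h>
#include <stdint.h>

/* Standard MII basic mode control register (BMCR) bits. */
#define EX_BMCR_FULLDPLX 0x0100	/* full duplex */
#define EX_BMCR_ANENABLE 0x1000	/* autonegotiation enable */
#define EX_BMCR_SPEED100 0x2000	/* select 100 Mb/s */

/* Compute the BMCR value for a forced (fixed) link. */
static uint16_t example_force_link(uint16_t bmcr, int speed_mbps, int full_duplex)
{
	assert(speed_mbps == 10 || speed_mbps == 100);

	/* Turn off autonegotiation, as the patch does. */
	bmcr &= ~EX_BMCR_ANENABLE;

	/* Assumption beyond the patch: drop stale speed/duplex bits first. */
	bmcr &= ~(EX_BMCR_SPEED100 | EX_BMCR_FULLDPLX);

	if (speed_mbps == 100)
		bmcr |= EX_BMCR_SPEED100;
	if (full_duplex)
		bmcr |= EX_BMCR_FULLDPLX;

	return bmcr;
}
```

Whether the extra clearing step matters on the LAN9303 depends on the
register's state when adjust_link() runs, which this thread does not settle.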
[PATCH net-next v2 00/10] net: dsa: lan9303: unicast offload, fdb,mdb,STP
This series extends the LAN9303 3 port switch DSA driver. Highlights:

- Make the MDIO interface work
- Bridging: Unicast offload
- Bridging: Added fdb/mdb handling
- Bridging: STP support
- Documentation

Changes v1 -> v2:
- sorted out emailing issues, threading and date. And sent from private
  account in order to avoid company disclaimer in emails.
- Removed the three last "work around" patches. But first moved one doc
  paragraph to the document patch.

Egil Hjelmeland (10):
  net: dsa: lan9303: Fixed MDIO interface
  net: dsa: lan9303: Do not disable/enable switch fabric port 0 at startup
  net: dsa: lan9303: Refactor lan9303_enable_packet_processing()
  net: dsa: lan9303: Added adjust_link() method
  net: dsa: added dsa_net_device_to_dsa_port()
  net: dsa: lan9303: added sysfs node swe_bcst_throt
  net: dsa: lan9303: Added basic offloading of unicast traffic
  net: dsa: lan9303: Added ALR/fdb/mdb handling
  net: dsa: lan9303: Added Documentation/networking/dsa/lan9303.txt
  net: dsa: lan9303: Only allocate 3 ports

 Documentation/networking/dsa/lan9303.txt |  63 +++
 drivers/net/dsa/lan9303-core.c           | 709 ---
 drivers/net/dsa/lan9303.h                |  23 +
 drivers/net/dsa/lan9303_i2c.c            |   2 +
 drivers/net/dsa/lan9303_mdio.c           |  34 ++
 include/net/dsa.h                        |   1 +
 net/dsa/slave.c                          |  10 +
 7 files changed, 772 insertions(+), 70 deletions(-)
 create mode 100644 Documentation/networking/dsa/lan9303.txt

--
2.11.0
[PATCH 12/12] ima: added Documentation/security/IMA-digest-lists.txt
This patch adds the documentation of the new IMA feature, to load
and measure file digest lists.

Signed-off-by: Roberto Sassu
---
 Documentation/security/IMA-digest-lists.txt | 150
 1 file changed, 150 insertions(+)
 create mode 100644 Documentation/security/IMA-digest-lists.txt

diff --git a/Documentation/security/IMA-digest-lists.txt b/Documentation/security/IMA-digest-lists.txt
new file mode 100644
index 000..f9eed21
--- /dev/null
+++ b/Documentation/security/IMA-digest-lists.txt
@@ -0,0 +1,150 @@
+File Digest Lists
+
+	INTRODUCTION
+
+IMA, for each file matching policy rules, calculates a digest, creates
+a new entry in the measurement list and extends a TPM PCR with the digest
+of entry data. The last step causes a noticeable performance reduction.
+
+Since systems likely access the same files, repeating the above tasks at
+every boot can be avoided by replacing individual measurements of likely
+accessed files with only one measurement of their digests: the advantage
+is that the system performance significantly improves due to less PCR
+extend operations; on the other hand, the information about which files
+have exactly been accessed and in which sequence is lost.
+
+If this new measurement reports only good digests (e.g. those of
+files included in a Linux distribution), and if verifiers only check
+that a system executed good software and didn't access malicious data,
+the disadvantages reported earlier would be acceptable.
+
+The Trusted Computing paradigm measure & load is still respected by IMA
+with the proposed optimization. If a file being accessed is not in a
+measured digest list, a measurement will be recorded as before. If it is,
+the list has already been measured, and the verifier must assume that
+files with digest in the list have been accessed.
+
+Measuring digest lists gives the following benefits:
+
+- boot time reduction
+  For a minimal Linux installation with 1400 measurements, the boot time
+  decreases from 1 minute 30 seconds to 15 seconds, after loading to IMA
+  the digest of all files packaged by the distribution (32000). The new
+  list contains 92 entries. Without IMA, the boot time is 8.5 seconds.
+
+- lower network and CPU requirements for remote attestation
+  With the IMA optimization, both the measurement and digest lists
+  must be verified for a complete evaluation. However, since the lists
+  are fixed, they could be sent to and checked by the verifier only once.
+  Then, during a remote attestation, the only remaining task is to verify
+  the short measurement list.
+
+- signature-based remote attestation
+  Digest list signature can be used as a proof of the provenance for the
+  files whose digest is in the list. Then, if verifiers trust the signer
+  and only check provenance, remote attestation verification would simply
+  consist on checking digest lists signatures and that the measurement
+  list only contain list metadata digests (reference measurement databases
+  would be no longer required). An example of a signed digest list,
+  that can be parsed with this patch set, is the RPM package header.
+
+Digest lists are loaded in two stages by IMA through the new securityfs
+interface called 'digest_lists'. Users supply metadata, for the digest
+lists they want to load: path, format, digest, signature and algorithm
+of the digest.
+
+Then, after the metadata digest is added to the measurement list, IMA
+reads the digest lists at the path specified and loads the digests in
+a hash table (digest lists are not measured, since their digest is already
+included in the metadata). With metadata measurement instead of digest list
+measurement, it is possible to avoid a performance reduction that would
+occur by measuring many digest lists (e.g. RPM headers) individually.
+If, alternatively, digest lists are loaded together, their signature
+cannot be verified.
+
+Lastly, when a file is accessed, IMA searches the calculated digest in
+the hash table. Only if the digest is not found a new entry is added
+to the measurement list.
+
+
+
+	FORMAT
+
+The format of digest list metadata is:
+
+algo[2] digest_len[4] digest[digest_len]
+signature_len[4] signature[signature_len]
+path_len[4] path[path_len]
+ref_id_len[4] ref_id[ref_id_len]
+list_type_len[4] list_type[list_type_len]
+
+algo, list_type and _len are little endian.
+
+
+algo values are defined in include/uapi/linux/hash_info.h. The algorithms
+in the list metadata must be the same of ima_hash_algo (algorithm used
+by IMA to calculate the file digest).
+
+list type values:
+
+0: compact digest list
+1: RPM package header
+
+
+The format of the compact digest list is:
+
+entry_id[2] count[4] data_len[4]
+data[data_len]
+[...]
+entry_id[2] count[4] data_len[4]
+data[data_len]
+
+entry_id, count and data_len are little endian.
+
+At the moment, entry_id can have value 0, which
[PATCH 11/12] ima: don't report measurements if digests are included in the loaded lists
Don't report measurements if the file digest has been included in an
uploaded digest list.

The advantage of this solution is that the boot time overhead, when
a TPM is available, is very small because a PCR is extended only
for unknown files. The disadvantage is that verifiers no longer know
which files are accessed and when (they must assume the worst case,
i.e. that all files have been accessed).

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima_main.c | 8
 1 file changed, 8 insertions(+)

diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index c329549..e289b7c 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -253,6 +253,14 @@ static int process_measurement(struct file *file, char *buf, loff_t size,
 		goto out_digsig;
 	}

+	if (!ima_disable_digest_check) {
+		if (ima_lookup_loaded_digest(iint->ima_hash->digest)) {
+			action ^= IMA_MEASURE;
+			iint->flags |= IMA_MEASURED;
+			iint->measured_pcrs |= (0x1 << pcr);
+		}
+	}
+
 	if (!pathbuf)	/* ima_rdwr_violation possibly pre-fetched */
 		pathname = ima_d_path(&file->f_path, &pathbuf, filename);

--
2.9.3
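The hunk above uses "action ^= IMA_MEASURE" to drop the measure action. XOR
toggles the bit, so the result depends on whether the bit was set beforehand.
The sketch below contrasts toggling with clearing. It is an illustration
only: the EX_* flag values and both helper names are hypothetical, and
whether action can reach that point without IMA_MEASURE set depends on code
not shown in this thread.

```c
#include <assert.h>

/* Assumed flag values, for illustration only. */
#define EX_IMA_MEASURE  0x1u
#define EX_IMA_APPRAISE 0x4u

/* Toggle the measure bit: clears it if set, sets it if clear. */
static unsigned int ex_toggle_measure(unsigned int action)
{
	assert((action & ~(EX_IMA_MEASURE | EX_IMA_APPRAISE)) == 0);
	return action ^ EX_IMA_MEASURE;
}

/* Clear the measure bit regardless of its previous state. */
static unsigned int ex_clear_measure(unsigned int action)
{
	assert((action & ~(EX_IMA_MEASURE | EX_IMA_APPRAISE)) == 0);
	return action & ~EX_IMA_MEASURE;
}
```

Both helpers agree when the bit is set on entry. They differ when it is
clear: the toggle variant turns the measure action on, the clear variant
leaves it off.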
[PATCH 10/12] ima: disable digest lookup if digest lists are not measured
Loading digest lists affects the behavior of IMA, as files whose digest
has been uploaded will not be displayed in the measurement list. If the
digest list loading event is not reported, verifiers would believe that
the files with uploaded digests have not been accessed.

To prevent this, the DIGEST_LIST_CHECK hook has been defined and a new
rule to measure files accessed by the new hook has been added to the
default policy. If the currently loaded policy does not contain that
rule, digest lookup is disabled.

Digest lookup is also disabled if CONFIG_IMA_DIGEST_LIST is not defined.

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima.h        |  1 +
 security/integrity/ima/ima_main.c   | 15 ++-
 security/integrity/ima/ima_policy.c |  1 +
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 77dd4d0..2a558ee 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -199,6 +199,7 @@ static inline unsigned long ima_hash_key(u8 *digest)
 	hook(KEXEC_KERNEL_CHECK)	\
 	hook(KEXEC_INITRAMFS_CHECK)	\
 	hook(POLICY_CHECK)		\
+	hook(DIGEST_LIST_CHECK)		\
 	hook(MAX_CHECK)
 #define __ima_hook_enumify(ENUM)	ENUM,

diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 2aebb79..c329549 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -29,6 +29,12 @@

 int ima_initialized;

+#ifdef CONFIG_IMA_DIGEST_LIST
+static int ima_disable_digest_check;
+#else
+static int ima_disable_digest_check = 1;
+#endif
+
 #ifdef CONFIG_IMA_APPRAISE
 int ima_appraise = IMA_APPRAISE_ENFORCE;
 #else
@@ -171,6 +177,9 @@ static int process_measurement(struct file *file, char *buf, loff_t size,
 	bool violation_check;
 	enum hash_algo hash_algo;

+	if (func == DIGEST_LIST_CHECK && !ima_policy_flag)
+		ima_disable_digest_check = 1;
+
 	if (!ima_policy_flag || !S_ISREG(inode->i_mode))
 		return 0;

@@ -181,6 +190,9 @@ static int process_measurement(struct file *file, char *buf, loff_t size,
 	action = ima_get_action(inode, mask, func, &pcr);
 	violation_check = ((func == FILE_CHECK || func == MMAP_CHECK) &&
 			   (ima_policy_flag & IMA_MEASURE));
+	if (func == DIGEST_LIST_CHECK && !(action & IMA_MEASURE))
+		ima_disable_digest_check = 1;
+
 	if (!action && !violation_check)
 		return 0;

@@ -375,7 +387,8 @@ static int read_idmap[READING_MAX_ID] = {
 	[READING_MODULE] = MODULE_CHECK,
 	[READING_KEXEC_IMAGE] = KEXEC_KERNEL_CHECK,
 	[READING_KEXEC_INITRAMFS] = KEXEC_INITRAMFS_CHECK,
-	[READING_POLICY] = POLICY_CHECK
+	[READING_POLICY] = POLICY_CHECK,
+	[READING_DIGEST_LIST] = DIGEST_LIST_CHECK
 };

 /**
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 95209a5..b5c004d 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -127,6 +127,7 @@ static struct ima_rule_entry default_measurement_rules[] __ro_after_init = {
 	{.action = MEASURE, .func = MODULE_CHECK, .flags = IMA_FUNC},
 	{.action = MEASURE, .func = FIRMWARE_CHECK, .flags = IMA_FUNC},
 	{.action = MEASURE, .func = POLICY_CHECK, .flags = IMA_FUNC},
+	{.action = MEASURE, .func = DIGEST_LIST_CHECK, .flags = IMA_FUNC},
 };

 static struct ima_rule_entry default_appraise_rules[] __ro_after_init = {
--
2.9.3
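The hook list that this patch extends is built with the X-macro pattern: one
list macro is expanded several times with different per-entry macros. The
following is a minimal standalone sketch of that pattern, not the IMA header.
All ex_* names are hypothetical and the hook list is shortened.

```c
#include <assert.h>
#include <string.h>

/* One list, expanded below with different per-entry macros. */
#define ex_hooks(hook)		\
	hook(NONE)		\
	hook(POLICY_CHECK)	\
	hook(DIGEST_LIST_CHECK)	\
	hook(MAX_CHECK)

#define ex_hook_enumify(ENUM)	ENUM,
#define ex_hook_stringify(ENUM)	#ENUM,

/* Expands to: enum ex_hook_ids { NONE, POLICY_CHECK, ... }; */
enum ex_hook_ids { ex_hooks(ex_hook_enumify) };

/* Expands to: { "NONE", "POLICY_CHECK", ... }; */
static const char *const ex_hook_names[] = { ex_hooks(ex_hook_stringify) };

/* Map an enum value back to its name. */
static const char *ex_hook_name(enum ex_hook_ids id)
{
	assert(id <= MAX_CHECK);
	return ex_hook_names[id];
}
```

Adding one hook(...) line to the list updates the enum and the name table
together, which is why the patch only needs a single added line in ima.h to
introduce DIGEST_LIST_CHECK.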
[PATCH 09/12] ima: introduce securityfs interfaces for digest lists
This patch introduces the file 'digest_lists' in the securityfs
filesystem, to load digest list metadata. IMA parses the metadata
and loads the digest lists from the path provided.

It also introduces 'digests_count', to show the number of digests
stored in the digest hash table.

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima_fs.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index ad3d674..08174c1 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -34,11 +34,15 @@ static struct dentry *ascii_runtime_measurements;
 static struct dentry *runtime_measurements_count;
 static struct dentry *violations;
 static struct dentry *ima_policy;
+static struct dentry *digest_lists;
+static struct dentry *digests_count;

 static enum kernel_read_file_id ima_get_file_id(struct dentry *dentry)
 {
 	if (dentry == ima_policy)
 		return READING_POLICY;
+	else if (dentry == digest_lists)
+		return READING_DIGEST_LIST;

 	return READING_UNKNOWN;
 }
@@ -66,6 +70,8 @@ static ssize_t ima_show_htable_value(struct file *filp, char __user *buf,
 		val = &ima_htable.violations;
 	else if (filp->f_path.dentry == runtime_measurements_count)
 		val = &ima_htable.len;
+	else if (filp->f_path.dentry == digests_count)
+		val = &ima_digests_htable.len;

 	len = scnprintf(tmpbuf, TMPBUFLEN, "%li\n", atomic_long_read(val));
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, len);
@@ -301,6 +307,9 @@ static ssize_t ima_read_file(char *path, enum kernel_read_file_id file_id)

 			pr_debug("rule: %s\n", p);
 			rc = ima_parse_add_rule(p);
+		} else if (file_id == READING_DIGEST_LIST) {
+			rc = ima_parse_digest_list_metadata(size, datap);
+			datap += rc;
 		}
 		if (rc < 0)
 			break;
@@ -510,8 +519,22 @@ int __init ima_fs_init(void)
 	if (IS_ERR(ima_policy))
 		goto out;

+#ifdef CONFIG_IMA_DIGEST_LIST
+	digest_lists = securityfs_create_file("digest_lists", S_IWUSR, ima_dir,
+					      NULL, &ima_data_upload_ops);
+	if (IS_ERR(digest_lists))
+		goto out;
+
+	digests_count = securityfs_create_file("digests_count",
+					       S_IRUSR | S_IRGRP, ima_dir,
+					       NULL, &ima_htable_value_ops);
+	if (IS_ERR(digests_count))
+		goto out;
+#endif
 	return 0;
 out:
+	securityfs_remove(digests_count);
+	securityfs_remove(digest_lists);
 	securityfs_remove(violations);
 	securityfs_remove(runtime_measurements_count);
 	securityfs_remove(ascii_runtime_measurements);
--
2.9.3
[PATCH 06/12] ima: added parser of digest lists metadata
Userspace applications will be able to load digest lists by supplying
their metadata.

Digest list metadata are:

- DATA_ALGO: algorithm of the digests to be uploaded
- DATA_DIGEST: digest of the file containing the digest list
- DATA_SIGNATURE: signature of the file containing the digest list
- DATA_FILE_PATH: pathname
- DATA_REF_ID: reference ID of the digest list
- DATA_TYPE: type of digest list

The new function ima_parse_digest_list_metadata() parses the metadata
and loads each file individually. Then, it parses the data according
to the data type specified.

Since digest lists are measured, their digest is added to the hash table
so that IMA does not create a measurement entry for them (which would
affect the performance). The only measurement entry created will be
for the metadata.

Signed-off-by: Roberto Sassu
---
 include/linux/fs.h                       |   1 +
 security/integrity/ima/Kconfig           |  11
 security/integrity/ima/Makefile          |   1 +
 security/integrity/ima/ima.h             |   8 +++
 security/integrity/ima/ima_digest_list.c | 105 +++
 5 files changed, 126 insertions(+)
 create mode 100644 security/integrity/ima/ima_digest_list.c

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e1fd5d..2eb6e7c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2751,6 +2751,7 @@ extern int do_pipe_flags(int *, int);
 	id(KEXEC_IMAGE, kexec-image)		\
 	id(KEXEC_INITRAMFS, kexec-initramfs)	\
 	id(POLICY, security-policy)		\
+	id(DIGEST_LIST, security-digest-list)	\
 	id(MAX_ID, )

 #define __fid_enumify(ENUM, dummy) READING_ ## ENUM,
diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index 35ef693..8965dcc 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -227,3 +227,14 @@ config IMA_APPRAISE_SIGNED_INIT
 	default n
 	help
 	   This option requires user-space init to be signed.
+
+config IMA_DIGEST_LIST
+	bool "Measure files depending on uploaded digest lists"
+	depends on IMA
+	default n
+	help
+	   This option allows users to load digest lists. If a measured
+	   file has the same digest of one from loaded lists, IMA will
+	   not create a new measurement entry. A measurement entry will
+	   be created only when digest lists are loaded (this entry
+	   contains the digest of digest lists metadata).
diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
index 29f198b..00dbe3a 100644
--- a/security/integrity/ima/Makefile
+++ b/security/integrity/ima/Makefile
@@ -9,4 +9,5 @@ ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
 	 ima_policy.o ima_template.o ima_template_lib.o
 ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
 ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o
+ima-$(CONFIG_IMA_DIGEST_LIST) += ima_digest_list.o
 obj-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index a0c6808..77dd4d0 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -157,6 +157,14 @@ int ima_restore_measurement_entry(struct ima_template_entry *entry);
 int ima_restore_measurement_list(loff_t bufsize, void *buf);
 struct ima_digest *ima_lookup_loaded_digest(u8 *digest);
 int ima_add_digest_data_entry(u8 *digest);
+#ifdef CONFIG_IMA_DIGEST_LIST
+ssize_t ima_parse_digest_list_metadata(loff_t size, void *buf);
+#else
+static inline ssize_t ima_parse_digest_list_metadata(loff_t size, void *buf)
+{
+	return -ENOTSUP;
+}
+#endif
 int ima_measurements_show(struct seq_file *m, void *v);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
diff --git a/security/integrity/ima/ima_digest_list.c b/security/integrity/ima/ima_digest_list.c
new file mode 100644
index 000..3e1ff69b
--- /dev/null
+++ b/security/integrity/ima/ima_digest_list.c
@@ -0,0 +1,105 @@
+/*
+ * Copyright (C) 2017 Huawei Technologies Co. Ltd.
+ *
+ * Author: Roberto Sassu
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ *
+ * File: ima_digest_list.c
+ *      Functions to manage digest lists.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+
+#include "ima.h"
+#include "ima_template_lib.h"
+
+enum digest_metadata_fields {DATA_ALGO, DATA_DIGEST, DATA_SIGNATURE,
+			     DATA_FILE_PATH, DATA_REF_ID, DATA_TYPE,
+			     DATA__LAST};
+
+static int ima_parse_digest_list_data(struct ima_field_data *data)
+{
+	void *digest_list;
+	loff_t digest_list_size;
+	u16 data_algo = le16_to_cpu(*(u16 *)data[DATA_ALGO].data);
+	u16 data_type =
[PATCH 08/12] ima: added parser for RPM data type
This patch introduces a parser for RPM packages. It extracts the digests
from the RPMTAG_FILEDIGESTS header section and converts them to binary data
before adding them to the hash table.

The advantage of this data type is that verifiers can determine who
produced that data, as headers are signed by Linux distributions vendors.
RPM headers signatures can be provided as digest list metadata.

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima_digest_list.c | 84 +++-
 1 file changed, 83 insertions(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima_digest_list.c b/security/integrity/ima/ima_digest_list.c
index c1ef79a..11ee77e 100644
--- a/security/integrity/ima/ima_digest_list.c
+++ b/security/integrity/ima/ima_digest_list.c
@@ -19,11 +19,13 @@
 #include "ima.h"
 #include "ima_template_lib.h"

+#define RPMTAG_FILEDIGESTS 1035
+
 enum digest_metadata_fields {DATA_ALGO, DATA_DIGEST, DATA_SIGNATURE,
 			     DATA_FILE_PATH, DATA_REF_ID, DATA_TYPE,
 			     DATA__LAST};

-enum digest_data_types {DATA_TYPE_COMPACT_LIST};
+enum digest_data_types {DATA_TYPE_COMPACT_LIST, DATA_TYPE_RPM};

 enum compact_list_entry_ids {COMPACT_LIST_ID_DIGEST};

@@ -33,6 +35,20 @@ struct compact_list_hdr {
 	u32 datalen;
 } __packed;

+struct rpm_hdr {
+	u32 magic;
+	u32 reserved;
+	u32 tags;
+	u32 datasize;
+} __packed;
+
+struct rpm_entryinfo {
+	int32_t tag;
+	u32 type;
+	int32_t offset;
+	u32 count;
+} __packed;
+
 static int ima_parse_compact_list(loff_t size, void *buf)
 {
 	void *bufp = buf, *bufendp = buf + size;
@@ -80,6 +96,69 @@ static int ima_parse_compact_list(loff_t size, void *buf)
 	return 0;
 }

+static int ima_parse_rpm(loff_t size, void *buf)
+{
+	void *bufp = buf, *bufendp = buf + size;
+	struct rpm_hdr *hdr = bufp;
+	u32 tags = be32_to_cpu(hdr->tags);
+	struct rpm_entryinfo *entry;
+	void *datap = bufp + sizeof(*hdr) + tags * sizeof(struct rpm_entryinfo);
+	int digest_len = hash_digest_size[ima_hash_algo];
+	u8 digest[digest_len];
+	int ret, i, j;
+
+	const unsigned char rpm_header_magic[8] = {
+		0x8e, 0xad, 0xe8, 0x01, 0x00, 0x00, 0x00, 0x00
+	};
+
+	if (size < sizeof(*hdr)) {
+		pr_err("Missing RPM header\n");
+		return -EINVAL;
+	}
+
+	if (memcmp(bufp, rpm_header_magic, sizeof(rpm_header_magic))) {
+		pr_err("Invalid RPM header\n");
+		return -EINVAL;
+	}
+
+	bufp += sizeof(*hdr);
+
+	for (i = 0; i < tags && (bufp + sizeof(*entry)) <= bufendp;
+	     i++, bufp += sizeof(*entry)) {
+		entry = bufp;
+
+		if (be32_to_cpu(entry->tag) != RPMTAG_FILEDIGESTS)
+			continue;
+
+		datap += be32_to_cpu(entry->offset);
+
+		for (j = 0; j < be32_to_cpu(entry->count) &&
+		     datap < bufendp; j++) {
+			if (strlen(datap) == 0) {
+				datap++;
+				continue;
+			}
+
+			if (datap + digest_len * 2 + 1 > bufendp) {
+				pr_err("RPM header read at invalid offset\n");
+				return -EINVAL;
+			}
+
+			hex2bin(digest, datap, digest_len);
+
+			ret = ima_add_digest_data_entry(digest);
+			if (ret < 0 && ret != -EEXIST)
+				return ret;
+
+			datap += digest_len * 2 + 1;
+		}
+
+		break;
+	}
+
+	return 0;
+}
+
 static int ima_parse_digest_list_data(struct ima_field_data *data)
 {
 	void *digest_list;
@@ -107,6 +186,9 @@ static int ima_parse_digest_list_data(struct ima_field_data *data)
 	case DATA_TYPE_COMPACT_LIST:
 		ret = ima_parse_compact_list(digest_list_size, digest_list);
 		break;
+	case DATA_TYPE_RPM:
+		ret = ima_parse_rpm(digest_list_size, digest_list);
+		break;
 	default:
 		pr_err("Parser for data type %d not implemented\n", data_type);
 		ret = -EINVAL;
--
2.9.3
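The inner loop of ima_parse_rpm() walks a run of NUL-terminated hex strings,
skips empty ones and converts the rest to binary digests. The sketch below
shows that walk in userspace C. It is an illustration, not the kernel code:
ex_hex_val() and ex_parse_hex_digests() are hypothetical names, the callback
stands in for ima_add_digest_data_entry(), and, unlike the patch, the sketch
rejects non-hex characters and checks the terminator of each string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Convert one hex character to its value, or -1 if it is not a hex digit. */
static int ex_hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Walk up to `count` NUL-terminated strings in [data, end). Empty strings
 * are skipped; every other string must hold exactly digest_len * 2 hex
 * characters. Each decoded digest is handed to cb (if not NULL).
 * Returns the number of digests decoded, or -1 on malformed input.
 */
static int ex_parse_hex_digests(const char *data, const char *end,
				unsigned int count, size_t digest_len,
				void (*cb)(const unsigned char *, size_t))
{
	unsigned char digest[64];
	int decoded = 0;
	unsigned int j;
	size_t i;

	assert(digest_len <= sizeof(digest));

	for (j = 0; j < count && data < end; j++) {
		if (*data == '\0') {	/* entry without a digest */
			data++;
			continue;
		}
		/* Need digest_len * 2 hex characters plus the terminator. */
		if ((size_t)(end - data) < digest_len * 2 + 1)
			return -1;
		for (i = 0; i < digest_len; i++) {
			int hi = ex_hex_val(data[2 * i]);
			int lo = ex_hex_val(data[2 * i + 1]);

			if (hi < 0 || lo < 0)
				return -1;
			digest[i] = (unsigned char)((hi << 4) | lo);
		}
		if (data[digest_len * 2] != '\0')
			return -1;
		if (cb)
			cb(digest, digest_len);
		decoded++;
		data += digest_len * 2 + 1;
	}
	return decoded;
}
```

The bounds check before decoding plays the same role as the
"datap + digest_len * 2 + 1 > bufendp" test in the patch.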
[PATCH 00/12] ima: measure digest lists instead of individual files
This patch set applies on top of kernel v4.13-rc2.

IMA, for each file matching policy rules, calculates a digest, creates
a new entry in the measurement list and extends a TPM PCR with the digest
of entry data. The last step causes a noticeable performance reduction.

Since systems likely access the same files, repeating the above tasks at
every boot can be avoided by replacing individual measurements of likely
accessed files with only one measurement of their digests: the advantage
is that the system performance significantly improves due to fewer PCR
extend operations; on the other hand, the information about exactly which
files have been accessed, and in which sequence, is lost.

If this new measurement reports only good digests (e.g. those of
files included in a Linux distribution), and if verifiers only check
that a system executed good software and didn't access malicious data,
the disadvantages reported earlier would be acceptable.

The Trusted Computing paradigm measure & load is still respected by IMA
with the proposed optimization. If a file being accessed is not in a
measured digest list, a measurement will be recorded as before. If it is,
the list has already been measured, and the verifier must assume that
files with digest in the list have been accessed.

Measuring digest lists gives the following benefits:

- boot time reduction
  For a minimal Linux installation with 1400 measurements, the boot time
  decreases from 1 minute 30 seconds to 15 seconds, after loading to IMA
  the digest of all files packaged by the distribution (32000). The new
  list contains 92 entries. Without IMA, the boot time is 8.5 seconds.

- lower network and CPU requirements for remote attestation
  With the IMA optimization, both the measurement and digest lists
  must be verified for a complete evaluation. However, since the lists
  are fixed, they could be sent to and checked by the verifier only once.
  Then, during a remote attestation, the only remaining task is to verify
  the short measurement list.

- signature-based remote attestation
  A digest list signature can be used as a proof of the provenance of the
  files whose digest is in the list. Then, if verifiers trust the signer
  and only check provenance, remote attestation verification would simply
  consist of checking digest list signatures and that the measurement
  list only contains list metadata digests (reference measurement databases
  would no longer be required). An example of a signed digest list,
  that can be parsed with this patch set, is the RPM package header.

Digest lists are loaded in two stages by IMA through the new securityfs
interface called 'digest_lists'. Users supply metadata for the digest
lists they want to load: path, format, digest, signature and algorithm
of the digest.

Then, after the metadata digest is added to the measurement list, IMA
reads the digest lists at the path specified and loads the digests in
a hash table (digest lists are not measured, since their digest is already
included in the metadata). With metadata measurement instead of digest list
measurement, it is possible to avoid a performance reduction that would
occur by measuring many digest lists (e.g. RPM headers) individually.
If, alternatively, digest lists are loaded together, their signature
cannot be verified.

Lastly, when a file is accessed, IMA searches the calculated digest in
the hash table. A new entry is added to the measurement list only if the
digest is not found.

Roberto Sassu (12):
  ima: generalize ima_read_policy()
  ima: generalize ima_write_policy()
  ima: generalize policy file operations
  ima: use ima_show_htable_value to show hash table data
  ima: add functions to manage digest lists
  ima: added parser of digest lists metadata
  ima: added parser for compact digest list
  ima: added parser for RPM data type
  ima: introduce securityfs interfaces for digest lists
  ima: disable digest lookup if digest lists are not measured
  ima: don't report measurements if digests are included in the loaded
    lists
  ima: added Documentation/security/IMA-digest-lists.txt

 Documentation/security/IMA-digest-lists.txt | 150 +
 include/linux/fs.h                          |   1 +
 security/integrity/ima/Kconfig              |  11 ++
 security/integrity/ima/Makefile             |   1 +
 security/integrity/ima/ima.h                |  17 ++
 security/integrity/ima/ima_digest_list.c    | 247
 security/integrity/ima/ima_fs.c             | 178
 security/integrity/ima/ima_main.c           |  23 ++-
 security/integrity/ima/ima_policy.c         |   1 +
 security/integrity/ima/ima_queue.c          |  39 +
 10 files changed, 602 insertions(+), 66 deletions(-)
 create mode 100644 Documentation/security/IMA-digest-lists.txt
 create mode 100644 security/integrity/ima/ima_digest_list.c

--
2.9.3
[PATCH 05/12] ima: add functions to manage digest lists
This patch first introduces a new structure called ima_digest, which
will contain a digest parsed from the digest list. It has been preferred
to ima_queue_entry, as the existing structure includes an additional
member (a list head), which is not necessary for digest lookup.

Then, this patch introduces functions to look up and add a digest to
a hash table, which will be used by the parsers.

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima.h       |  8
 security/integrity/ima/ima_queue.c | 39 ++
 2 files changed, 47 insertions(+)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index d52b487..a0c6808 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -107,6 +107,11 @@ struct ima_queue_entry {
 };
 extern struct list_head ima_measurements;	/* list of all measurements */

+struct ima_digest {
+	struct hlist_node hnext;
+	u8 digest[0];
+};
+
 /* Some details preceding the binary serialized measurement list */
 struct ima_kexec_hdr {
 	u16 version;
@@ -150,6 +155,8 @@ void ima_print_digest(struct seq_file *m, u8 *digest, u32 size);
 struct ima_template_desc *ima_template_desc_current(void);
 int ima_restore_measurement_entry(struct ima_template_entry *entry);
 int ima_restore_measurement_list(loff_t bufsize, void *buf);
+struct ima_digest *ima_lookup_loaded_digest(u8 *digest);
+int ima_add_digest_data_entry(u8 *digest);
 int ima_measurements_show(struct seq_file *m, void *v);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
@@ -166,6 +173,7 @@ struct ima_h_table {
 	struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
 };
 extern struct ima_h_table ima_htable;
+extern struct ima_h_table ima_digests_htable;

 static inline unsigned long ima_hash_key(u8 *digest)
 {
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index a02a86d..d1a3d3f 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -42,6 +42,11 @@ struct ima_h_table ima_htable = {
 	.queue[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
 };

+struct ima_h_table ima_digests_htable = {
+	.len = ATOMIC_LONG_INIT(0),
+	.queue[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
+};
+
 /* mutex protects atomicity of extending measurement list
  * and extending the TPM PCR aggregate. Since tpm_extend can take
  * long (and the tpm driver uses a mutex), we can't use the spinlock.
@@ -212,3 +217,37 @@ int ima_restore_measurement_entry(struct ima_template_entry *entry)
 	mutex_unlock(&ima_extend_list_mutex);
 	return result;
 }
+
+struct ima_digest *ima_lookup_loaded_digest(u8 *digest)
+{
+	struct ima_digest *d = NULL;
+	int digest_len = hash_digest_size[ima_hash_algo];
+	unsigned int key = ima_hash_key(digest);
+
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(d, &ima_digests_htable.queue[key], hnext) {
+		if (memcmp(d->digest, digest, digest_len) == 0)
+			break;
+	}
+	rcu_read_unlock();
+	return d;
+}
+
+int ima_add_digest_data_entry(u8 *digest)
+{
+	struct ima_digest *d = ima_lookup_loaded_digest(digest);
+	int digest_len = hash_digest_size[ima_hash_algo];
+	unsigned int key = ima_hash_key(digest);
+
+	if (d)
+		return -EEXIST;
+
+	d = kmalloc(sizeof(*d) + digest_len, GFP_KERNEL);
+	if (d == NULL)
+		return -ENOMEM;
+
+	memcpy(d->digest, digest, digest_len);
+	hlist_add_head_rcu(&d->hnext, &ima_digests_htable.queue[key]);
+	atomic_long_inc(&ima_digests_htable.len);
+	return 0;
+}
--
2.9.3
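The lookup and add functions in this patch follow the usual chained hash
table pattern: hash the digest to a bucket, scan the bucket for an equal
digest, and insert at the bucket head if none is found. The following is a
single-threaded userspace sketch of that pattern, with RCU and locking left
out. All ex_* names, the fixed digest length and the toy hash function are
assumptions made for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define EX_DIGEST_LEN  20	/* assumed; the patch uses hash_digest_size[ima_hash_algo] */
#define EX_HTABLE_SIZE 512	/* assumed bucket count for this sketch */

struct ex_digest {
	struct ex_digest *next;		/* stands in for the kernel's hlist_node */
	unsigned char digest[];		/* flexible array member holding the digest */
};

static struct ex_digest *ex_table[EX_HTABLE_SIZE];
static long ex_table_len;

/* Toy hash: the first two digest bytes, reduced to the table size. */
static unsigned int ex_hash_key(const unsigned char *digest)
{
	assert(digest != NULL);
	return (((unsigned int)digest[0] << 8) | digest[1]) % EX_HTABLE_SIZE;
}

/* Return the stored entry with this digest, or NULL if there is none. */
static struct ex_digest *ex_lookup_digest(const unsigned char *digest)
{
	struct ex_digest *d;

	for (d = ex_table[ex_hash_key(digest)]; d; d = d->next)
		if (memcmp(d->digest, digest, EX_DIGEST_LEN) == 0)
			return d;
	return NULL;
}

/* Add a digest; returns 0, -EEXIST if already present, or -ENOMEM. */
static int ex_add_digest(const unsigned char *digest)
{
	unsigned int key = ex_hash_key(digest);
	struct ex_digest *d;

	if (ex_lookup_digest(digest))
		return -EEXIST;

	/* One allocation covers the node and the digest bytes behind it. */
	d = malloc(sizeof(*d) + EX_DIGEST_LEN);
	if (!d)
		return -ENOMEM;

	memcpy(d->digest, digest, EX_DIGEST_LEN);
	d->next = ex_table[key];	/* insert at the head of the bucket */
	ex_table[key] = d;
	ex_table_len++;
	return 0;
}
```

Allocating the node and the digest in one block mirrors the
"kmalloc(sizeof(*d) + digest_len, ...)" call in the patch, so the entry size
follows the digest length of the configured hash algorithm.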
[PATCH 04/12] ima: use ima_show_htable_value to show hash table data
This patch removes ima_show_htable_violations() and
ima_show_measurements_count(). ima_show_htable_value(), called by those
functions, determines which hash table data should be copied to the buffer
depending on the dentry of the file passed as argument.

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima_fs.c | 38 --
 1 file changed, 12 insertions(+), 26 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index f4199f2..ad3d674 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -55,38 +55,24 @@ __setup("ima_canonical_fmt", default_canonical_fmt_setup);
 static int valid_policy = 1;
 #define TMPBUFLEN 12
-static ssize_t ima_show_htable_value(char __user *buf, size_t count,
-				     loff_t *ppos, atomic_long_t *val)
+static ssize_t ima_show_htable_value(struct file *filp, char __user *buf,
+				     size_t count, loff_t *ppos)
 {
+	atomic_long_t *val = NULL;
 	char tmpbuf[TMPBUFLEN];
 	ssize_t len;

+	if (filp->f_path.dentry == violations)
+		val = &ima_htable.violations;
+	else if (filp->f_path.dentry == runtime_measurements_count)
+		val = &ima_htable.len;
+
 	len = scnprintf(tmpbuf, TMPBUFLEN, "%li\n", atomic_long_read(val));
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, len);
 }

-static ssize_t ima_show_htable_violations(struct file *filp,
-					  char __user *buf,
-					  size_t count, loff_t *ppos)
-{
-	return ima_show_htable_value(buf, count, ppos, &ima_htable.violations);
-}
-
-static const struct file_operations ima_htable_violations_ops = {
-	.read = ima_show_htable_violations,
-	.llseek = generic_file_llseek,
-};
-
-static ssize_t ima_show_measurements_count(struct file *filp,
-					   char __user *buf,
-					   size_t count, loff_t *ppos)
-{
-	return ima_show_htable_value(buf, count, ppos, &ima_htable.len);
-
-}
-
-static const struct file_operations ima_measurements_count_ops = {
-	.read = ima_show_measurements_count,
+static const struct file_operations ima_htable_value_ops = {
+	.read = ima_show_htable_value,
 	.llseek = generic_file_llseek,
 };

@@ -508,13 +494,13 @@ int __init ima_fs_init(void)
 	runtime_measurements_count =
 	    securityfs_create_file("runtime_measurements_count",
 				   S_IRUSR | S_IRGRP, ima_dir, NULL,
-				   &ima_measurements_count_ops);
+				   &ima_htable_value_ops);
 	if (IS_ERR(runtime_measurements_count))
 		goto out;

 	violations =
 	    securityfs_create_file("violations", S_IRUSR | S_IRGRP,
-				   ima_dir, NULL, &ima_htable_violations_ops);
+				   ima_dir, NULL, &ima_htable_value_ops);
 	if (IS_ERR(violations))
 		goto out;

-- 
2.9.3
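The idea of one read handler that picks its counter from the identity of the opened file can be sketched in userspace as follows. This is an illustration only: `violations_file`, `count_file` and `show_value` are hypothetical stand-ins for the dentries and the handler. The patch as posted leaves `val` NULL when the dentry matches neither file; returning an error in that case is an assumption of this sketch, not something the patch does.

```c
#include <stdio.h>

/* Stand-ins for the two securityfs dentries; only their addresses matter. */
static int violations_file;
static int count_file;

static long violations = 3;
static long measurements = 42;

/*
 * Format the counter selected by the file handle into buf.
 * Returns the number of characters written, or -1 for an unknown handle.
 */
static int show_value(const void *file, char *buf, size_t len)
{
	const long *val = NULL;

	if (file == &violations_file)
		val = &violations;
	else if (file == &count_file)
		val = &measurements;

	if (!val)	/* assumption: refuse instead of dereferencing NULL */
		return -1;

	return snprintf(buf, len, "%li\n", *val);
}
```

The trade-off is the one the patch makes: two near-identical handlers and two file_operations tables collapse into one, at the cost of a lookup on every read.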
[PATCH 03/12] ima: generalize policy file operations
This patch renames ima_open_policy() and ima_release_policy() respectively
to ima_open_data_upload() and ima_release_data_upload(). They will be used
to implement file operations for interfaces allowing to upload and read
provided data.

Also, the new flag IMA_POLICY_BUSY has been defined specifically for
the policy, as it might not be cleared at file release. This would prevent
userspace applications from uploading files after a policy has been loaded.

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima_fs.c | 46 -
 1 file changed, 32 insertions(+), 14 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index e375206..f4199f2 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -384,6 +384,7 @@ static ssize_t ima_write_data(struct file *file, const char __user *buf,
 }

 enum ima_fs_flags {
+	IMA_POLICY_BUSY,
 	IMA_FS_BUSY,
 };

@@ -399,22 +400,33 @@ static const struct seq_operations ima_policy_seqops = {
 #endif

 /*
- * ima_open_policy: sequentialize access to the policy file
+ * ima_open_data_upload: sequentialize access to the data upload interface
  */
-static int ima_open_policy(struct inode *inode, struct file *filp)
+static int ima_open_data_upload(struct inode *inode, struct file *filp)
 {
+	enum kernel_read_file_id file_id = ima_get_file_id(filp->f_path.dentry);
+	const struct seq_operations *seq_ops = NULL;
+	enum ima_fs_flags flag = IMA_FS_BUSY;
+	bool read_allowed = false;
+
+	if (file_id == READING_POLICY) {
+		flag = IMA_POLICY_BUSY;
+#ifdef CONFIG_IMA_READ_POLICY
+		read_allowed = true;
+		seq_ops = &ima_policy_seqops;
+#endif
+	}
+
 	if (!(filp->f_flags & O_WRONLY)) {
-#ifndef	CONFIG_IMA_READ_POLICY
-		return -EACCES;
-#else
+		if (!read_allowed)
+			return -EACCES;
 		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
 			return -EACCES;
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
-		return seq_open(filp, &ima_policy_seqops);
-#endif
+		return seq_open(filp, seq_ops);
 	}
-	if (test_and_set_bit(IMA_FS_BUSY, &ima_fs_flags))
+	if (test_and_set_bit(flag, &ima_fs_flags))
 		return -EBUSY;
 	return 0;
 }
@@ -426,13 +438,19 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
  * point to the new policy rules, and remove the securityfs policy file,
  * assuming a valid policy.
  */
-static int ima_release_policy(struct inode *inode, struct file *file)
+static int ima_release_data_upload(struct inode *inode, struct file *file)
 {
+	enum kernel_read_file_id file_id = ima_get_file_id(file->f_path.dentry);
 	const char *cause = valid_policy ? "completed" : "failed";

 	if ((file->f_flags & O_ACCMODE) == O_RDONLY)
 		return seq_release(inode, file);

+	if (file_id != READING_POLICY) {
+		clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+		return 0;
+	}
+
 	if (valid_policy && ima_check_policy() < 0) {
 		cause = "failed";
 		valid_policy = 0;
@@ -454,16 +472,16 @@ static int ima_release_policy(struct inode *inode, struct file *file)
 	securityfs_remove(ima_policy);
 	ima_policy = NULL;
 #else
-	clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+	clear_bit(IMA_POLICY_BUSY, &ima_fs_flags);
 #endif
 	return 0;
 }

-static const struct file_operations ima_measure_policy_ops = {
-	.open = ima_open_policy,
+static const struct file_operations ima_data_upload_ops = {
+	.open = ima_open_data_upload,
 	.write = ima_write_data,
 	.read = seq_read,
-	.release = ima_release_policy,
+	.release = ima_release_data_upload,
 	.llseek = generic_file_llseek,
 };

@@ -502,7 +520,7 @@ int __init ima_fs_init(void)

 	ima_policy = securityfs_create_file("policy", POLICY_FILE_FLAGS,
 					    ima_dir, NULL,
-					    &ima_measure_policy_ops);
+					    &ima_data_upload_ops);
 	if (IS_ERR(ima_policy))
 		goto out;

-- 
2.9.3
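The reason for giving the policy its own busy bit can be seen in a small userspace model. This is a sketch under assumptions: C11 atomics stand in for the kernel's test_and_set_bit()/clear_bit(), and `upload_flag`, `upload_open` and `upload_release` are hypothetical names.

```c
#include <errno.h>
#include <stdatomic.h>

/* Bit positions, in the same order as enum ima_fs_flags after the patch. */
enum upload_flag { POLICY_BUSY, FS_BUSY };

static atomic_ulong fs_flags;

/* Claim the interface: 0 on success, -EBUSY if another writer holds it. */
static int upload_open(enum upload_flag flag)
{
	unsigned long bit = 1UL << flag;

	/* atomic_fetch_or returns the previous value, like test_and_set_bit */
	if (atomic_fetch_or(&fs_flags, bit) & bit)
		return -EBUSY;
	return 0;
}

/* Release the interface so that a later open can succeed. */
static void upload_release(enum upload_flag flag)
{
	atomic_fetch_and(&fs_flags, ~(1UL << flag));
}
```

With a single shared bit, a policy release that never clears the flag would block every later upload; with two bits, only further policy writes are refused.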
[PATCH 02/12] ima: generalize ima_write_policy()
This patch renames ima_write_policy() to ima_write_data(). Also, it
determines the kernel_read_file_id from the dentry associated to the file,
and passes it to ima_read_file().

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima_fs.c | 55 ++---
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 058d3c1..e375206 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -28,6 +28,21 @@

 static DEFINE_MUTEX(ima_write_mutex);

+static struct dentry *ima_dir;
+static struct dentry *binary_runtime_measurements;
+static struct dentry *ascii_runtime_measurements;
+static struct dentry *runtime_measurements_count;
+static struct dentry *violations;
+static struct dentry *ima_policy;
+
+static enum kernel_read_file_id ima_get_file_id(struct dentry *dentry)
+{
+	if (dentry == ima_policy)
+		return READING_POLICY;
+
+	return READING_UNKNOWN;
+}
+
 bool ima_canonical_fmt;
 static int __init default_canonical_fmt_setup(char *str)
 {
@@ -315,11 +330,12 @@ static ssize_t ima_read_file(char *path, enum kernel_read_file_id file_id)
 	return pathlen;
 }

-static ssize_t ima_write_policy(struct file *file, const char __user *buf,
-				size_t datalen, loff_t *ppos)
+static ssize_t ima_write_data(struct file *file, const char __user *buf,
+			      size_t datalen, loff_t *ppos)
 {
 	char *data;
 	ssize_t result;
+	enum kernel_read_file_id file_id = ima_get_file_id(file->f_path.dentry);

 	if (datalen >= PAGE_SIZE)
 		datalen = PAGE_SIZE - 1;
@@ -340,34 +356,33 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
 		goto out_free;

 	if (data[0] == '/') {
-		result = ima_read_file(data, READING_POLICY);
-	} else if (ima_appraise & IMA_APPRAISE_POLICY) {
-		pr_err("IMA: signed policy file (specified as an absolute pathname) required\n");
-		integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
-				    "policy_update", "signed policy required",
-				    1, 0);
-		if (ima_appraise & IMA_APPRAISE_ENFORCE)
-			result = -EACCES;
+		result = ima_read_file(data, file_id);
+	} else if (file_id == READING_POLICY) {
+		if (ima_appraise & IMA_APPRAISE_POLICY) {
+			pr_err("IMA: signed policy file (specified "
+			       "as an absolute pathname) required\n");
+			integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
+				"policy_update", "signed policy required",
+				1, 0);
+			if (ima_appraise & IMA_APPRAISE_ENFORCE)
+				result = -EACCES;
+		} else {
+			result = ima_parse_add_rule(data);
+		}
 	} else {
-		result = ima_parse_add_rule(data);
+		pr_err("Unknown data type\n");
+		result = -EINVAL;
 	}
 	mutex_unlock(&ima_write_mutex);
 out_free:
 	kfree(data);
 out:
-	if (result < 0)
+	if (file_id == READING_POLICY && result < 0)
 		valid_policy = 0;

 	return result;
 }

-static struct dentry *ima_dir;
-static struct dentry *binary_runtime_measurements;
-static struct dentry *ascii_runtime_measurements;
-static struct dentry *runtime_measurements_count;
-static struct dentry *violations;
-static struct dentry *ima_policy;
-
 enum ima_fs_flags {
 	IMA_FS_BUSY,
 };
@@ -446,7 +461,7 @@ static int ima_release_policy(struct inode *inode, struct file *file)

 static const struct file_operations ima_measure_policy_ops = {
 	.open = ima_open_policy,
-	.write = ima_write_policy,
+	.write = ima_write_data,
 	.read = seq_read,
 	.release = ima_release_policy,
 	.llseek = generic_file_llseek,
-- 
2.9.3
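The branch structure that ima_write_data() ends up with after this patch can be summarised as a small decision function. This is a userspace sketch only; `file_id`, `action` and `classify_write` are hypothetical names, and `signed_required` models the IMA_APPRAISE_POLICY bit of ima_appraise.

```c
#include <stdbool.h>

enum file_id { ID_UNKNOWN, ID_POLICY };	/* stand-ins for kernel_read_file_id */

enum action {
	ACT_READ_FILE,		/* data is a path: read the file it names */
	ACT_PARSE_RULE,		/* data is an inline policy rule */
	ACT_REPORT_UNSIGNED,	/* inline rule given, signed policy required */
	ACT_INVALID,		/* inline data on a non-policy interface */
};

/* Decide how one write to the upload interface should be handled. */
static enum action classify_write(enum file_id id, const char *data,
				  bool signed_required)
{
	if (data[0] == '/')
		return ACT_READ_FILE;
	if (id != ID_POLICY)
		return ACT_INVALID;
	return signed_required ? ACT_REPORT_UNSIGNED : ACT_PARSE_RULE;
}
```

In the patch itself, the "unsigned" case logs and audits the event and fails the write with -EACCES only when IMA_APPRAISE_ENFORCE is also set; the sketch does not model that second bit.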
[PATCH 01/12] ima: generalize ima_read_policy()
Rename ima_read_policy() to ima_read_file(), and add file_id as a new
parameter. If file_id is equal to READING_POLICY, the behavior of
ima_read_file() is the same as without the patch.

ima_read_file() will be used to read digest lists, to avoid reporting
measurements when the file digest is known.

Signed-off-by: Roberto Sassu
---
 security/integrity/ima/ima_fs.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index ad491c5..058d3c1 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -272,7 +272,7 @@ static const struct file_operations ima_ascii_measurements_ops = {
 	.release = seq_release,
 };

-static ssize_t ima_read_policy(char *path)
+static ssize_t ima_read_file(char *path, enum kernel_read_file_id file_id)
 {
 	void *data;
 	char *datap;
@@ -285,16 +285,22 @@ static ssize_t ima_read_policy(char *path)
 	datap = path;
 	strsep(&datap, "\n");

-	rc = kernel_read_file_from_path(path, &data, &size, 0, READING_POLICY);
+	rc = kernel_read_file_from_path(path, &data, &size, 0, file_id);
 	if (rc < 0) {
 		pr_err("Unable to open file: %s (%d)", path, rc);
 		return rc;
 	}

 	datap = data;
-	while (size > 0 && (p = strsep(&datap, "\n"))) {
-		pr_debug("rule: %s\n", p);
-		rc = ima_parse_add_rule(p);
+	while (size > 0) {
+		if (file_id == READING_POLICY) {
+			p = strsep(&datap, "\n");
+			if (p == NULL)
+				break;
+
+			pr_debug("rule: %s\n", p);
+			rc = ima_parse_add_rule(p);
+		}
 		if (rc < 0)
 			break;
 		size -= rc;
@@ -334,7 +340,7 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
 		goto out_free;

 	if (data[0] == '/') {
-		result = ima_read_policy(data);
+		result = ima_read_file(data, READING_POLICY);
 	} else if (ima_appraise & IMA_APPRAISE_POLICY) {
 		pr_err("IMA: signed policy file (specified as an absolute pathname) required\n");
 		integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
-- 
2.9.3
Re: [PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()
On Tue, Jul 25, 2017 at 02:50:37PM +0200, Jan Kara wrote:
> On Tue 25-07-17 14:15:22, Christoph Hellwig wrote:
> > On Tue, Jul 25, 2017 at 11:35:08AM +0200, Jan Kara wrote:
> > > On Tue 25-07-17 10:01:58, Christoph Hellwig wrote:
> > > > On Tue, Jul 25, 2017 at 01:14:00AM +0300, Kirill A. Shutemov wrote:
> > > > > I guess it's up to filesystem if it wants to reuse the same spot to write
> > > > > data or not. I think your assumptions works for ext4 and xfs. I wouldn't
> > > > > be that sure for btrfs or other filesystems with CoW support.
> > > >
> > > > Or XFS with reflinks for that matter. Which currently can't be
> > > > combined with DAX, but I had a somewhat working version a few month
> > > > ago.
> > >
> > > But in cases like COW when the block mapping changes, the process
> > > must run unmap_mapping_range() before installing the new PTE so that all
> > > processes mapping this file offset actually refault and see the new
> > > mapping. So this would go through pte_none() case. Am I missing something?
> >
> > Yes, for DAX COW mappings we'd probably need something like this, unlike
> > the pagecache COW handling for which only the underlying block change,
> > but not the page.
>
> Right. So again nothing where the WARN_ON should trigger.

Yes. I was confused on how COW is handled.

Acked-by: Kirill A. Shutemov

-- 
 Kirill A. Shutemov
Re: [PATCH 2/2] printk: Add boottime and real timestamps
On Tue, Jul 25, 2017 at 08:17:27AM -0400, Prarit Bhargava wrote:
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 5b1662ec546f..6cd38a25f8ea 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1,8 +1,8 @@
>  menu "printk and dmesg options"
>
>  config PRINTK_TIME
> -	int "Show timing information on printks (0-1)"
> -	range 0 1
> +	int "Show timing information on printks (0-3)"
> +	range 0 3
>  	default "0"
>  	depends on PRINTK
>  	help
> @@ -13,7 +13,8 @@ config PRINTK_TIME
>  	  The timestamp is always recorded internally, and exported
>  	  to /dev/kmsg. This flag just specifies if the timestamp should
>  	  be included, not that the timestamp is recorded. 0 disables the
> -	  timestamp and 1 uses the local clock.
> +	  timestamp and 1 uses the local clock, 2 uses the monotonic clock, and
> +	  3 uses real clock.
>
>  	  The behavior is also controlled by the kernel command line
>  	  parameter printk.time=1. See Documentation/admin-guide/kernel-parameters.rst

choice
	prompt "printk default clock"
	default PRINTK_TIME_DISABLE
	help
	  goes here

	config PRINTK_TIME_DISABLE
		bool "Disabled"
		help
		  goes here

	config PRINTK_TIME_LOCAL
		bool "local clock"
		help
		  goes here

	config PRINTK_TIME_MONO
		bool "CLOCK_MONOTONIC"
		help
		  goes here

	config PRINTK_TIME_REAL
		bool "CLOCK_REALTIME"
		help
		  goes here

endchoice

config PRINTK_TIME
	int
	default 0 if PRINTK_TIME_DISABLE
	default 1 if PRINTK_TIME_LOCAL
	default 2 if PRINTK_TIME_MONO
	default 3 if PRINTK_TIME_REAL

Although I must strongly discourage using REALTIME, DST will make
untangling your logs an absolute nightmare. I would simply not provide
it.
Re: [PATCH 1/2] printk: Make CONFIG_PRINTK_TIME an int
On Tue, Jul 25, 2017 at 08:17:26AM -0400, Prarit Bhargava wrote:
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index fc47863f629c..26cf6cadd267 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1202,8 +1202,40 @@ static inline void boot_delay_msec(int level)
>  }
>  #endif
>
> -static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
> -module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
> +static int printk_time = CONFIG_PRINTK_TIME;

You could just use unsigned int, but is the reason you went with int to
enable backward compatibility with the old bool =y or =n?

  Luis
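For readers following the backward-compatibility question, here is a minimal userspace sketch of a parser that accepts both the old bool spellings and the new numeric values documented in this series (0/N/n, 1/Y/y, 2, 3). `printk_clock` and `parse_printk_time` are hypothetical names, and returning -EINVAL for anything that is not a single recognised character is an assumption of the sketch rather than the behaviour of the posted patch.

```c
#include <errno.h>
#include <string.h>

enum printk_clock { CLK_DISABLED, CLK_LOCAL, CLK_MONOTONIC, CLK_REAL };

/*
 * Map a printk.time= string to a clock selection.
 * Returns 0 and stores the selection, or -EINVAL for anything else.
 */
static int parse_printk_time(const char *param, enum printk_clock *out)
{
	if (strlen(param) != 1)
		return -EINVAL;

	switch (param[0]) {
	case '0': case 'n': case 'N':	/* old bool "off" spellings */
		*out = CLK_DISABLED;
		return 0;
	case '1': case 'y': case 'Y':	/* old bool "on" means local clock */
		*out = CLK_LOCAL;
		return 0;
	case '2':
		*out = CLK_MONOTONIC;
		return 0;
	case '3':
		*out = CLK_REAL;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Parsing the string by hand, instead of relying on an integer parameter type, is what lets existing printk.time=y command lines keep working.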
Re: [PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()
On Tue 25-07-17 14:15:22, Christoph Hellwig wrote:
> On Tue, Jul 25, 2017 at 11:35:08AM +0200, Jan Kara wrote:
> > On Tue 25-07-17 10:01:58, Christoph Hellwig wrote:
> > > On Tue, Jul 25, 2017 at 01:14:00AM +0300, Kirill A. Shutemov wrote:
> > > > I guess it's up to filesystem if it wants to reuse the same spot to write
> > > > data or not. I think your assumptions works for ext4 and xfs. I wouldn't
> > > > be that sure for btrfs or other filesystems with CoW support.
> > >
> > > Or XFS with reflinks for that matter. Which currently can't be
> > > combined with DAX, but I had a somewhat working version a few month
> > > ago.
> >
> > But in cases like COW when the block mapping changes, the process
> > must run unmap_mapping_range() before installing the new PTE so that all
> > processes mapping this file offset actually refault and see the new
> > mapping. So this would go through pte_none() case. Am I missing something?
>
> Yes, for DAX COW mappings we'd probably need something like this, unlike
> the pagecache COW handling for which only the underlying block change,
> but not the page.

Right. So again nothing where the WARN_ON should trigger. That being said I
don't care about the WARN_ON too deeply but it can help to catch DAX bugs
so if we can keep it I'd prefer to do so...

								Honza
-- 
Jan Kara
SUSE Labs, CR
[PATCH 2/2] printk: Add boottime and real timestamps
printk.time=1/CONFIG_PRINTK_TIME=Y timestamps printks with an unmodified
hardware clock timestamp. This clock loses time each day making it
difficult to determine when an issue has occurred in the kernel log.

Modify printk.time to output local, monotonic, or a real timestamp.
Modify the output of /sys/module/printk/parameters/time to output the type
of clock so userspace programs can interpret the timestamp.

Real clock & 32-bit systems: Selecting the real clock printk timestamp may
lead to unlikely situations where a timestamp is wrong because the real time
offset is read without the protection of a sequence lock in the call to
ktime_get_log_ts() in printk_get_ts().

Signed-off-by: Prarit Bhargava
Cc: Mark Salyzyn
Cc: Jonathan Corbet
Cc: Petr Mladek
Cc: Sergey Senozhatsky
Cc: Steven Rostedt
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Cc: "Paul E. McKenney"
Cc: Christoffer Dall
Cc: Deepa Dinamani
Cc: Ingo Molnar
Cc: Joel Fernandes
Cc: Kees Cook
Cc: Peter Zijlstra
Cc: Geert Uytterhoeven
Cc: "Luis R. Rodriguez"
Cc: Nicholas Piggin
Cc: "Jason A. Donenfeld"
Cc: Olof Johansson
Cc: "Theodore Ts'o"
Cc: Josh Poimboeuf
Cc: linux-doc@vger.kernel.org
---
 Documentation/admin-guide/kernel-parameters.txt |  6 +-
 include/linux/timekeeping.h                     |  1 +
 kernel/printk/printk.c                          | 77 +
 kernel/time/timekeeping.c                       | 14 +
 lib/Kconfig.debug                               |  7 ++-
 5 files changed, 89 insertions(+), 16 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c3b14abf9da4..c03240d057b1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3188,8 +3188,10 @@
 			ratelimit - ratelimit the logging
 			Default: ratelimit

-	printk.time=	Show timing data prefixed to each printk message line
-			Format: (1/Y/y=enable, 0/N/n=disable)
+	printk.time=	Show timestamp prefixed to each printk message line
+			Format:
+			(0/N/n = disable, 1/Y/y = local/unadjusted HW,
+			 2 = monotonic, 3 = real)

 	processor.max_cstate=	[HW,ACPI]
 			Limit processor to maximum C-state
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index ddc229ff6d1e..adb84af42deb 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -239,6 +239,7 @@ static inline u64 ktime_get_raw_ns(void)
 extern u64 ktime_get_mono_fast_ns(void);
 extern u64 ktime_get_raw_fast_ns(void);
 extern u64 ktime_get_boot_fast_ns(void);
+extern u64 ktime_get_log_ts(u64 *offset_real);

 /*
  * Timespec interfaces utilizing the ktime based ones
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 26cf6cadd267..35536369a56d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -576,6 +576,8 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
 }

+static u64 printk_get_ts(void);
+
 /* insert record into the buffer, discard old ones, update heads */
 static int log_store(int facility, int level,
 		     enum log_flags flags, u64 ts_nsec,
@@ -624,7 +626,7 @@ static int log_store(int facility, int level,
 	if (ts_nsec > 0)
 		msg->ts_nsec = ts_nsec;
 	else
-		msg->ts_nsec = local_clock();
+		msg->ts_nsec = printk_get_ts();
 	memset(log_dict(msg) + dict_len, 0, pad_len);
 	msg->len = size;

@@ -1203,26 +1205,60 @@ static inline void boot_delay_msec(int level)
 #endif

 static int printk_time = CONFIG_PRINTK_TIME;
+static int printk_time_setting; /* initial setting */

+/*
+ * Real clock & 32-bit systems: Selecting the real clock printk timestamp may
+ * lead to unlikely situations where a timestamp is wrong because the real time
+ * offset is read without the protection of a sequence lock in the call to
+ * ktime_get_log_ts() in printk_get_ts() below.
+ */
 static int printk_time_set(const char *val, const struct kernel_param *kp)
 {
 	char *param = strstrip((char *)val);
+	int _printk_time;

 	if (strlen(param) != 1)
[PATCH 0/2] printk: allow different timestamps for printk.time
Over the past years I've seen many reports of bugs that include
time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or
printk.time=1 is specified as a kernel parameter) that do not align
with either external time stamped logs or /var/log/messages. This also
makes determining the time of a failure difficult in cases where
/var/log/messages is unavailable.

For example,

[root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date
Thu Jul 20 11:38:22 EST 2017
Thu Jul 20 11:38:22 EST 2017

which displays

[83973.768912] Hello!

on the serial console.

Running a script to convert this to the stamped time,

[root@intel-wildcatpass-06 ~]# ./human.sh | tail -1
[Thu July 17 11:39:45 2017] Hello!

which is already off by 1 minute and 23 seconds after ~24 hours of
uptime.

This occurs because the printk time stamp is obtained from a call to
local_clock() which (on x86) is a direct call to the hardware. These
hardware clock reads are not modified by the standard ntp or ptp protocol.
The other timestamps are, and that results in situations where external
time sources are further and further offset from the kernel log
timestamps.

Implement printk.time settings to allow a user to specify the monotonic
or real clocks. The default is the local clock (hardware clock).

Real clock & 32-bit systems: Selecting the real clock printk timestamp may
lead to unlikely situations where a timestamp is wrong because the real time
offset is read without the protection of a sequence lock in the call to
ktime_get_log_ts() in printk_get_ts().

Signed-off-by: Prarit Bhargava
Cc: Mark Salyzyn
Cc: Jonathan Corbet
Cc: Petr Mladek
Cc: Sergey Senozhatsky
Cc: Steven Rostedt
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Cc: "Paul E. McKenney"
Cc: Christoffer Dall
Cc: Deepa Dinamani
Cc: Ingo Molnar
Cc: Joel Fernandes
Cc: Kees Cook
Cc: Peter Zijlstra
Cc: Geert Uytterhoeven
Cc: "Luis R. Rodriguez"
Cc: Nicholas Piggin
Cc: "Jason A. Donenfeld"
Cc: Olof Johansson
Cc: "Theodore Ts'o"
Cc: Josh Poimboeuf
Cc: linux-doc@vger.kernel.org

Prarit Bhargava (2):
  printk: Make CONFIG_PRINTK_TIME an int
  printk: Add boottime and real timestamps

 Documentation/admin-guide/kernel-parameters.txt| 6 +-
 arch/arm/configs/aspeed_g4_defconfig           | 2 +-
 arch/arm/configs/aspeed_g5_defconfig           | 2 +-
 arch/arm/configs/axm55xx_defconfig             | 2 +-
 arch/arm/configs/bcm2835_defconfig             | 2 +-
 arch/arm/configs/colibri_pxa270_defconfig      | 2 +-
 arch/arm/configs/colibri_pxa300_defconfig      | 2 +-
 arch/arm/configs/dove_defconfig                | 2 +-
 arch/arm/configs/efm32_defconfig               | 2 +-
 arch/arm/configs/exynos_defconfig              | 2 +-
 arch/arm/configs/ezx_defconfig                 | 2 +-
 arch/arm/configs/h5000_defconfig               | 2 +-
 arch/arm/configs/hisi_defconfig                | 2 +-
 arch/arm/configs/imote2_defconfig              | 2 +-
 arch/arm/configs/imx_v6_v7_defconfig           | 2 +-
 arch/arm/configs/keystone_defconfig            | 2 +-
 arch/arm/configs/lpc18xx_defconfig             | 2 +-
 arch/arm/configs/magician_defconfig            | 2 +-
 arch/arm/configs/mmp2_defconfig                | 2 +-
 arch/arm/configs/moxart_defconfig              | 2 +-
 arch/arm/configs/mps2_defconfig                | 2 +-
 arch/arm/configs/multi_v7_defconfig            | 2 +-
 arch/arm/configs/mvebu_v7_defconfig            | 2 +-
 arch/arm/configs/mxs_defconfig                 | 2 +-
 arch/arm/configs/omap2plus_defconfig           | 2 +-
 arch/arm/configs/pxa168_defconfig              | 2 +-
 arch/arm/configs/pxa3xx_defconfig              | 2 +-
 arch/arm/configs/pxa910_defconfig              | 2 +-
 arch/arm/configs/pxa_defconfig                 | 2 +-
 arch/arm/configs/qcom_defconfig                | 2 +-
 arch/arm/configs/raumfeld_defconfig            | 2 +-
 arch/arm/configs/shmobile_defconfig            | 2 +-
 arch/arm/configs/socfpga_defconfig             | 2 +-
 arch/arm/configs/stm32_defconfig               | 2 +-
 arch/arm/configs/sunxi_defconfig               | 2 +-
 arch/arm/configs/tango4_defconfig              | 2 +-
 arch/arm/configs/tegra_defconfig               | 2 +-
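The size of the drift in the cover letter's example can be put into numbers. Taking the figures at face value (an error of 1 minute 23 seconds, i.e. 83 s, after 83973.768912 s of uptime), the local clock is off by roughly 988 parts per million, or about 0.1%. The helper below is a hypothetical illustration of that arithmetic, not part of the series.

```c
/* Drift in parts per million: seconds of error per million seconds elapsed. */
static double drift_ppm(double error_sec, double elapsed_sec)
{
	return error_sec / elapsed_sec * 1e6;
}
```

For example, drift_ppm(83.0, 83973.768912) evaluates to a value between 988 and 989.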
[PATCH 1/2] printk: Make CONFIG_PRINTK_TIME an int
CONFIG_PRINTK_TIME is a bool and in order to add timestamp options for the
monotonic and real time clock it must be expanded to an int.

Signed-off-by: Prarit Bhargava
Cc: Mark Salyzyn
Cc: Jonathan Corbet
Cc: Petr Mladek
Cc: Sergey Senozhatsky
Cc: Steven Rostedt
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Cc: "Paul E. McKenney"
Cc: Christoffer Dall
Cc: Deepa Dinamani
Cc: Ingo Molnar
Cc: Joel Fernandes
Cc: Kees Cook
Cc: Peter Zijlstra
Cc: Geert Uytterhoeven
Cc: "Luis R. Rodriguez"
Cc: Nicholas Piggin
Cc: "Jason A. Donenfeld"
Cc: Olof Johansson
Cc: "Theodore Ts'o"
Cc: Josh Poimboeuf
Cc: linux-doc@vger.kernel.org
---
 Documentation/admin-guide/kernel-parameters.txt| 2 +-
 arch/arm/configs/aspeed_g4_defconfig           | 2 +-
 arch/arm/configs/aspeed_g5_defconfig           | 2 +-
 arch/arm/configs/axm55xx_defconfig             | 2 +-
 arch/arm/configs/bcm2835_defconfig             | 2 +-
 arch/arm/configs/colibri_pxa270_defconfig      | 2 +-
 arch/arm/configs/colibri_pxa300_defconfig      | 2 +-
 arch/arm/configs/dove_defconfig                | 2 +-
 arch/arm/configs/efm32_defconfig               | 2 +-
 arch/arm/configs/exynos_defconfig              | 2 +-
 arch/arm/configs/ezx_defconfig                 | 2 +-
 arch/arm/configs/h5000_defconfig               | 2 +-
 arch/arm/configs/hisi_defconfig                | 2 +-
 arch/arm/configs/imote2_defconfig              | 2 +-
 arch/arm/configs/imx_v6_v7_defconfig           | 2 +-
 arch/arm/configs/keystone_defconfig            | 2 +-
 arch/arm/configs/lpc18xx_defconfig             | 2 +-
 arch/arm/configs/magician_defconfig            | 2 +-
 arch/arm/configs/mmp2_defconfig                | 2 +-
 arch/arm/configs/moxart_defconfig              | 2 +-
 arch/arm/configs/mps2_defconfig                | 2 +-
 arch/arm/configs/multi_v7_defconfig            | 2 +-
 arch/arm/configs/mvebu_v7_defconfig            | 2 +-
 arch/arm/configs/mxs_defconfig                 | 2 +-
 arch/arm/configs/omap2plus_defconfig           | 2 +-
 arch/arm/configs/pxa168_defconfig              | 2 +-
 arch/arm/configs/pxa3xx_defconfig              | 2 +-
 arch/arm/configs/pxa910_defconfig              | 2 +-
 arch/arm/configs/pxa_defconfig                 | 2 +-
 arch/arm/configs/qcom_defconfig                | 2 +-
 arch/arm/configs/raumfeld_defconfig            | 2 +-
 arch/arm/configs/shmobile_defconfig            | 2 +-
 arch/arm/configs/socfpga_defconfig             | 2 +-
 arch/arm/configs/stm32_defconfig               | 2 +-
 arch/arm/configs/sunxi_defconfig               | 2 +-
 arch/arm/configs/tango4_defconfig              | 2 +-
 arch/arm/configs/tegra_defconfig               | 2 +-
 arch/arm/configs/u300_defconfig                | 2 +-
 arch/arm/configs/u8500_defconfig               | 2 +-
 arch/arm/configs/vt8500_v6_v7_defconfig        | 2 +-
 arch/arm/configs/xcep_defconfig                | 2 +-
 arch/arm/configs/zx_defconfig                  | 2 +-
 arch/arm64/configs/defconfig                   | 2 +-
 arch/m68k/configs/amcore_defconfig             | 2 +-
 arch/mips/configs/ath25_defconfig              | 2 +-
 arch/mips/configs/bcm47xx_defconfig            | 2 +-
 arch/mips/configs/bmips_be_defconfig           | 2 +-
 arch/mips/configs/bmips_stb_defconfig          | 2 +-
 arch/mips/configs/ci20_defconfig               | 2 +-
 arch/mips/configs/generic_defconfig            | 2 +-
 arch/mips/configs/lemote2f_defconfig           | 2 +-
 arch/mips/configs/loongson3_defconfig          | 2 +-
 arch/mips/configs/nlm_xlp_defconfig            | 2 +-
 arch/mips/configs/nlm_xlr_defconfig            | 2 +-
 arch/mips/configs/pistachio_defconfig          | 2 +-
 arch/mips/configs/qi_lb60_defconfig            | 2 +-
 arch/mips/configs/rt305x_defconfig             | 2 +-
 arch/mips/configs/xway_defconfig               | 2 +-
 arch/parisc/configs/generic-64bit_defconfig    | 2 +-
 arch/powerpc/configs/40x/virtex_defconfig      | 2 +-
 arch/powerpc/configs/44x/fsp2_defconfig        | 2 +-
 arch/powerpc/configs/44x/virtex5_defconfig     | 2 +-
 arch/powerpc/configs/44x/warp_defconfig
[PATCH v4 0/6] Add HiSilicon SoC uncore Performance Monitoring Unit driver
This patchset adds support for HiSilicon SoC uncore PMUs driver. It
includes L3C, Hydra Home Agent (HHA) and DDRC.

Changes in v4:
* remove redundant code and comments
* reverse the functions order in exit function
* remove some GPL information
* revise including header file
* fix Jonathan's other comments

Changes in v3:
* rebase to 4.13-rc1
* add dev_err if ioremap fails for PMUs

Changes in v2:
* fix kbuild test robot error
* make hisi_uncore_ops static

Shaokun Zhang (6):
  Documentation: perf: hisi: Documentation for HiSilicon SoC PMU driver
  perf: hisi: Add support for HiSilicon SoC uncore PMU driver
  perf: hisi: Add support for HiSilicon SoC L3C PMU driver
  perf: hisi: Add support for HiSilicon SoC HHA PMU driver
  perf: hisi: Add support for HiSilicon SoC DDRC PMU driver
  arm64: MAINTAINERS: hisi: Add HiSilicon SoC PMU support

 Documentation/perf/hisi-pmu.txt               |  52 +++
 MAINTAINERS                                   |   7 +
 drivers/perf/Kconfig                          |   7 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/hisilicon/Makefile               |   1 +
 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 420
 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c  | 436 +
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c  | 538 ++
 drivers/perf/hisilicon/hisi_uncore_pmu.c      | 398 +++
 drivers/perf/hisilicon/hisi_uncore_pmu.h      | 103 +
 include/linux/cpuhotplug.h                    |   1 +
 11 files changed, 1964 insertions(+)
 create mode 100644 Documentation/perf/hisi-pmu.txt
 create mode 100644 drivers/perf/hisilicon/Makefile
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_pmu.h

-- 
1.9.1
[PATCH v4 2/6] perf: hisi: Add support for HiSilicon SoC uncore PMU driver
This patch adds support for the HiSilicon SoC uncore PMU driver framework
and interfaces.

Reviewed-by: Jonathan Cameron
Signed-off-by: Shaokun Zhang
Signed-off-by: Anurup M
---
 drivers/perf/Kconfig                     |   7 +
 drivers/perf/Makefile                    |   1 +
 drivers/perf/hisilicon/Makefile          |   1 +
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 398 +++
 drivers/perf/hisilicon/hisi_uncore_pmu.h | 103
 5 files changed, 510 insertions(+)
 create mode 100644 drivers/perf/hisilicon/Makefile
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_pmu.c
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_pmu.h

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index e5197ff..78fc4bc 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -17,6 +17,13 @@ config ARM_PMU_ACPI
 	depends on ARM_PMU && ACPI
 	def_bool y
 
+config HISI_PMU
+	bool "HiSilicon SoC PMU"
+	depends on ARM64 && ACPI
+	help
+	  Support for HiSilicon SoC uncore performance monitoring
+	  unit (PMU), such as: L3C, HHA and DDRC.
+
 config QCOM_L2_PMU
 	bool "Qualcomm Technologies L2-cache PMU"
 	depends on ARCH_QCOM && ARM64 && ACPI
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 6420bd4..41d3342 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
 obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
+obj-$(CONFIG_HISI_PMU) += hisilicon/
 obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
 obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile
new file mode 100644
index 000..2783bb3
--- /dev/null
+++ b/drivers/perf/hisilicon/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
new file mode 100644
index 000..d868447
--- /dev/null
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -0,0 +1,398 @@
+/*
+ * HiSilicon SoC Hardware event counters support
+ *
+ * Copyright (C) 2017 Hisilicon Limited
+ * Author: Anurup M
+ *         Shaokun Zhang
+ *
+ * This code is based on the uncore PMUs like arm-cci and arm-ccn.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include "hisi_uncore_pmu.h"
+
+#define HISI_GET_EVENTID(ev) (ev->hw.config_base & 0xff)
+#define HISI_MAX_PERIOD(nr) (BIT_ULL(nr) - 1)
+
+/*
+ * PMU format attributes
+ */
+ssize_t hisi_format_sysfs_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+/*
+ * PMU event attributes
+ */
+ssize_t hisi_event_sysfs_show(struct device *dev,
+			      struct device_attribute *attr, char *page)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+/*
+ * sysfs cpumask attributes
+ */
+ssize_t hisi_cpumask_sysfs_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct hisi_pmu *hisi_pmu = to_hisi_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, &hisi_pmu->cpus);
+}
+
+/* Read Super CPU cluster and CPU cluster ID from MPIDR_EL1 */
+void hisi_read_sccl_and_ccl_id(u32 *sccl_id, u32 *ccl_id)
+{
+	u64 mpidr;
+
+	mpidr = read_cpuid_mpidr();
+	if (mpidr & MPIDR_MT_BITMASK) {
+		if (sccl_id)
+			*sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 3);
+		if (ccl_id)
+			*ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+	} else {
+		if (sccl_id)
+			*sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+		if (ccl_id)
+			*ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+	}
+}
+
+static bool hisi_validate_event_group(struct perf_event *event)
+{
+	struct perf_event *sibling, *leader = event->group_leader;
+	struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu);
+	/* Include count for the event */
+	int counters = 1;
+
+	/*
+	 * We must NOT create groups containing mixed PMUs, although
+	 * software events are acceptable
+	 */
+	if (leader->pmu != event->pmu && !is_software_event(leader))
+
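The affinity decoding done by hisi_read_sccl_and_ccl_id() in the patch above can be tried outside the kernel. The following is only a user-space sketch, not part of the patch: the names mpidr_affinity(), decode_cluster_ids() and struct cluster_ids are made up for illustration, and the field positions (Aff0 in bits [7:0], Aff1 in [15:8], Aff2 in [23:16], Aff3 in [39:32], MT in bit 24) follow the ARMv8 MPIDR_EL1 layout.

```c
#include <stdint.h>

/* MT bit of MPIDR_EL1: set when the lowest affinity level is a thread. */
#define MPIDR_MT_BIT (UINT64_C(1) << 24)

/* Hypothetical helper: extract the 8-bit affinity field for a level 0..3. */
static unsigned int mpidr_affinity(uint64_t mpidr, int level)
{
	/* Aff0..Aff2 are contiguous from bit 0; Aff3 starts at bit 32. */
	static const int shift[] = { 0, 8, 16, 32 };

	return (unsigned int)((mpidr >> shift[level]) & 0xff);
}

/* Hypothetical result type, used only in this sketch. */
struct cluster_ids {
	unsigned int sccl_id;	/* Super CPU cluster ID */
	unsigned int ccl_id;	/* CPU cluster ID */
};

/*
 * Same selection rule as the patch: with MT set the cluster IDs sit one
 * affinity level higher than without it.
 */
static struct cluster_ids decode_cluster_ids(uint64_t mpidr)
{
	struct cluster_ids ids;

	if (mpidr & MPIDR_MT_BIT) {
		ids.sccl_id = mpidr_affinity(mpidr, 3);
		ids.ccl_id = mpidr_affinity(mpidr, 2);
	} else {
		ids.sccl_id = mpidr_affinity(mpidr, 2);
		ids.ccl_id = mpidr_affinity(mpidr, 1);
	}
	return ids;
}
```

For example, an MPIDR value of 0x30200 (MT clear, Aff2 = 3, Aff1 = 2) decodes to SCCL 3 and CCL 2.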
[PATCH v4 4/6] perf: hisi: Add support for HiSilicon SoC HHA PMU driver
L3 cache coherence is maintained by the Hydra Home Agent (HHA) in the
HiSilicon SoC. This patch adds support for the HHA PMU driver. Each HHA
has its own control, counter and interrupt registers and is a separate
PMU. Each HHA PMU has 16 programmable counters and supports 0x50 events;
the event code is 8 bits wide and every counter is free-running. An
interrupt is supported to handle overflow of the 48-bit counters.

Reviewed-by: Jonathan Cameron
Signed-off-by: Shaokun Zhang
Signed-off-by: Anurup M
---
 drivers/perf/hisilicon/Makefile              |   2 +-
 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 436 +++
 2 files changed, 437 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c

diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile
index 4a3d3e6..a72afe8 100644
--- a/drivers/perf/hisilicon/Makefile
+++ b/drivers/perf/hisilicon/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o
+obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o hisi_uncore_hha_pmu.o
diff --git a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
new file mode 100644
index 000..6798d5f
--- /dev/null
+++ b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
@@ -0,0 +1,436 @@
+/*
+ * HiSilicon SoC HHA uncore Hardware event counters support
+ *
+ * Copyright (C) 2017 Hisilicon Limited
+ * Author: Shaokun Zhang
+ *         Anurup M
+ *
+ * This code is based on the uncore PMUs like arm-cci and arm-ccn.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include "hisi_uncore_pmu.h"
+
+/* HHA register definition */
+#define HHA_INT_MASK		0x0804
+#define HHA_INT_STATUS		0x0808
+#define HHA_INT_CLEAR		0x080C
+#define HHA_PERF_CTRL		0x1E00
+#define HHA_EVENT_CTRL		0x1E04
+#define HHA_EVENT_TYPE0		0x1E80
+#define HHA_CNT0_LOWER		0x1F00
+
+/* HHA has 16-counters and supports 0x50 events */
+#define HHA_NR_COUNTERS		0x10
+#define HHA_NR_EVENTS		0x50
+
+#define HHA_PERF_CTRL_EN	0x1
+#define HHA_EVTYPE_NONE		0xff
+
+#define HHA_EVTYPE_REG(idx)	(HHA_EVENT_TYPE0 + 4 * ((idx) / 4))
+
+/*
+ * Select the counter register offset using the counter index
+ * every counter is 48-bits and [48:63] is reserved.
+ */
+static u32 get_counter_reg_off(int cntr_idx)
+{
+	return (HHA_CNT0_LOWER + (cntr_idx * 8));
+}
+
+static u64 hisi_hha_pmu_read_counter(struct hisi_pmu *hha_pmu,
+				     struct hw_perf_event *hwc)
+{
+	u32 idx = hwc->idx;
+	u32 reg;
+
+	if (!hisi_uncore_pmu_counter_valid(hha_pmu, idx)) {
+		dev_err(hha_pmu->dev, "Unsupported event index:%d!\n", idx);
+		return 0;
+	}
+
+	reg = get_counter_reg_off(idx);
+
+	/* Read 64 bits and like L3C, top 16 bits are RAZ */
+	return readq(hha_pmu->base + reg);
+}
+
+static void hisi_hha_pmu_write_counter(struct hisi_pmu *hha_pmu,
+				       struct hw_perf_event *hwc, u64 val)
+{
+	u32 idx = hwc->idx;
+	u32 reg;
+
+	if (!hisi_uncore_pmu_counter_valid(hha_pmu, idx)) {
+		dev_err(hha_pmu->dev, "Unsupported event index:%d!\n", idx);
+		return;
+	}
+
+	reg = get_counter_reg_off(idx);
+	/* Write 64 bits and like L3C, top 16 bits are WI */
+	writeq(val, hha_pmu->base + reg);
+}
+
+static void hisi_hha_pmu_write_evtype(struct hisi_pmu *hha_pmu, int idx,
+				      u32 type)
+{
+	u32 reg, reg_idx, shift, val;
+
+	/*
+	 * Select the appropriate event select register(HHA_EVENT_TYPEx).
+	 * There are 4 event select registers for the 16 hardware counters.
+	 * Event code is 8-bits and for the first 4 hardware counters,
+	 * HHA_EVENT_TYPE0 is chosen. For the next 4 hardware counters,
+	 * HHA_EVENT_TYPE1 is chosen and so on.
+	 */
+	reg = HHA_EVTYPE_REG(idx);
+	reg_idx = idx % 4;
+	shift = 8 * reg_idx;
+
+	/* Write event code to HHA_EVENT_TYPEx register */
+	val = readl(hha_pmu->base + reg);
+	val &= ~(HHA_EVTYPE_NONE << shift);
+	val |= (type << shift);
+	writel(val, hha_pmu->base + reg);
+}
+
+static void hisi_hha_pmu_start_counters(struct hisi_pmu *hha_pmu)
+{
+	u32 val;
+
+	/*
+	 * Set perf_enable bit in HHA_PERF_CTRL to start event
+	 * counting for all enabled counters.
+	 */
+	val = readl(hha_pmu->base + HHA_PERF_CTRL);
+	val |= HHA_PERF_CTRL_EN;
+	writel(val,
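To make the register selection in hisi_hha_pmu_write_evtype() concrete, here is a small stand-alone sketch of the arithmetic: four 8-bit event codes are packed into each 32-bit event select register. It is not taken from the patch; the helper names are invented, EVENT_TYPE0 simply reuses the HHA_EVENT_TYPE0 offset, and masking the event code to 8 bits is an extra safety step the patch itself does not perform.

```c
#include <stdint.h>

#define EVENT_TYPE0	0x1E80u	/* offset of HHA_EVENT_TYPE0 in the patch */
#define EVTYPE_FIELD	0xffu	/* one 8-bit event code per counter */

/* Offset of the event select register that serves counter idx. */
static uint32_t evtype_reg_offset(unsigned int idx)
{
	return EVENT_TYPE0 + 4 * (idx / 4);
}

/* Bit position of counter idx's field within that register. */
static unsigned int evtype_shift(unsigned int idx)
{
	return 8 * (idx % 4);
}

/*
 * Read-modify-write step on a register value: clear the field that
 * belongs to counter idx, then insert the new event code.
 */
static uint32_t evtype_update(uint32_t old, unsigned int idx, uint32_t type)
{
	unsigned int shift = evtype_shift(idx);

	old &= ~(EVTYPE_FIELD << shift);
	old |= (type & EVTYPE_FIELD) << shift;
	return old;
}
```

Counter 5, for instance, lives in the second register (offset 0x1E84) at bits [15:8], so programming event 0x2a into an all-ones register value yields 0xffff2aff.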
[PATCH v4 1/6] Documentation: perf: hisi: Documentation for HiSilicon SoC PMU driver
This patch adds documentation for the uncore PMUs on HiSilicon SoC.

Reviewed-by: Jonathan Cameron
Signed-off-by: Shaokun Zhang
Signed-off-by: Anurup M
---
 Documentation/perf/hisi-pmu.txt | 52 +
 1 file changed, 52 insertions(+)
 create mode 100644 Documentation/perf/hisi-pmu.txt

diff --git a/Documentation/perf/hisi-pmu.txt b/Documentation/perf/hisi-pmu.txt
new file mode 100644
index 000..f45a03d
--- /dev/null
+++ b/Documentation/perf/hisi-pmu.txt
@@ -0,0 +1,52 @@
+HiSilicon SoC uncore Performance Monitoring Unit (PMU)
+==
+The HiSilicon SoC chip comprises various independent system device PMUs
+such as L3 cache (L3C), Hydra Home Agent (HHA) and DDRC. These PMUs are
+independent and have hardware logic to gather statistics and performance
+information.
+
+HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster
+(CCL) is made up of 4 CPU cores sharing one L3 cache; each CPU die is
+called Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has
+two HHAs (0 - 1) and four DDRCs (0 - 3), respectively.
+
+HiSilicon SoC uncore PMU driver
+---
+Each device PMU has separate registers for event counting, control and
+interrupt, and the PMU driver shall register perf PMU drivers like L3C,
+HHA and DDRC etc. The available events and configuration options shall
+be described in the sysfs, see /sys/devices/hisi_* or /sys/bus/
+event_source/devices/hisi_*.
+The "perf list" command shall list the available events from sysfs.
+
+Each L3C, HHA and DDRC in one SCCL is registered as a separate PMU with perf.
+The PMU name will appear in event listing as hisi_<module><index-id>_<sccl-id>,
+where "index-id" is the index of module and "sccl-id" is the identifier of
+the SCCL.
+e.g. hisi_l3c0_1/rd_hit_cpipe is READ_HIT_CPIPE event of L3C index #0 and SCCL
+ID #1.
+e.g. hisi_hha0_1/rx_operations is RX_OPERATIONS event of HHA index #0 and SCCL
+ID #1.
+
+The driver also provides a "cpumask" sysfs attribute, which shows the CPU core
+ID used to count the uncore PMU event.
+
+Example usage of perf:
+$# perf list
+hisi_l3c0_3/rd_hit_cpipe/ [kernel PMU event]
+--
+hisi_l3c0_3/wr_hit_cpipe/ [kernel PMU event]
+--
+hisi_l3c0_1/rd_hit_cpipe/ [kernel PMU event]
+--
+hisi_l3c0_1/wr_hit_cpipe/ [kernel PMU event]
+--
+
+$# perf stat -a -e hisi_l3c0_1/rd_hit_cpipe/ sleep 5
+$# perf stat -a -e hisi_l3c0_1/config=0x02/ sleep 5
+
+The current driver does not support sampling, so "perf record" is unsupported.
+Attaching to a task is also unsupported, as all the events are uncore.
+
+Note: Please contact the maintainer for a complete list of events supported for
+the PMU devices in the SoC and its information if needed.
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
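As an illustration of the naming scheme described in the documentation above (module name, then module index, an underscore, then the SCCL ID), the PMU name can be composed as in the sketch below. The helper hisi_pmu_name() is hypothetical and exists only in this example; how the driver itself builds the name is not shown in this excerpt.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a PMU name such as "hisi_l3c0_1" from the
 * module name ("l3c", "hha", "ddrc"), the module index and the SCCL ID.
 * Returns the snprintf() result, i.e. the length the full name needs.
 */
static int hisi_pmu_name(char *buf, size_t len, const char *module,
			 unsigned int index_id, unsigned int sccl_id)
{
	return snprintf(buf, len, "hisi_%s%u_%u", module, index_id, sccl_id);
}
```

Calling hisi_pmu_name(buf, sizeof(buf), "l3c", 0, 1) produces "hisi_l3c0_1", the PMU used in the "perf stat" examples above.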
[PATCH v4 3/6] perf: hisi: Add support for HiSilicon SoC L3C PMU driver
This patch adds support for the L3C PMU driver in the HiSilicon SoC chip.
Each L3C has its own control, counter and interrupt registers and is a
separate PMU. Each L3C PMU has 8 programmable counters and supports 0x60
events; the event code is 8 bits wide and every counter is free-running.
An interrupt is supported to handle overflow of the 48-bit counters.

Reviewed-by: Jonathan Cameron
Signed-off-by: Shaokun Zhang
Signed-off-by: Anurup M
---
 drivers/perf/hisilicon/Makefile              |   2 +-
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 538 +++
 include/linux/cpuhotplug.h                   |   1 +
 3 files changed, 540 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c

diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile
index 2783bb3..4a3d3e6 100644
--- a/drivers/perf/hisilicon/Makefile
+++ b/drivers/perf/hisilicon/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o
+obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o
diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
new file mode 100644
index 000..33146bb
--- /dev/null
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -0,0 +1,538 @@
+/*
+ * HiSilicon SoC L3C uncore Hardware event counters support
+ *
+ * Copyright (C) 2017 Hisilicon Limited
+ * Author: Anurup M
+ *         Shaokun Zhang
+ *
+ * This code is based on the uncore PMUs like arm-cci and arm-ccn.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "hisi_uncore_pmu.h"
+
+/* L3C register definition */
+#define L3C_PERF_CTRL		0x0408
+#define L3C_INT_MASK		0x0800
+#define L3C_INT_STATUS		0x0808
+#define L3C_INT_CLEAR		0x080c
+#define L3C_EVENT_CTRL		0x1c00
+#define L3C_EVENT_TYPE0		0x1d00
+#define L3C_CNTR0_LOWER		0x1e00
+
+/* L3C has 8-counters and supports 0x60 events */
+#define L3C_NR_COUNTERS		0x8
+#define L3C_NR_EVENTS		0x60
+
+#define L3C_PERF_CTRL_EN	0x2
+#define L3C_EVTYPE_NONE		0xff
+
+/*
+ * Select the counter register offset using the counter index
+ * every counter is 48-bits and [48:63] is reserved.
+ */
+static u32 get_counter_reg_off(int cntr_idx)
+{
+	return (L3C_CNTR0_LOWER + (cntr_idx * 8));
+}
+
+static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu,
+				     struct hw_perf_event *hwc)
+{
+	u32 idx = hwc->idx;
+	u32 reg;
+
+	if (!hisi_uncore_pmu_counter_valid(l3c_pmu, idx)) {
+		dev_err(l3c_pmu->dev, "Unsupported event index:%d!\n", idx);
+		return 0;
+	}
+
+	reg = get_counter_reg_off(idx);
+
+	/* Read 64-bits and the upper 16 bits are Read-As-Zero */
+	return readq(l3c_pmu->base + reg);
+}
+
+static void hisi_l3c_pmu_write_counter(struct hisi_pmu *l3c_pmu,
+				       struct hw_perf_event *hwc, u64 val)
+{
+	u32 idx = hwc->idx;
+	u32 reg;
+
+	if (!hisi_uncore_pmu_counter_valid(l3c_pmu, idx)) {
+		dev_err(l3c_pmu->dev, "Unsupported event index:%d!\n", idx);
+		return;
+	}
+
+	reg = get_counter_reg_off(idx);
+	/* Write 64-bits and the upper 16 bits are Writes-Ignored */
+	writeq(val, l3c_pmu->base + reg);
+}
+
+static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
+				      u32 type)
+{
+	u32 reg, reg_idx, shift, val;
+
+	/*
+	 * Select the appropriate event select register(L3C_EVENT_TYPE0/1).
+	 * There are 2 event select registers for the 8 hardware counters.
+	 * Event code is 8-bits and for the former 4 hardware counters,
+	 * L3C_EVENT_TYPE0 is chosen. For the latter 4 hardware counters,
+	 * L3C_EVENT_TYPE1 is chosen.
+	 */
+	reg = L3C_EVENT_TYPE0 + (idx / 4) * 4;
+	reg_idx = idx % 4;
+	shift = 8 * reg_idx;
+
+	/* Write event code to L3C_EVENT_TYPEx Register */
+	val = readl(l3c_pmu->base + reg);
+	val &= ~(L3C_EVTYPE_NONE << shift);
+	val |= (type << shift);
+	writel(val, l3c_pmu->base + reg);
+}
+
+static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu)
+{
+	u32 val;
+
+	/*
+	 * Set perf_enable bit in L3C_PERF_CTRL register to start counting
+	 * for all enabled counters.
+	 */
+	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+	val |= L3C_PERF_CTRL_EN;
+	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+}
+
+static void
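Because the L3C counters are 48 bits wide and free-running, a reader has to cope with the counter wrapping between two reads. The update code is not part of this excerpt, so the following is only a sketch of the usual masking technique, under the assumption that the counter wraps at most once between reads; MAX_PERIOD() mirrors HISI_MAX_PERIOD() from patch 2/6 and counter_delta() is an invented name.

```c
#include <stdint.h>

/* All-ones mask for an nr-bit counter; nr must be less than 64. */
#define MAX_PERIOD(nr) ((UINT64_C(1) << (nr)) - 1)

/*
 * Events counted between two raw reads of a free-running counter that
 * is "bits" wide. Unsigned subtraction followed by masking gives the
 * right answer even if the counter wrapped once in between.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t now, unsigned int bits)
{
	return (now - prev) & MAX_PERIOD(bits);
}
```

With a 48-bit counter, a previous reading of 0xFFFFFFFFFFF0 followed by a reading of 0x10 gives a delta of 0x20. The same helper with bits = 32 covers the 32-bit DDRC counters.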
[PATCH v4 5/6] perf: hisi: Add support for HiSilicon SoC DDRC PMU driver
This patch adds support for the DDRC PMU driver in the HiSilicon SoC chip.
Each DDRC has its own control, counter and interrupt registers and is a
separate PMU. Each DDRC PMU has 8 fixed-purpose counters, which the
hardware maps to 8 events; the driver assumes that the counter index is
equal to the event code (0 - 7). An interrupt is supported to handle
overflow of the 32-bit counters.

Reviewed-by: Jonathan Cameron
Signed-off-by: Shaokun Zhang
Signed-off-by: Anurup M
---
 drivers/perf/hisilicon/Makefile               |   2 +-
 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 420 ++
 2 files changed, 421 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c

diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile
index a72afe8..2621d51 100644
--- a/drivers/perf/hisilicon/Makefile
+++ b/drivers/perf/hisilicon/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o hisi_uncore_hha_pmu.o
+obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o hisi_uncore_hha_pmu.o hisi_uncore_ddrc_pmu.o
diff --git a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
new file mode 100644
index 000..e178a09
--- /dev/null
+++ b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
@@ -0,0 +1,420 @@
+/*
+ * HiSilicon SoC DDRC uncore Hardware event counters support
+ *
+ * Copyright (C) 2017 Hisilicon Limited
+ * Author: Shaokun Zhang
+ *         Anurup M
+ *
+ * This code is based on the uncore PMUs like arm-cci and arm-ccn.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include "hisi_uncore_pmu.h"
+
+/* DDRC register definition */
+#define DDRC_PERF_CTRL		0x010
+#define DDRC_FLUX_WR		0x380
+#define DDRC_FLUX_RD		0x384
+#define DDRC_FLUX_WCMD		0x388
+#define DDRC_FLUX_RCMD		0x38c
+#define DDRC_PRE_CMD		0x3c0
+#define DDRC_ACT_CMD		0x3c4
+#define DDRC_BNK_CHG		0x3c8
+#define DDRC_RNK_CHG		0x3cc
+#define DDRC_EVENT_CTRL		0x6C0
+#define DDRC_INT_MASK		0x6c8
+#define DDRC_INT_STATUS		0x6cc
+#define DDRC_INT_CLEAR		0x6d0
+
+/* DDRC supports 8-events and counter is fixed-purpose */
+#define DDRC_NR_COUNTERS	0x8
+#define DDRC_NR_EVENTS		DDRC_NR_COUNTERS
+
+#define DDRC_PERF_CTRL_EN	0x2
+
+/*
+ * For DDRC PMU, there are eight events and every event has been mapped
+ * to a fixed-purpose counter whose register offset is not consistent.
+ * Therefore there is no write event type and we assume that event
+ * code (0 to 7) is equal to counter index in PMU driver.
+ */
+#define GET_DDRC_EVENTID(hwc)	(hwc->config_base & 0x7)
+
+static const u32 ddrc_reg_off[] = {
+	DDRC_FLUX_WR, DDRC_FLUX_RD, DDRC_FLUX_WCMD, DDRC_FLUX_RCMD,
+	DDRC_PRE_CMD, DDRC_ACT_CMD, DDRC_BNK_CHG, DDRC_RNK_CHG
+};
+
+/*
+ * Select the counter register offset using the counter index.
+ * In DDRC there are no programmable counters, the count
+ * is read from the statistics counter register itself.
+ */
+static u32 get_counter_reg_off(int cntr_idx)
+{
+	return ddrc_reg_off[cntr_idx];
+}
+
+static u64 hisi_ddrc_pmu_read_counter(struct hisi_pmu *ddrc_pmu,
+				      struct hw_perf_event *hwc)
+{
+	/* Use event code as counter index */
+	u32 idx = GET_DDRC_EVENTID(hwc);
+	u32 reg;
+
+	if (!hisi_uncore_pmu_counter_valid(ddrc_pmu, idx)) {
+		dev_err(ddrc_pmu->dev, "Unsupported event index:%d!\n", idx);
+		return 0;
+	}
+
+	reg = get_counter_reg_off(idx);
+
+	return readl(ddrc_pmu->base + reg);
+}
+
+static void hisi_ddrc_pmu_write_counter(struct hisi_pmu *ddrc_pmu,
+					struct hw_perf_event *hwc, u64 val)
+{
+	u32 idx = GET_DDRC_EVENTID(hwc);
+	u32 reg;
+
+	if (!hisi_uncore_pmu_counter_valid(ddrc_pmu, idx)) {
+		dev_err(ddrc_pmu->dev, "Unsupported event index:%d!\n", idx);
+		return;
+	}
+
+	reg = get_counter_reg_off(idx);
+	writel((u32)val, ddrc_pmu->base + reg);
+}
+
+static void hisi_ddrc_pmu_start_counters(struct hisi_pmu *ddrc_pmu)
+{
+	u32 val;
+
+	/* Set perf_enable in DDRC_PERF_CTRL to start event counting */
+	val = readl(ddrc_pmu->base + DDRC_PERF_CTRL);
+	val |= DDRC_PERF_CTRL_EN;
+	writel(val, ddrc_pmu->base + DDRC_PERF_CTRL);
+}
+
+static void hisi_ddrc_pmu_stop_counters(struct hisi_pmu *ddrc_pmu)
+{
+	u32 val;
+
+	/* Clear perf_enable in DDRC_PERF_CTRL to stop event
[PATCH v4 6/6] arm64: MAINTAINERS: hisi: Add HiSilicon SoC PMU support
Add support for the HiSilicon SoC uncore PMU driver.

Signed-off-by: Shaokun Zhang
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 205d397..649b144 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6197,6 +6197,13 @@ S:	Maintained
 F:	drivers/net/ethernet/hisilicon/
 F:	Documentation/devicetree/bindings/net/hisilicon*.txt
 
+HISILICON PMU DRIVER
+M:	Shaokun Zhang
+W:	http://www.hisilicon.com
+S:	Supported
+F:	drivers/perf/hisilicon
+F:	Documentation/perf/hisi-pmu.txt
+
 HISILICON ROCE DRIVER
 M:	Lijun Ou
 M:	Wei Hu(Xavier)
-- 
1.9.1
Re: [PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()
On Tue 25-07-17 10:01:58, Christoph Hellwig wrote:
> On Tue, Jul 25, 2017 at 01:14:00AM +0300, Kirill A. Shutemov wrote:
> > I guess it's up to filesystem if it wants to reuse the same spot to write
> > data or not. I think your assumptions works for ext4 and xfs. I wouldn't
> > be that sure for btrfs or other filesystems with CoW support.
>
> Or XFS with reflinks for that matter. Which currently can't be
> combined with DAX, but I had a somewhat working version a few months
> ago.

But in cases like COW, when the block mapping changes, the process must
run unmap_mapping_range() before installing the new PTE so that all
processes mapping this file offset actually refault and see the new
mapping. So this would go through the pte_none() case. Am I missing
something?

								Honza
-- 
Jan Kara
SUSE Labs, CR
Re: [PATCH 12/13] net: dsa: lan9303: Added "stp_enable" sysfs attribute
On 24 July 2017 18:55, Florian Fainelli wrote:
> On 07/20/2017 06:42 AM, Egil Hjelmeland wrote:
>> Must be set to 1 by user space when STP is used on the lan9303.
>> If bridging without local STP, leave at 0, so external STP BPDUs
>> are forwarded.
>>
>> Hopefully the kernel can be improved so the driver can handle this
>> without user intervention, and this control can be removed.
>
> Same here, we can't have a driver-specific sysfs attribute just for
> this, either we find a way to have the bridge's STP settings propagate
> correctly to the switch driver, or you have to make better decisions
> based on hints/calls you are getting from switchdev -> dsa -> driver.

I can't see that the driver gets enough information now. But please
correct me if I am wrong.

The problem is that when multicast_flood is disabled, the BPDUs are not
forwarded by the SW bridge, so I cannot keep the 01:80:c2:00:00:00 entry
installed permanently.

Perhaps the kernel could do port_fdb_add/del on 01:80:c2:00:00:00 when
STP is turned on/off? Or could that break other DSA chips?

While we are at it, it would be good if the driver could return some
capability information to the kernel, so it can adapt accordingly. It
does not feel right that user space has to disable the _flood flags; the
kernel should be able to figure that out by itself.
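For readers unfamiliar with the address discussed above: 01:80:c2:00:00:00 is the IEEE 802.1D bridge group address to which STP BPDUs are sent. The sketch below only shows how a frame's destination MAC can be matched against it; it is not lan9303 or DSA code, and is_stp_group_addr() is an invented name.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* IEEE 802.1D bridge group address, the destination of STP BPDUs. */
static const uint8_t stp_group_addr[6] = {
	0x01, 0x80, 0xc2, 0x00, 0x00, 0x00
};

/* True if the 6-byte destination MAC is the STP bridge group address. */
static bool is_stp_group_addr(const uint8_t mac[6])
{
	return memcmp(mac, stp_group_addr, sizeof(stp_group_addr)) == 0;
}
```

Other link-local addresses in the 01:80:c2:00:00:0x range, such as 01:80:c2:00:00:0e, do not match this check.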
Re: [PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()
On Tue, Jul 25, 2017 at 01:14:00AM +0300, Kirill A. Shutemov wrote:
> I guess it's up to filesystem if it wants to reuse the same spot to write
> data or not. I think your assumptions works for ext4 and xfs. I wouldn't
> be that sure for btrfs or other filesystems with CoW support.

Or XFS with reflinks for that matter. Which currently can't be
combined with DAX, but I had a somewhat working version a few months
ago.
Re: [PATCH 13/13] net: dsa: lan9303: lan9303_port_mdb_del remove port 0
On 24 July 2017 18:57, Florian Fainelli wrote:
> On 07/20/2017 06:57 AM, Egil Hjelmeland wrote:
>> Workaround for dsa_switch_mdb_add adding CPU port to group,
>> but forgetting to remove it:
>
> Should not we move this logic one layer above into DSA then such that
> insertions and removals are strictly symmetrical in which and how many
> ports are targeted?

Agreed. I included the patch more as a bug report. I will remove it in
patch v2. I don't really feel competent to fix the issue in DSA. It is
probably better if you DSA people look at it. I suspect DSA needs to do
more bookkeeping?

Egil
Re: [PATCH 00/13] net: dsa: lan9303: unicast offload, fdb,mdb,STP
On 24 July 2017 22:32, David Miller wrote:
> They are all over the place, over a period of 3 days.

I will do "git rebase --ignore-date master" from now on.

> You must also say in your subject line which of my two GIT networking
> trees ('net' or 'net-next') your changes are targeting. If you don't
> know, you need to figure that out before submitting.

Makes sense. I just found Documentation/networking/netdev-FAQ.txt;
reading that made it even clearer.

> I'm not applying this series until you fix your process up.

No problem, I did not expect the first version to go through anyway.

Egil
Re: [PATCH 00/13] net: dsa: lan9303: unicast offload, fdb,mdb,STP
On 24 July 2017 18:54, Florian Fainelli wrote:
>
> First thing would be to get your patch submissions square, because the
> patches do not appear to have been sent as a reply to this cover letter,
> and worse yet, they are all appearing with their commit date, which is
> highly confusing since that makes them go back in time for some of them.
>

Hi all!

I am very sorry for the email-thread mess. Once the emails showed up on
the spinics mirror I realized I had made a fool of myself.

I see now that I have to add --thread to "git format-patch" when _not_
using "git send-email" as the backend. (I did not get "git send-email"
to work with the company email server.)

I had noted that "git format-patch" preserved commit dates, but I
wrongly thought that was "a feature, not a bug". From now on I will make
sure to "git rebase --ignore-date master".

Egil